MATH3880 Introduction to Statistics and DNA MATH5880 Statistics and DNA Practical Session Monday, 16 November pm BRAGG Cluster


This document contains the tasks to be completed by students taking the modules MATH3880 Introduction to Statistics and DNA and MATH5880 Statistics and DNA. A report must be submitted one week after the practical, on Monday 23 November 2009, during the lecture. The report should be written on a computer. If this is a problem for you, for example if you have a disability that considerably restricts your use of a computer, let me know as soon as possible.

Preparation

Before we proceed with the practical session, make sure that you have checked or done the following:

- Read the limma package user's guide, especially Chapter 3, Chapter 8 (Sections 8.1, 8.2, and 8.4), and Chapter 10 (Sections 10.1 and 10.2).
- Read the note How to install Bioconductor packages in the University of Leeds Bragg Cluster. This note is also available from the module webpage.
- Following the note, check that you have enough space in your My Documents folder.
- Install the limma package as directed in the note (Section Extracting and installing the packages).
- Open R and load the limma package as directed in the note (Section Preparation in R).
- Download the LPS data from the webpage to your Data directory (again, see the note).
- Read the background and objective of the LPS experiment in the handout of Lecture 8.
- Set the working directory in R to M:/Data by typing

> setwd("M:/Data")

2 Reading the LPS data into your R session

2.1 Reading the raw expression data

Once you have completed the preparation above, you can start reading the raw microarray data using the following commands.

> file.list = dir(pattern="gpr")   # list of microarray raw data files
> file.list                        # check that you have four .gpr files
> f <- function(x) as.numeric(x$Flags > -50)   # filter out bad genes
> RG = read.maimages(files=file.list, source="genepix", wt.fun=f)
> show(RG)

Answer the following questions:

1. What are the names of the microarray data files? In each file, which experimental condition is labelled with each dye?
2. What components are contained in the object RG?
3. There are four matrices in the RG list: R, G, Rb, and Gb. What information is contained in each of those matrices? What do the rows and columns correspond to?
4. Draw a scatterplot where the horizontal axis represents the expression of the Green channel of array 1 and the vertical axis represents the Red channel. What can you say about the plot? Hint: Use pch="." as an argument of the function plot().
5. Draw the same plot with both axes on the log (base 2) scale. What can you say about the plot?

2.2 Expression data in log-ratio scale

> MA = MA.RG(RG, bc.method="none")
> show(MA)

The command MA.RG above creates an object called MA from RG without subtracting the background intensity from the foreground spot intensity. No normalisation is performed at this stage; the command simply computes log-ratios from the RG list.

Answer the following questions:

6. What information is contained in the MA list?
7. What do the matrices M and A represent? Are they on the log scale?
8. Does matrix M contain the log-ratios of Red over Green channels or of Treatment (1 hour) over Control (0 hour)?

9. Draw a scatterplot for the first array, where the horizontal axis is the first column of matrix A and the vertical axis is the first column of matrix M. Repeat this for all the other arrays. What can you say about the plots? What would you expect the distribution of the log-ratios in the figure to look like if many of the genes are not differentially expressed between Control and Treatment? Hint: Use the command par(mfrow=c(2,2)) before drawing the plots, and use the argument pch="." in the function plot().

3 Normalisation

The object MA above contains log-ratios of foreground intensities without background correction and without normalisation. In this section, we use background-adjusted intensities. The following R command normalises the information contained in RG and stores the result in an object called MA.

> MA = normalizeWithinArrays(RG, method="loess",
+                            span=0.3, bc.method="subtract")

The values in MA are now normalised and background-corrected. The normalisation method used was loess (argument method="loess"). Other available options for this argument are:

- "none": no normalisation is performed;
- "median": median normalisation (see lecture notes);
- "printtiploess": loess normalisation based on the configuration of the microarray printer blocks (this is the default);
- "composite": a combination of loess and printtiploess normalisation;
- "control": normalisation based on control spots;
- "robustspline": normalisation using splines.

Answer the following questions:

10. What information is contained in the object MA?
11. Draw an MA-plot (the type of plot in Question 9) from the object MA for all arrays. What can you say about the plots? Hint: You may use the function plotMA(MA, array=n), where n is the n-th array to be plotted (the n-th column of M).

4 Linear models for cDNA microarray data

In this section, we fit a linear model to the microarray data.
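As a preview of the per-gene computation this section works towards, here is a minimal base-R sketch on simulated log-ratios. It is not the LPS analysis: the matrix `M_sim`, the design vector `x`, and every other object name are hypothetical, the data are random numbers, and limma is not used. Under those assumptions, it fits the no-intercept least-squares model to each row, forms the standard error as the residual standard deviation times the square root of (X'X)^-1, and compares one gene with lm().

```r
# Simulated example only: 5 hypothetical genes on 4 hypothetical arrays.
set.seed(42)
x <- c(-1, 1, 1, -1)                 # assumed design vector (single column of X)
n_genes <- 5
M_sim <- matrix(rnorm(n_genes * length(x), sd = 0.3), nrow = n_genes)
M_sim[1, ] <- M_sim[1, ] + 2 * x     # make gene 1 differentially expressed

# Least squares without intercept: beta_hat = (X'X)^{-1} X'y for each row y.
xtx_inv  <- 1 / sum(x^2)             # (X'X)^{-1} is a scalar for a one-column X
beta_hat <- as.vector(M_sim %*% x) * xtx_inv

# Residual standard deviation per gene, with n - 1 degrees of freedom.
df_resid  <- length(x) - 1
resid_mat <- M_sim - outer(beta_hat, x)
sigma_hat <- sqrt(rowSums(resid_mat^2) / df_resid)

# Standard error, t-statistic and two-sided p-value per gene.
se_beta <- sigma_hat * sqrt(xtx_inv)
t_stat  <- beta_hat / se_beta
p_val   <- 2 * pt(-abs(t_stat), df = df_resid)

# Cross-check gene 1 against lm() with the intercept removed.
y1 <- M_sim[1, ]
lm_row <- summary(lm(y1 ~ -1 + x))$coefficients[1, ]
manual_row <- c(beta_hat[1], se_beta[1], t_stat[1], p_val[1])
print(all.equal(unname(lm_row), manual_row))
```

The vectorised form avoids calling lm() once per gene, which matters when the matrix has thousands of rows; the single lm() call is kept only as a consistency check.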
After the normalisation described in Section 3, the log-ratio of expression of RED over GREEN (the software default; the rows of matrix M in MA) can be modelled as

$$y = X\beta + \varepsilon,$$

where $X$ is the design matrix, constructed so that $\beta$ represents the differential expression of Treatment (1 hr.) over Control (0 hr.) in these arrays (see the lecture handout). $\beta$ is our main interest here: a parameter of differential expression between two biological groups.

To make $\beta$ represent the differential expression of Treatment (1 hr.) over Control (0 hr.), we need to look at how R lays out the log-ratio data and at the experimental design (file LPS-info.txt):

> colnames(MA$M)
[1] "355-2" "355-5" "358-3" "358-7"
> exp.design <- read.table(file="LPS-info.txt", header=T)
> exp.design
  Array Green Red

The above outputs indicate that the order of the files in the object MA is 355-2, 355-5, 358-3, 358-7. According to the experimental design, the log-ratios of RED over GREEN in M, in this order, correspond to the log-ratios of Control (0 hr.) over Treatment (1 hr.), Treatment over Control, Treatment over Control, and Control over Treatment. Therefore, to make $\beta$ represent the differential expression of Treatment over Control, the design matrix must be

$$X = \begin{pmatrix} -1 \\ 1 \\ 1 \\ -1 \end{pmatrix}.$$

Had we set

$$X = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix},$$

then $\beta$ would represent the differential expression of RED over GREEN instead of Treatment over Control (remember that $y$ is a vector of log-ratios of RED over GREEN and corresponds to a row of matrix M in object MA). We continue the analysis with the following commands:

> design.matrix = c(-1,1,1,-1)
> fit = lmFit(MA, design=design.matrix)
> fit

The above commands fit a linear model (using least squares) to each row of matrix M in object MA with design matrix $X$. They do not perform any test or calculate any test statistic. By default, the limma package uses an empirical Bayes approach to calculate a test statistic (the moderated t-statistic). Our interest here is to calculate the test statistic

$$t_g = \frac{\hat{\beta}}{SE(\hat{\beta})}. \tag{1}$$

To get this statistic, we need to compute it either from the information available in the object fit or by using the standard function lm() on each row of matrix M in object MA (the latter is left as an exercise, see Question 15 below). The object fit contains $\hat{\beta}$ (component coefficients), the square root of the matrix $(X^\top X)^{-1}$ (component stdev.unscaled), and $\hat{\sigma}$ (component sigma), where $\hat{\sigma}$ is the estimate of the square root of the error variance. From this information we can compute the standard error of $\hat{\beta}$ as the product of the components stdev.unscaled and sigma (see the handout from Lecture 7).

Do the following tasks:

12. Calculate the test statistic $t_g$ in Equation (1), and save it as an object called tg in your R session. (Note that tg should be a vector whose length equals the number of rows of the matrix M in object MA.)
13. Calculate the two-sided p-value of the statistic, and save it in an object called pval.tg in your R session. (Note that the degrees of freedom for each gene are contained in the component df.residual of the object fit.)
14. Create a data.frame object in R, called result.table, whose columns contain the following information: Gene ID, $\hat{\beta}$, $SE(\hat{\beta})$, $t_g$, and the p-value of $t_g$. Hint: Information on gene ID can be found in the component genes of the object fit.
15. We can use the standard R function lm() to estimate $\hat{\beta}$ and the p-value for each gene, based on the design matrix $X$. Verify this by analysing the 100-th gene in the list (the 100-th row of matrix M in object MA), and show that the summary of the model fitted using lm() contains the same information as the 100-th row of the object result.table. Hint: By default, lm() adds an intercept to the model. In fitting our model, remove the intercept by adding the term -1 to the model formula before the design matrix.
16. Sort the data frame result.table so that the gene with the smallest p-value is at the top, followed by the second most significant gene, and so forth. Show the top 10 genes, and put this in your report.

5 Two-sample t-test for single-colour arrays

To explore the use of the two-sample t-test with single-colour Affymetrix array data, we first download the R workspace file (ending with .rdata) from the module webpage: go to the section Datasets and then Breast cancer dataset. Save the file in your Data folder within your My Documents folder. Load the .rdata file into your R session, and check that it contains the objects er and x.

The object x is a matrix of expression values, where the rows correspond to the genes/probesets and the columns correspond to the arrays. Since each single-colour array contains expression information from one sample/individual, the columns also correspond to the breast cancer patients. The data in object x are already normalised and on the log scale. The object er contains information on the ER (estrogen receptor) status of the patients. It indicates that the first 5 columns of x are from ER-positive patients (er value 1) and the remaining 5 columns are from ER-negative patients (er value 0). Verify this information by checking the details of the objects er and x.

Our interest in this study is to identify genes that are differentially expressed between ER-positive and ER-negative patients. Do the following tasks:

17. Calculate two-sample t-statistics of differential expression between ER-positive and ER-negative patients under the assumption of equal variance between the two groups. Save the result in an object called t2. Note that t2 should be a vector whose length is the same as the number of rows of x. Hint: Use the argument var.equal=T in the t-test.
18. Compute the p-values associated with the t-statistics, and save them in an object called pval.t2.
19. Create a data.frame object in R, called result.table2, whose columns contain the following information: Gene ID, t-statistic, and p-value. Hint: Information on gene ID can be found in the row names of matrix x.
20. Sort the data frame result.table2 so that the gene with the smallest p-value is at the top, followed by the second most significant gene, and so forth. Show the top 10 genes, and put this in your report.
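The row-wise two-sample t-test of this section can be pictured with a small base-R sketch on simulated data. This is only an illustration under stated assumptions: the matrix `x_sim`, the status vector `group_sim`, the helper `one_gene()`, and the group sizes are all hypothetical and are not the breast cancer dataset. The sketch applies t.test() with var.equal=TRUE to each row, collects the t-statistics and p-values in a data frame, sorts by p-value, and checks one row against the pooled-variance formula.

```r
# Simulated example only: 6 hypothetical genes, 5 + 5 hypothetical samples.
set.seed(7)
n_genes   <- 6
group_sim <- rep(c(1, 0), each = 5)          # 1 = first group, 0 = second group
x_sim <- matrix(rnorm(n_genes * length(group_sim)), nrow = n_genes,
                dimnames = list(paste0("gene", seq_len(n_genes)), NULL))
x_sim[1, group_sim == 1] <- x_sim[1, group_sim == 1] + 3   # shift gene1 in group 1

# Equal-variance two-sample t-test for a single row of the matrix.
one_gene <- function(y) {
  tt <- t.test(y[group_sim == 1], y[group_sim == 0], var.equal = TRUE)
  c(t = unname(tt$statistic), p = tt$p.value)
}

# apply() returns one column per gene, so transpose to one row per gene.
res <- t(apply(x_sim, 1, one_gene))

# Collect the results and sort with the smallest p-value first.
tab <- data.frame(gene = rownames(x_sim), t = res[, "t"], p.value = res[, "p"])
tab <- tab[order(tab$p.value), ]
print(head(tab, 3))

# Cross-check gene1 against the pooled-variance formula.
y1 <- x_sim[1, group_sim == 1]
y0 <- x_sim[1, group_sim == 0]
s_pooled <- sqrt(((length(y1) - 1) * var(y1) + (length(y0) - 1) * var(y0)) /
                 (length(y1) + length(y0) - 2))
t_manual <- (mean(y1) - mean(y0)) /
            (s_pooled * sqrt(1 / length(y1) + 1 / length(y0)))
print(all.equal(unname(res["gene1", "t"]), t_manual))
```

Calling t.test() once per row is simple to read but slow for many thousands of probesets; the pooled-variance formula used in the check can be vectorised with rowMeans() if speed becomes an issue.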


Package ffpe. October 1, 2018

Package ffpe. October 1, 2018 Type Package Package ffpe October 1, 2018 Title Quality assessment and control for FFPE microarray expression data Version 1.24.0 Author Levi Waldron Maintainer Levi Waldron

More information

1 StatLearn Practical exercise 5

1 StatLearn Practical exercise 5 1 StatLearn Practical exercise 5 Exercise 1.1. Download the LA ozone data set from the book homepage. We will be regressing the cube root of the ozone concentration on the other variables. Divide the data

More information

Name: THE SIMPLEX METHOD: STANDARD MAXIMIZATION PROBLEMS

Name: THE SIMPLEX METHOD: STANDARD MAXIMIZATION PROBLEMS Name: THE SIMPLEX METHOD: STANDARD MAXIMIZATION PROBLEMS A linear programming problem consists of a linear objective function to be maximized or minimized subject to certain constraints in the form of

More information

x = 12 x = 12 1x = 16

x = 12 x = 12 1x = 16 2.2 - The Inverse of a Matrix We've seen how to add matrices, multiply them by scalars, subtract them, and multiply one matrix by another. The question naturally arises: Can we divide one matrix by another?

More information

Using metama for differential gene expression analysis from multiple studies

Using metama for differential gene expression analysis from multiple studies Using metama for differential gene expression analysis from multiple studies Guillemette Marot and Rémi Bruyère Modified: January 28, 2015. Compiled: January 28, 2015 Abstract This vignette illustrates

More information

Building R objects from ArrayExpress datasets

Building R objects from ArrayExpress datasets Building R objects from ArrayExpress datasets Audrey Kauffmann October 30, 2017 1 ArrayExpress database ArrayExpress is a public repository for transcriptomics and related data, which is aimed at storing

More information

Lecture 13: Model selection and regularization

Lecture 13: Model selection and regularization Lecture 13: Model selection and regularization Reading: Sections 6.1-6.2.1 STATS 202: Data mining and analysis October 23, 2017 1 / 17 What do we know so far In linear regression, adding predictors always

More information

Nature Publishing Group

Nature Publishing Group Figure S I II III 6 7 8 IV ratio ssdna (S/G) WT hr hr hr 6 7 8 9 V 6 6 7 7 8 8 9 9 VII 6 7 8 9 X VI XI VIII IX ratio ssdna (S/G) rad hr hr hr 6 7 Chromosome Coordinate (kb) 6 6 Nature Publishing Group

More information

For more info and downloads go to: Gerrit Stols

For more info and downloads go to:   Gerrit Stols For more info and downloads go to: http://school-maths.com Gerrit Stols Acknowledgements GeoGebra is dynamic mathematics open source (free) software for learning and teaching mathematics in schools. It

More information

Gene Expression an Overview of Problems & Solutions: 1&2. Utah State University Bioinformatics: Problems and Solutions Summer 2006

Gene Expression an Overview of Problems & Solutions: 1&2. Utah State University Bioinformatics: Problems and Solutions Summer 2006 Gene Expression an Overview of Problems & Solutions: 1&2 Utah State University Bioinformatics: Problems and Solutions Summer 2006 Review DNA mrna Proteins action! mrna transcript abundance ~ expression

More information

Sec 4.1 Coordinates and Scatter Plots. Coordinate Plane: Formed by two real number lines that intersect at a right angle.

Sec 4.1 Coordinates and Scatter Plots. Coordinate Plane: Formed by two real number lines that intersect at a right angle. Algebra I Chapter 4 Notes Name Sec 4.1 Coordinates and Scatter Plots Coordinate Plane: Formed by two real number lines that intersect at a right angle. X-axis: The horizontal axis Y-axis: The vertical

More information

Stat 8053, Fall 2013: Additive Models

Stat 8053, Fall 2013: Additive Models Stat 853, Fall 213: Additive Models We will only use the package mgcv for fitting additive and later generalized additive models. The best reference is S. N. Wood (26), Generalized Additive Models, An

More information

Topic. Section 4.1 (3, 4)

Topic. Section 4.1 (3, 4) Topic.. California Standards: 6.0: Students graph a linear equation and compute the x- and y-intercepts (e.g., graph x + 6y = ). They are also able to sketch the region defined by linear inequality (e.g.,

More information

Applied Regression Modeling: A Business Approach

Applied Regression Modeling: A Business Approach i Applied Regression Modeling: A Business Approach Computer software help: SAS SAS (originally Statistical Analysis Software ) is a commercial statistical software package based on a powerful programming

More information

Introduction to Minitab 1

Introduction to Minitab 1 Introduction to Minitab 1 We begin by first starting Minitab. You may choose to either 1. click on the Minitab icon in the corner of your screen 2. go to the lower left and hit Start, then from All Programs,

More information

Rational functions, like rational numbers, will involve a fraction. We will discuss rational functions in the form:

Rational functions, like rational numbers, will involve a fraction. We will discuss rational functions in the form: Name: Date: Period: Chapter 2: Polynomial and Rational Functions Topic 6: Rational Functions & Their Graphs Rational functions, like rational numbers, will involve a fraction. We will discuss rational

More information

hp calculators hp 39g+ & hp 39g/40g Using Matrices How are matrices stored? How do I solve a system of equations? Quick and easy roots of a polynomial

hp calculators hp 39g+ & hp 39g/40g Using Matrices How are matrices stored? How do I solve a system of equations? Quick and easy roots of a polynomial hp calculators hp 39g+ Using Matrices Using Matrices The purpose of this section of the tutorial is to cover the essentials of matrix manipulation, particularly in solving simultaneous equations. How are

More information

Analysis of (cdna) Microarray Data: Part I. Sources of Bias and Normalisation

Analysis of (cdna) Microarray Data: Part I. Sources of Bias and Normalisation Analysis of (cdna) Microarray Data: Part I. Sources of Bias and Normalisation MICROARRAY ANALYSIS My (Educated?) View 1. Data included in GEXEX a. Whole data stored and securely available b. GP3xCLI on

More information

The Power and Sample Size Application

The Power and Sample Size Application Chapter 72 The Power and Sample Size Application Contents Overview: PSS Application.................................. 6148 SAS Power and Sample Size............................... 6148 Getting Started:

More information

Mastery. PRECALCULUS Student Learning Targets

Mastery. PRECALCULUS Student Learning Targets PRECALCULUS Student Learning Targets Big Idea: Sequences and Series 1. I can describe a sequence as a function where the domain is the set of natural numbers. Connections (Pictures, Vocabulary, Definitions,

More information

This assignment is due the first day of school. Name:

This assignment is due the first day of school. Name: This assignment will help you to prepare for Geometry A by reviewing some of the topics you learned in Algebra 1. This assignment is due the first day of school. You will receive homework grades for completion

More information

Section Graphs and Lines

Section Graphs and Lines Section 1.1 - Graphs and Lines The first chapter of this text is a review of College Algebra skills that you will need as you move through the course. This is a review, so you should have some familiarity

More information

PROCEDURE HELP PREPARED BY RYAN MURPHY

PROCEDURE HELP PREPARED BY RYAN MURPHY Module on Microarray Statistics for Biochemistry: Metabolomics & Regulation Part 2: Normalization of Microarray Data By Johanna Hardin and Laura Hoopes Instructions and worksheet to be handed in NAME Lecture/Discussion

More information

Recitation Handout 10: Experiments in Calculus-Based Kinetics

Recitation Handout 10: Experiments in Calculus-Based Kinetics Math 120 Winter 2009 Recitation Handout 10: Experiments in Calculus-Based Kinetics Today s recitation will focus on curve sketching. These are problems where you information about the first and second

More information

This is called the vertex form of the quadratic equation. To graph the equation

This is called the vertex form of the quadratic equation. To graph the equation Name Period Date: Topic: 7-5 Graphing ( ) Essential Question: What is the vertex of a parabola, and what is its axis of symmetry? Standard: F-IF.7a Objective: Graph linear and quadratic functions and show

More information

Package TilePlot. February 15, 2013

Package TilePlot. February 15, 2013 Package TilePlot February 15, 2013 Type Package Title Characterization of functional genes in complex microbial communities using tiling DNA microarrays Version 1.3 Date 2011-05-04 Author Ian Marshall

More information

Cluster Analysis for Microarray Data

Cluster Analysis for Microarray Data Cluster Analysis for Microarray Data Seventh International Long Oligonucleotide Microarray Workshop Tucson, Arizona January 7-12, 2007 Dan Nettleton IOWA STATE UNIVERSITY 1 Clustering Group objects that

More information

ft-uiowa-math2550 Assignment HW8fall14 due 10/23/2014 at 11:59pm CDT 3. (1 pt) local/library/ui/fall14/hw8 3.pg Given the matrix

ft-uiowa-math2550 Assignment HW8fall14 due 10/23/2014 at 11:59pm CDT 3. (1 pt) local/library/ui/fall14/hw8 3.pg Given the matrix me me Assignment HW8fall4 due /23/24 at :59pm CDT ft-uiowa-math255 466666666666667 2 Calculate the determinant of 6 3-4 -3 D - E F 2 I 4 J 5 C 2 ( pt) local/library/ui/fall4/hw8 2pg Evaluate the following

More information

Quadratic Functions Dr. Laura J. Pyzdrowski

Quadratic Functions Dr. Laura J. Pyzdrowski 1 Names: (8 communication points) About this Laboratory A quadratic function in the variable x is a polynomial where the highest power of x is 2. We will explore the domains, ranges, and graphs of quadratic

More information

Review for Mastery Using Graphs and Tables to Solve Linear Systems

Review for Mastery Using Graphs and Tables to Solve Linear Systems 3-1 Using Graphs and Tables to Solve Linear Systems A linear system of equations is a set of two or more linear equations. To solve a linear system, find all the ordered pairs (x, y) that make both equations

More information

Math 121. Graphing Rational Functions Fall 2016

Math 121. Graphing Rational Functions Fall 2016 Math 121. Graphing Rational Functions Fall 2016 1. Let x2 85 x 2 70. (a) State the domain of f, and simplify f if possible. (b) Find equations for the vertical asymptotes for the graph of f. (c) For each

More information

Section 4.1 Review of Quadratic Functions and Graphs (3 Days)

Section 4.1 Review of Quadratic Functions and Graphs (3 Days) Integrated Math 3 Name What can you remember before Chapter 4? Section 4.1 Review of Quadratic Functions and Graphs (3 Days) I can determine the vertex of a parabola and generate its graph given a quadratic

More information

Package snm. July 20, 2018

Package snm. July 20, 2018 Type Package Title Supervised Normalization of Microarrays Version 1.28.0 Package snm July 20, 2018 Author Brig Mecham and John D. Storey SNM is a modeling strategy especially designed

More information

Non-Linear Regression. Business Analytics Practice Winter Term 2015/16 Stefan Feuerriegel

Non-Linear Regression. Business Analytics Practice Winter Term 2015/16 Stefan Feuerriegel Non-Linear Regression Business Analytics Practice Winter Term 2015/16 Stefan Feuerriegel Today s Lecture Objectives 1 Understanding the need for non-parametric regressions 2 Familiarizing with two common

More information

CSSS 510: Lab 2. Introduction to Maximum Likelihood Estimation

CSSS 510: Lab 2. Introduction to Maximum Likelihood Estimation CSSS 510: Lab 2 Introduction to Maximum Likelihood Estimation 2018-10-12 0. Agenda 1. Housekeeping: simcf, tile 2. Questions about Homework 1 or lecture 3. Simulating heteroskedastic normal data 4. Fitting

More information