Microarray Excel Hands-on Workshop Handout

Similar documents
Pre-Lab Excel Problem

Spreadsheet Warm Up for SSAC Geology of National Parks Modules, 2: Elementary Spreadsheet Manipulations and Graphing Tasks

addition + =5+C2 adds 5 to the value in cell C2 multiplication * =F6*0.12 multiplies the value in cell F6 by 0.12

Statistics with a Hemacytometer

Microsoft Excel Using Excel in the Science Classroom

Basic tasks in Excel 2013

Math 227 EXCEL / MEGASTAT Guide

3/31/2016. Spreadsheets. Spreadsheets. Spreadsheets and Data Management. Unit 3. Can be used to automatically

Chapter 3: Rate Laws Excel Tutorial on Fitting logarithmic data

KINETICS CALCS AND GRAPHS INSTRUCTIONS

SPREADSHEET (Excel 2007)

Data Management Project Using Software to Carry Out Data Analysis Tasks

Excel Tables and Pivot Tables

Tricking it Out: Tricks to personalize and customize your graphs.

Excel Primer CH141 Fall, 2017

INSTRUCTIONS FOR USING MICROSOFT EXCEL PERFORMING DESCRIPTIVE AND INFERENTIAL STATISTICS AND GRAPHING

Data Should Not be a Four Letter Word Microsoft Excel QUICK TOUR

Chapter 4. Microsoft Excel

Rockefeller College MPA Excel Workshop: Clinton Impeachment Data Example

Starting Excel application

Chart Wizard: Step 1 (Chart Types)

SUM - This says to add together cells F28 through F35. Notice that it will show your result is

Survey of Math: Excel Spreadsheet Guide (for Excel 2016) Page 1 of 9

1 Introduction to Using Excel Spreadsheets

Basics of Spreadsheet

Business Process Procedures

Intermediate Excel 2003

A Brief Word About Your Exam

San Francisco State University

MS Office for Engineers

Microsoft Excel Basics Ben Johnson

Application of Skills: Microsoft Excel 2013 Tutorial

How to Excel - Part 2

MOVING FROM CELL TO CELL

Working with Census Data Excel 2013

Intermediate Excel 2016

Generating a Custom Bill of Materials

UW Department of Chemistry Lab Lectures Online

Chemistry Excel. Microsoft 2007

Excel Level 1

Experiment 1 CH Fall 2004 INTRODUCTION TO SPREADSHEETS

Introduction to Excel Workshop

AcqDemo Pay Pool Analysis Tool (PAT) User Guide

Microsoft Excel 2007

Excel Core Certification

Dealing with Data in Excel 2013/2016

Tips & Tricks: MS Excel

EXCEL 2007 TIP SHEET. Dialog Box Launcher these allow you to access additional features associated with a specific Group of buttons within a Ribbon.

Introduction to Excel Workshop

Excel. Spreadsheet functions

ADD AND NAME WORKSHEETS

Introduction to the workbook and spreadsheet

Faculty Guide to Grade Center in Blackboard 9.1

Basic Microsoft Excel 2007

Standardized Tests: Best Practices for the TI-Nspire CX

Contents Part I: Background Information About This Handbook... 2 Excel Terminology Part II: Advanced Excel Tasks...

How to use Excel Spreadsheets for Graphing

Chemistry 30 Tips for Creating Graphs using Microsoft Excel

Microsoft Office Excel 2003

1. Setup Everyone: Mount the /geobase/geo5215 drive and add a new Lab4 folder in you Labs directory.

Logger Pro Resource Sheet

Introduction to Excel

SPREADSHEETS. (Data for this tutorial at

Technology Assignment: Scatter Plots

Homework 1 Excel Basics

CSSCR Excel Intermediate 4/13/06 GH Page 1 of 23 INTERMEDIATE EXCEL

MS Excel Advanced Level

Candy is Dandy Project (Project #12)

Open Excel by following the directions listed below: Click on Start, select Programs, and the click on Microsoft Excel.

Lastly, in case you don t already know this, and don t have Excel on your computers, you can get it for free through IT s website under software.

Using Large Data Sets Workbook Version A (MEI)

Microsoft Excel 2007 Creating a XY Scatter Chart

Chemistry 1A Graphing Tutorial CSUS Department of Chemistry

Getting Started With Excel

Customizing the Ribbon

Physics 251 Laboratory Introduction to Spreadsheets

Introduction to Microsoft Excel 2010 Quick Reference Sheet

Office Excel. Charts

Appendix C. Vernier Tutorial

Excel R Tips. is used for multiplication. + is used for addition. is used for subtraction. / is used for division

EDIT202 Spreadsheet Lab Prep Sheet

Introduction to CS databases and statistics in Excel Jacek Wiślicki, Laurent Babout,

Quick Guide for Excel 2015 Data Management November 2015 Training:

Using Tables, Sparklines and Conditional Formatting. Module 5. Adobe Captivate Wednesday, May 11, 2016

Intermediate Microsoft Excel (Demonstrated using Windows XP) Using Spreadsheets in the Classroom

Scientific Graphing in Excel 2013

Exploring extreme weather with Excel - The basics

Microsoft Excel 2010 Tutorial

Excel tutorial Introduction

Introduction to Microsoft Excel

Spreadsheet View and Basic Statistics Concepts

Topics Covered. Create and format a column chart Create and format a pie chart Create and format a line chart Use a trendline Insert a sparkline

DOING MORE WITH EXCEL: MICROSOFT OFFICE 2013

Unit 12. Electronic Spreadsheets - Microsoft Excel. Desired Outcomes

Microsoft Excel Microsoft Excel

Excel Format cells Number Percentage (.20 not 20) Special (Zip, Phone) Font

Chapter 7 Inserting Spreadsheets, Charts, and Other Objects

EXCEL 98 TUTORIAL Chemistry C2407 fall 1998 Andy Eng, Columbia University 1998

LEGENDplex Data Analysis Software Version 8 User Guide

Numbers Basics Website:

Transcription:

Microarray Excel Hands-on Workshop Handout Piali Mukherjee (pim2001@med.cornell.edu; http://icb.med.cornell.edu/) Importing Data Excel allows you to import data in tab, comma or space delimited text formats. Most microarray data is generated in one of these formats. 1. From the menu select File and then Open (or click on the yellow folder icon in the menubar) 2. Browse to the location of the file using the Look in drop down menu and then select Text Files (*.prn; *.txt; *.cvs) from the Files of Type drop down menu. 3. elect the desired text file and click Open. 4. A Text Import Wizard will then guide you through the steps for the specific file type. 5. For example for tab-delimited text files select the Delimited radio button and click Next 6. Then select the Tab radio button and click Next 7. You can then select advanced formats for your columns 8. Click Finish If you want to insert a tab-delimited dataset into an already open excel workbook elect Data Get External Data Import Text File from the menu bar and then follow steps 3-8. electing Data You can select data by either dragging the cursor across the data you want to select or by clicking on one corner of the selection area and then pressing the hift key while clicking on the diagonally opposite corner cell of the selection. To select large datasets it might be easier to use the screen dividers: 1. Click and drag the dividers on the right hand top or bottom corners (see arrow) of the spreadsheet to the desired position in the dataset. 2. Using the hift key you can now select the top left hand corner and diagonally opposite (right bottom) corner of the dataset you want to select (see figure below). Click on the gray area with a letter above a column to select the entire column or Click on the gray row number to select an entire row. Clicking on the control Ctrl key while selecting allows you to select discontinuous cells or blocks of cells while clicking on the hift key allows you to select continuous blocks of cells.

Click here first Then press the hift key and click here orting Data First select the data you would like to sort and then go to Data ort in the menu bar. A ort menu should pop up allowing you to sort by the desired column. Filtering Data imilarly to filter data go to Data Filter AutoFilter or Advanced Filter in the menu bar. Calculation For basic calculations (sum, averages, logs, ratios, standard deviations etc.): 1. Click the cell where you want your result to appear and then click on the = (equal to sign) in the menu. 2. Next from the function drop down menu pick the function appropriate for your calculation (UM, AVERAGE, LOG etc.) and then follow the instructions for the specific calculation. 3. For example, to average the numbers in cells C2 D2 and display the result in cell F2 you can select the cells C2-D2 (by dragging your cursor or by using the hift key) or you can write C2:D2 in the space provided and click OK. To extend the same calculation to several cells you can click and drag the square from the corner of the formula-containing cell to the adjacent cells where you want the same formula to be applied OR you can cut a calculation from one cell and paste it to the whole column. Most statistical analysis like T-Tests, Geometric Means, correlation etc. can be accessed as described above. To access more analysis options (ANOVA, Histogram etc.) you can also go to Tools Data Analysis. (This requires installation of an Analysis Pack that is available on your Microsoft Office CD)

Graphs To draw a basic scatter plot, select the data you want to graph and click on the graph icon in the menubar or go t o Insert Chart then pick the kind of graph you want to plot (for example XY catter plot) and click Next. Then you can select details for your chart such as axes labels, legends etc. T o plot a linear regression line in your scatter plot, right click a data point on the graph and select Add Trendline from the resulting pull down menu. To display the R squared (regressio n coefficient) value on the chart click the appropriate radio button in the Options tab. T ip Right- clicking on different areas of the chart allows you to customize the look of your chart. AM (ignificance Analysis of Microarrays) A vailable at: http://www-stat.stanford.edu/~tibs/am/ (free for academic users) AM uses repeated permutations of the data to determine if the expression of given genes/clones are significantly related to the response (treated vs. untreated etc.). The cutoff for significance is determined by tuning a parameter delta, chosen by the user based on the false positive rate (also called false significance). 1. Download and install the latest stable release of AM as per the instructions on the web site. 2. Format normalized dataset for AM. The AM installation includes detailed documentation and example datasets for each supported experiment type. [Note: The default location for the AM installation is C:\Program Files\AMVB\. The doc folder contains the manual (PDF format) and the Examples folder contains the example datasets (excel format)] 3. elect the block of data for analysis (including the identifier columns but excluding the top row with labels). [Note: AM allows 2 columns of identifier information. The second column will be linked to the OURCE web site so this should be the accession numbers or gene ids or clone ids etc.] 4. Click the AM button on the Excel menu bar 5. Choose the type of experiment (The example dataset is a two class unpaired data experiment) and fill in the options available as desired. 6. If your data is in several worksheets in the same workbook select the desired sheets visible in the Additional heets text area.

7. Click OK Notes 8. A AM plot will be generated along with a AM Plot Control. Adjust the delt a value slider in the AM Plo t Controller. Different delta values give you different numbers of significant and false significant genes (see arrow). Ideally choose a del ta value that gives you the smallest false significant value with the largest number of significant genes 9. You can also spec ify a Fold Change at this point although this is not ne cessary. 10. Finally you can click the List ignificant Genes button to get a new worksheet with a list of your significantly differentially expressed genes. This list (report) also provides links to the source web site for your specified ID (in the case of our example Accession numbers) AM score (d) The T-statistic value. Numerator (r ) The numerator of the T-statistic (a score). Denominator (s + s0) The denominator of the T-statistic (correct ed standard deviation). q-value This is the lowest False Disco very Rate at which the gene is called significant. It is like the familiar p-value, adapted to the analysis of a large number of genes. The q-value measure s how significant the gene is. As d increases, the correspon ding q-value decreases. The data is put through sev eral (we can specify how many) permutations and the d value is calculated. A delta grid is then calculated from a comparison of the d values from different permutations and a number of false significant gene s (the false discovery rate) is calculated for each delta value. Detailed information about the methods and algorithms used in AM can be found in the manual located at: C:\Program Files\AMVB\doc\ on your computer; or in the publication by Tusher, Tibshirani and Chu (200 1): ignificance analysis of microarrays applied to the ionizing radiation response PNA 2001 98: 5116-5121 amster a mster can take an Excel spreadsheet or text file and extract the raw data into a text output file, which can be fed directly into Cluster or opened in Treeview. This makes it easy to cluster your significant genes from your AM ana lysis. Available at: http://falkow.stanford.edu/whatwedo/software/software.html Tips: Right-clicking on the label or tab of a worksheet allows you to move, copy, delete or rename a worksheet. Right-clicking the gray column letter or row number allows you to insert a new or copied column or row. After you have copied from a cell or column with formulae if you want to paste the value of the cell without the formula go to Edit Paste pecial and select the Values radio button. You can also use Paste pecial to transpose what you have copied (for example from a row to a column) You can access Help at any time by going to Help Microsoft Excel Help or by clicking F1 Microsoft Excel Official ite: http://www.microsoft.com/office/excel/ Excel hortcuts: http://www.asap-utilities.com/index.php?page=/p_sh.php Microarray Analysis Tutorial (Jonathan Pevsner): http://pevsnerlab.kennedykrieger.org/hinxton.html Excel tutorials on the web: http://www.usd.edu/trio/tut/excel/index.html http://www.baycongroup.com/el0.htm http://www.georgetown.edu/departmen ts/psychology/researchmethods/computer/excel.htm

Practice Exercises: The ample data is in tab delimited text format (sampledata.txt) containing microarray data for 2 samples A and B where there are 2 replicate experiments for each A and B. The experiments were performed using Amersham Codelink oligonucleotide chips and have been median normalized and flagged based on the negative threshold of each chip. The columns are labeled as follows: A1-Raw_Intensity This is the raw intensity A1-Normalized_Intensity This is the median normalized intensity (this is the intensity we will be using most in our exercise) A1-Above_Threshold This is the flag indicating whether the normalized intensity is above or below the negative threshold for the chip. [Note: This is similar to the Affymetrix Present/Absent call] 1. Open the sample data in Excel. 2. Filter out all genes/clones that have intensities below threshold (flagged NO in our sample dataset). 3. Calculate the average of the filtered replicate normalized intensities for A and for B. 4. Plot a scatter plot of the filtered average normalized intensities for A vs. B and calculate the linear regression on the chart. 5. Do a T-Test on the normalized intensities to calculate the p-values for each gene. 6. ort the dataset by ascending p-value. 7. Calculate the log to the base 2 of the average normalized intensities. 8. Calculate the mean log intensity between A and B This is an average of all the logged normalized intensities for a given gene/clone 9. Calculate the Ratio of the logged normalized intensities for A and B. 10. Plot the mean log intensity versus log ratio. 11. Format the dataset of AM and run AM