Lab 13: The What the Hell Do I Do with All These Trees Lab

Similar documents
Lab 12: The What the Hell Do I Do with All These Trees Lab

Lab 06: Support Measures Bootstrap, Jackknife, and Bremer

PRINCIPLES OF PHYLOGENETICS Spring 2008 Updated by Nick Matzke. Lab 11: MrBayes Lab

Lab 8: Using POY from your desktop and through CIPRES

10kTrees - Exercise #2. Viewing Trees Downloaded from 10kTrees: FigTree, R, and Mesquite

Lab 1: Introduction to Mesquite

Distance Methods. "PRINCIPLES OF PHYLOGENETICS" Spring 2006

"PRINCIPLES OF PHYLOGENETICS" Spring 2008

PAUP tutorial - Macintosh

Lab 07: Maximum Likelihood Model Selection and RAxML Using CIPRES

Lab 2. CSE 3, Summer 2010 In this lab you will learn about file structures and advanced features of Microsoft Word.

An introduction to plotting data

Adding content to your Blackboard 9.1 class

Lecture 3 - Template and Vectors

Project 1 Balanced binary

EXCEL BASICS: MICROSOFT OFFICE 2007

Welcome Back! Without further delay, let s get started! First Things First. If you haven t done it already, download Turbo Lister from ebay.

= 3 + (5*4) + (1/2)*(4/2)^2.

Lecture 1: Overview

Excel Basics Rice Digital Media Commons Guide Written for Microsoft Excel 2010 Windows Edition by Eric Miller

Performing Matrix Operations on the TI-83/84

EXCEL BASICS: MICROSOFT OFFICE 2010

9 R1 Get another piece of paper. We re going to have fun keeping track of (inaudible). Um How much time do you have? Are you getting tired?

"PRINCIPLES OF PHYLOGENETICS" Spring Lab 1: Introduction to PHYLIP

Filogeografía BIOL 4211, Universidad de los Andes 25 de enero a 01 de abril 2006

PowerPoint Basics: Create a Photo Slide Show

Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.

Depending on the computer you find yourself in front of, here s what you ll need to do to open SPSS.

The first thing we ll need is some numbers. I m going to use the set of times and drug concentration levels in a patient s bloodstream given below.

Paul's Online Math Notes. Online Notes / Algebra (Notes) / Systems of Equations / Augmented Matricies

Diode Lab vs Lab 0. You looked at the residuals of the fit, and they probably looked like random noise.

1.7 Limit of a Function

DOING MORE WITH EXCEL: MICROSOFT OFFICE 2013

2016 All Rights Reserved

Divisibility Rules and Their Explanations

-Using Excel- *The columns are marked by letters, the rows by numbers. For example, A1 designates row A, column 1.

Getting started with simulating data in R: some helpful functions and how to use them Ariel Muldoon August 28, 2018

Hacking FlowJo VX. 42 Time-Saving FlowJo Shortcuts To Help You Get Your Data Published No Matter What Flow Cytometer It Came From

File Triage. Work Smarter in Word, Excel, & PowerPoint. Neil Malek, MCT-ACI-CTT+

Use the Move tool to drag A around and see how the automatically constructed objects (like G or the perpendicular and parallel lines) are updated.

Three OPTIMIZING. Your System for Photoshop. Tuning for Performance

CHAPTER 1 COPYRIGHTED MATERIAL. Finding Your Way in the Inventor Interface

CSCI 1100L: Topics in Computing Lab Lab 11: Programming with Scratch

_APP A_541_10/31/06. Appendix A. Backing Up Your Project Files

How to Make a Book Interior File

Midterm Exam, October 24th, 2000 Tuesday, October 24th, Human-Computer Interaction IT 113, 2 credits First trimester, both modules 2000/2001

On the Web sun.com/aboutsun/comm_invest STAROFFICE 8 DRAW

Excel 2013 Beyond TheBasics

CS103 Handout 29 Winter 2018 February 9, 2018 Inductive Proofwriting Checklist

How To Get Your Word Document. Ready For Your Editor

Facebook Network Analysis Using Gephi

Table of Laplace Transforms

IGCSE ICT Section 16 Presentation Authoring

Lab 8 Phylogenetics I: creating and analysing a data matrix

2SKILL. Variables Lesson 6. Remembering numbers (and other stuff)...

Software Compare and Contrast

TLMC SHORT CLASS: THESIS FORMATTING

CREATING A POWERPOINT PRESENTATION BASIC INSTRUCTIONS


Computer Basics: Step-by-Step Guide (Session 2)

2. Getting Started When you start GeoGebra, you will see a version of the following window. 1

DOING MORE WITH EXCEL: MICROSOFT OFFICE 2010

Getting started with PowerPoint 2010

How To Upload Your Newsletter

A Step-by-Step Guide to getting started with Hot Potatoes

Lab 7 Unit testing and debugging

Slide 1. Slide 2. The Need. Using Microsoft Excel

ACCT 133 Excel Schmidt Excel 2007 to 2010 Conversion

Introduction. Watch the video below to learn more about getting started with PowerPoint. Getting to know PowerPoint

HORIZONTAL GENE TRANSFER DETECTION

Part 1 - Your First algorithm

Tutorial 1: Basic functions

Part 1 - Your First algorithm

Creating Hair Textures with highlights using The GIMP

Welcome to the world of .

Assignment #1: /Survey and Karel the Robot Karel problems due: 1:30pm on Friday, October 7th

Using GitHub to Share with SparkFun a

Animations involving numbers

Mehran Sahami Handout #7 CS 106A September 24, 2014

(Refer Slide Time 6:48)

How to Write Engaging s

Rev. C 11/09/2010 Downers Grove Public Library Page 1 of 41

Filter and PivotTables in Excel

A quick Matlab tutorial

THE AUDIENCE FOR THIS BOOK. 2 Ajax Construction Kit

John W. Jacobs Technology Center 450 Exton Square Parkway Exton, PA Introduction to

DIRECTV Message Board

Lutheran High North Technology The Finder

Best Practices for. Membership Renewals

Instructions for Using the Databases

Intermediate Word 2013

2013 edition (version 1.1)

If Statements, For Loops, Functions

Getting Started. Excerpted from Hello World! Computer Programming for Kids and Other Beginners

We re going to start with two.csv files that need to be imported to SQL Lite housing2000.csv and housing2013.csv

Using Microsoft Word. Working With Objects

Lab 13: Intro to Geometric Morphometrics

UV Mapping to avoid texture flaws and enable proper shading

4.1 Review - the DPLL procedure

The Institute for the Future of the Book presents. Sophie. Help. 24 June 2008 Sophie 1.0.3; build 31

Transcription:

Integrative Biology 200A University of California, Berkeley Principals of Phylogenetics Spring 2012 Updated by Michael Landis Lab 13: The What the Hell Do I Do with All These Trees Lab We ve generated a lot of trees in the last few weeks. Today we re going to explore different ways to view, compare, and manipulate those trees. First we re going to use tree-viewing software to look at a consensus tree generated in MrBayes. Then we re going to compare a bunch of trees with the same taxa but different topology using PAUP* and generate several types of consensus trees. Finally we re going to generate a consensus tree from trees that have overlapping but not identical taxa using Matrix Representation with Parsimony. We re going to be using many different files for this lab. Find them online at the 200A Syllabus and Handouts page. Note on tree-viewing programs: I believe that TreeViewX is what is available on the Computer lab computers. You may find TreeViewX is extremely limited in functionality. I recommend using FigTree, since it is as simple as TreeViewX but also offers additional features. Splitstree, or Dendroscope are also wonderful, but a little more complicated. Note on PAUP: We now have a PAUP CD, we will try to install PAUP on the Mac laptops at least. If not, use the computer lab, or work with someone who has gotten PAUP installed. To turn in: just email me screen-captures of trees from step 9 and step 26, below. Tree-viewing (with TreeViewX; try to do something similar with FigTree, Splitstree, Dendroscope) By now you will have noticed that most of the programs that generate trees either don t print trees at all or make low-quality text or graphical trees. Luckily there are a lot of different ways to view trees. Today we ll be using TreeViewX (which you have already used a few times.) It seems to be a little quirky. There are some files I have exported with branch lengths from PAUP that nonetheless do not appear to have branch lengths in TreeViewX. You may also want to try plain TreeView if you are a PC or MacClassic user. If something isn t working, it is always worth opening it in Mesquite, exporting it in a few different ways, and seeing if any of them work. Mesquite can also export trees as a PDF, they sometimes come out a bit odd but try adjusting all the parameters. But back to today s lab and TreeViewX: 1. Download MrBayesCephalopodConsensus.nex from the web page. This is a consensus tree generated by MrBayes for the Cephalopod COI dataset we ve been using. It has two trees in it with identical topology. The first has branch lengths and node support values. The second has only branch lengths. 2. Open TreeViewX and open MrBayesCephalopodConsensus.nex in it. 3. Pull down the style menu and change the font size, so that you can easily read the names.

4. Push (the radial tree button) at the top of the page. You will see your tree as an unrooted tree. 5. Push the phylogram button and the internal labels button. The tree should now appear as a square phylogram with branch lengths that can be seen in the lengths of the branches and node support values as numbers. 6. Use the arrow buttons to view the other tree in the file, which in this case is just the same tree without internal node labels. 7. To generate a picture for use in your paper pull down the File menu and select Print Preview. Then click Picture to save that tree as a metafile for use with other programs, or Copy to paste it into another program. Other important tree-manipulations (e.g. FigTree) 1. Click on the Layout item from the list of tools on the left. You can change your tree topology to be represented in rectangular, polar, and radial coordinates. 2. Change the Selection Mode to Node at the top of the window. Select a branch. See what cartoon, collapse, reroot and rotate do. 3. Change the Selection Mode to Clade at the top of the window. Select a clade. Click the Colour button and select a new color. This may be used to highlight a clade with interesting properties or characters. 4. Play around with the features until you get bored, then click on File à Export PDF to save your tree. Tree Distances Now we re going to use PAUP* to generate a number of different tree distance measures on a bunch of trees with the same taxa, like we talked about in class. 5. Download MrBayesCephalopodTprobs.nex from the web site. It contains the first 13 trees from the tprobs file for the cephalopod dataset. These are the most probable 50% of trees that I found during stationarity. It also has the estimated posterior probabilities of those trees. 6. Open PAUP*. 7. Execute MrBayesCephalopodTprobs.nex in PAUP.

8. Now, we will calculate the differences between the trees in the sample using tree-totree differences. For the first run we ll do symmetric differences. This is just a measure of the partitions that the two trees do not share. A branch can be viewed as a partition, because it separates the taxa into two groups. So if branches in both trees separate the taxa into the same two groups, then that partition is shared. Thus more similar trees will disagree on fewer partitions and so have smaller symmetric differences. Type: treedist; This should output a matrix of pairwise differences between the trees, and a frequency distribution of those differences. Are the trees with higher posterior probabilities more like the tree with the highest posterior probability? (Remember these trees are listed in the order of their posterior probabilities, from 1 to 13.) Is the tree with the highest posterior probability more similar to the other trees than they are in general to each other? Why would this be? What trees have the biggest differences? 9. Repeat this analysis, only this time use Agreement metric d. This is a measure of how many taxa you have to remove to make two trees the same with a correction added on so that if the taxa removed are further apart you get a bigger number, thus more similar trees should have smaller differences. Type: treedist metric = agreement; How do the trees compare under this measure of difference? Do the two metrics produce similar histograms? Which metric is more informative?

Consensus Trees Now we re going to generate several different consensus trees from that same tree file using PAUP*. I want to emphasize that this is not the appropriate way to generate a consensus tree from MrBayes. It is much better to use the sumt command in MrBayes, because that will consider the trees based on their estimated posterior probabilities and will also calculate branch lengths. However, there are many other situations when you would want to use this method, such as if you generate several most parsimonious trees. I m just using this tree file, because it is convenient. Last week, when we practiced bootstrap, jackknife, and decay index, we got a bit of experience generating consensus trees. Now we will explore this command in a bit more detail: 10. First let s generate a strict consensus of all the trees in memory. Type: contree all /strict=yes; This will output a tree that only contains nodes present in all your input trees. 11. When generating consensus trees, PAUP* will not hold the consensus trees in memory. If you want to keep the consensus trees, you have to save them to a file. Let s practice this now: contree all /strict=yes treefile=mytree.tre; This will save a consensus tree file to your working directory. 12. Now let s generate a Majority Rule tree with a cut off at 50%. This will output a tree with all the nodes that appear in more than 50% of the tree. It will also tell you in what percentage of those trees the nodes occurred. contree all /strict=no majrule=yes percent=50; Does this have the same topology as the strict consensus? Are they compatible? 13. Generate another Majority Rule tree, only this time up the cut off, so that you eliminate some clades. The cut off point is always kind of arbitrary, but can not be less than 50%. If it were less than 50%, then you couldn t be sure that all the clades are compatible. How high would the cut off have to be to guarantee that you are going to get the same tree as strict consensus? Matrix Representation with Parsimony (MRP) So it s easy to generate consensus trees if they all have exactly the same taxa, but what do you do if all the trees have different taxa? For example how would you put together a bunch of trees from different studies with overlapping but not identical taxa? Well, it is a matter of big debate. Maybe you shouldn t even do it at all. Maybe it is best to take the data matrices from all those studies and combine them into one supermatrix for analysis. If you do decide to combine trees it is not at all clear what the best method is. The mostly commonly used method is Matrix Representation with Parsimony (MRP). Here we re going to do a made up easy example of it. We are going to do MRP on the three trees of rays that you will find on the next page. I just took a single data set of rays, randomly deleted two taxa from it three times, and used those

reduced data sets to make trees by parsimony. This is a totally unrealistic situation for several reasons. The taxa have a lot of overlap. If theses were three trees picked from the literature they would have very little overlap. This would mean that the MRP matrix would have a lot of question marks. Also the trees are all generated from the same data set, so that you know that you won t have a contradictory signal from two different data sets, which you are likely to have in reality. However, I didn t have a time to find a more realistic set of trees, and these will make filling out the matrix easier. 14. Download the Ray matrix from the web site. Open it in Mesquite. This is just an empty matrix with the taxa names on it (and apparently lots of spelling mistakes), so that you don t have to bother filling them all in. You are just going to fill in the data. 15. Now code the three trees into the matrix. You do this by treating each interior branch from each tree as a separate character. Remember that every branch separates the taxa into two groups, one on each side of the branch. You can assign each of these partitions a separate character state, so that all the taxa on one side of a branch get a 0 and the other side a 1 for that character. All the taxa that don t appear in that tree should get a?. Every tree should have 6 internal branches. 16. For example the branch that I marked as A in the first tree should be coded: Raja polystigma 1 Raja montagui 1 Raja brachyura 1 Raja microocellata 1 Raja asterias 0 Raja undulate 0 Raja radula? Raja clavata? Leucoraja meitensis 0 Leucoraja naevus 0 Leucoraja fullonica 0 17. When you re done with the matrix save it and close Mesquite. 18. Open PAUP* and open the MRP matrix in it. 19. Run an exhaustive search for the most parsimonious tree. Type: alltrees; 20. Did you get one tree? Was it fully resolved? Is there any homoplasy (which in this case would indicate a disagreement between the trees)? 21. You should save a copy of this tree, and email it to me.