Using the UCSC genome browser

Similar documents
The UCSC Genome Browser

Tutorial 1: Exploring the UCSC Genome Browser

Genomic Analysis with Genome Browsers.

Genome Browsers - The UCSC Genome Browser

The UCSC Gene Sorter, Table Browser & Custom Tracks

Genome Browsers Guide

Introduction to Genome Browsers

Exercise 2: Browser-Based Annotation and RNA-Seq Data

Genomics 92 (2008) Contents lists available at ScienceDirect. Genomics. journal homepage:

Advanced UCSC Browser Functions

The UCSC Genome Browser

A short Introduction to UCSC Genome Browser

Wilson Leung 05/27/2008 A Simple Introduction to NCBI BLAST

Sequence Alignment. GBIO0002 Archana Bhardwaj University of Liege

Wilson Leung 01/03/2018 An Introduction to NCBI BLAST. Prerequisites: Detecting and Interpreting Genetic Homology: Lecture Notes on Alignment

UCSC Genome Browser ASHG 2014 Workshop

UCSC Genome Browser Pittsburgh Workshop -- Practical Exercises

Part 1: How to use IGV to visualize variants

Browser Exercises - I. Alignments and Comparative genomics

The UCSC Genome Browser: What Every Molecular Biologist Should Know

Tutorial: chloroplast genomes

When we search a nucleic acid databases, there is no need for you to carry out your own six frame translation. Mascot always performs a 6 frame

CLC Server. End User USER MANUAL

4.1. Access the internet and log on to the UCSC Genome Bioinformatics Web Page (Figure 1-

The UCSC Genome Browser

SPAR outputs and report page

VectorBase Web Apollo April Web Apollo 1

RNA-Seq Analysis With the Tuxedo Suite

Tutorial: How to use the Wheat TILLING database

INTRODUCTION TO BIOINFORMATICS

MIRING: Minimum Information for Reporting Immunogenomic NGS Genotyping. Data Standards Hackathon for NGS HACKATHON 1.0 Bethesda, MD September

Practical Course in Genome Bioinformatics

Database Searching Using BLAST

Topics of the talk. Biodatabases. Data types. Some sequence terminology...

Design and Annotation Files

Tutorial 4 BLAST Searching the CHO Genome

INTRODUCTION TO BIOINFORMATICS

Tutorial: Resequencing Analysis using Tracks

Exon Probeset Annotations and Transcript Cluster Groupings

How to use KAIKObase Version 3.1.0

General Help & Instructions to use with Examples

Bioinformatics for Biologists

How to use earray to create custom content for the SureSelect Target Enrichment platform. Page 1

KisSplice. Identifying and Quantifying SNPs, indels and Alternative Splicing Events from RNA-seq data. 29th may 2013

Tutorial for Windows and Macintosh. Trimming Sequence Gene Codes Corporation

Overview. Dataset: testpos DNA: CCCATGGTCGGGGGGGGGGAGTCCATAACCC Num exons: 2 strand: + RNA (from file): AUGGUCAGUCCAUAA peptide (from file): MVSP*

GenomeStudio Software Release Notes

Geneious 5.6 Quickstart Manual. Biomatters Ltd

Lecture 5 Advanced BLAST

epigenomegateway.wustl.edu

ChIP-seq hands-on practical using Galaxy

As of August 15, 2008, GenBank contained bases from reported sequences. The search procedure should be

Integrated Genome browser (IGB) installation

ChromHMM: automating chromatin-state discovery and characterization

BovineMine Documentation

ChIP-Seq Tutorial on Galaxy

MacVector for Mac OS X

QIAseq Targeted RNAscan Panel Analysis Plugin USER MANUAL

Aligning reads: tools and theory

From Smith-Waterman to BLAST

Table of contents Genomatix AG 1

Analyzing ChIP- Seq Data in Galaxy

ChIP-seq (NGS) Data Formats

Multiple Sequence Alignment Gene Finding, Conserved Elements

Useful software utilities for computational genomics. Shamith Samarajiwa CRUK Autumn School in Bioinformatics September 2017

INTRODUCTION TO CONSED

Dr. Gabriela Salinas Dr. Orr Shomroni Kaamini Rhaithata

Today's outline. Resources. Genome browser components. Genome browsers: Discovering biology through genomics. Genome browser tutorial materials

Ion AmpliSeq Designer: Getting Started

NCBI News, November 2009

Public Repositories Tutorial: Bulk Downloads

Computational Genomics and Molecular Biology, Fall

2. Take a few minutes to look around the site. The goal is to familiarize yourself with a few key components of the NCBI.

Tour Guide for Windows and Macintosh

m6aviewer Version Documentation

- 1 - Web page:

Genetics 211 Genomics Winter 2014 Problem Set 4

CS313 Exercise 4 Cover Page Fall 2017

Getting Started. April Strand Life Sciences, Inc All rights reserved.

Genome Browser. Background and Strategy

Eval: A Gene Set Comparison System

From genomic regions to biology

BLAST Exercise 2: Using mrna and EST Evidence in Annotation Adapted by W. Leung and SCR Elgin from Annotation Using mrna and ESTs by Dr. J.

Fast-track to Gene Annotation and Genome Analysis

2) NCBI BLAST tutorial This is a users guide written by the education department at NCBI.

Goldmine: Integrating information to place sets of genomic ranges into biological contexts Jeff Bhasin

Assessing Transcriptome Assembly

How To: Run the ENCODE histone ChIP- seq analysis pipeline on DNAnexus

Bioinformatics Hubs on the Web

GSNAP: Fast and SNP-tolerant detection of complex variants and splicing in short reads by Thomas D. Wu and Serban Nacu

Finding Selection in All the Right Places TA Notes and Key Lab 9

User Manual. Ver. 3.0 March 19, 2012

Rsubread package: high-performance read alignment, quantification and mutation discovery

Supplementary Figure 1. Fast read-mapping algorithm of BrowserGenome.

Genome Browser Background and Strategy

ICB Fall G4120: Introduction to Computational Biology. Oliver Jovanovic, Ph.D. Columbia University Department of Microbiology

Data File Formats File format v1.4 Software v1.9.0

BLAST. NCBI BLAST Basic Local Alignment Search Tool

Intro to NGS Tutorial

SNPViewer Documentation

Transcription:

Using the UCSC genome browser

Credits Terry Braun Mary Mangan, Ph.D. www.openhelix.com

UCSC Genome Browser Credits Development team: http://genome.ucsc.edu/staff.html n Led by David Haussler and Jim Kent n Dozens of staff and students bring you this software and data 3 3

What is a genome, what is a build Very cridcal to get right The genome (human) is not done revisions happen If you use a different version you will get different results Chr1:123456-123654 seems prery specific Why are you not happy with that?

UCSC FAQ hrp://genome.ucsc.edu/faq/faqreleases Comparison of UCSC and NCBI human assemblies QuesDon: "How do the human assemblies displayed in the UCSC Genome Browser differ from the NCBI human assemblies? Response: Recent human assemblies displayed in the Genome Browser (hg10 and higher) are idendcal to the NCBI assemblies. 6

Issues QuesDon: "I nodced that the chromosomal coordinates for a pardcular gene that I'm looking at have changed since the last Dme I used your browser. What happened?" Response: A common source of confusion for users arises from mixing up different assemblies. It is very important to be aware of which assembly you are looking at. Within the Genome Browser display, assemblies are labeled by organism and date. To look up the corresponding UCSC database name or NCBI build number, use the release table. 7

QuesDon: "I am confused about the start coordinates for items in the refgene table. It looks like you need to add "1" to the stardng point in order to get the same start coordinate as is shown by the Genome Browser. Why is this the case?" Response: Our internal database representadons of coordinates always have a zero- based start and a one- based end. We add 1 to the start before displaying coordinates in the Genome Browser. Therefore, they appear as one- based start, one- based end in the graphical display. The refgene.txt file is a database file, and consequently is based on the internal representadon. We use this pardcular internal representadon because it simplifies coordinate arithmedc, i.e. it eliminates the need to add or subtract 1 at every step. Unfortunately, it does create some confusion when the internal representadon is exposed or when we forget to add 1 before displaying a start coordinate. However, it saves us from much trickier bugs. 8

Basic use of the genome browser

UCSC Genome Browser Gateway 1 2 3 4 5 assembly

Customizing your view

Visual Cues on the Genome Browser Tick marks; a single loca3on (STS, SNP) 3' UTR exon < < < exon < exon < < < < ex 5' UTR Intron and direction of transcription <<< or >>> Track colors may have meaning for example, UCSC Gene track: If there is a corresponding PDB entry = black If there is a corresponding reviewed/validated seq = dark blue If there is a non-refseq seq = lightest blue Vert. cons. height of a blue bar is increased likelihood of conservation, red indicates a likelihood of faster-evolving regions Alignment indications (Conservation pairs: chain or net style) Alignments = boxes, Gaps = lines 12

Basic AnnotaDon Track Menus Defined Hide: removes a track from view n Dense: all items collapsed into a single line n Squish: each item = separate line, but 50% height + packed n Pack: each item separate, but efficiently stacked (full height) n Full: each item on separate line (may need to zoom to fit) 13

Super Tracks

Mid- page OpDons to Change Seengs Search for data types Reset to defaults Configure opdons page You control the views with numerous features Flip 5` to 3` 15

Using the Get DNA Say you have a region of interest and what to make primers. You can use the View- >DNA link to get the DNA sequence of the region you are looking at

Get Sequence from DescripDon Pages 17

Using the E- PCR So you designed the primers. Will they work? Will they amplify what you want? Tools - > Insilico PCR

Blat: Tools- > Blat An alignment tool similar to BLAST from FAQ: Blat of DNA is designed to quickly find sequences of 95% and greater similarity of length 40 bases or more. It may miss more divergent or shorter sequence alignments. On proteins, Blat uses 4- mers rather than 11- mers, finding protein sequences of 80% and greater similarity to the query of length 20+ amino acids. However, BLAST and psi- BLAST at NCBI can find much more remote matches.

POP quiz What gene is this pepdde from? (hint: human) MDEPPFSEAALEQALGEPCDLDAALLTDIEDMLQL INNQDSDFPGLFDPPYAGSGAGGTD What gene is 5` of this gene? What gene is 3` of this gen? What Dssue is this gene (the gene 3` to the pepdde) highly expressed in?

ConverDng from one build to another: Tools- >Liqover BED files? Command line? OK how about View - > In other Genomes chr17:38496978-38530994 (hg18) Find it in hg38

Using the gene sorter From Homepage click GeneSorter Similarity types Filtering criteria Ordering ExporDng results

VisiGene

Using the table browser Powerful way to pull out specific sets of data. Can be used for further visualizadon Pulling out subset of sequences Pulling out tabular data about the genome

visualize search & download Underlying Database (MySQL)

Table example Find all coding SNPs in the region:chr17:41243452-41277468 Filter based on the codon you want to work with.

CreaDng your own tracks Open notepad hrps://genome.ucsc.edu/faq/ FAQformat.html Enter: browser posidon chr7:127471196-127495720 browser hide all track name="itemrgbdemo" descripdon="item RGB demonstradon" visibility=2 itemrgb="on" chr7 127471196 127472363 Pos1 0 + 127471196 127472363 255,0,0 chr7 127472363 127473530 Pos2 0 + 127472363 127473530 255,0,0 chr7 127473530 127474697 Pos3 0 + 127473530 127474697 255,0,0 chr7 127474697 127475864 Pos4 0 + 127474697 127475864 255,0,0 chr7 127475864 127477031 Neg1 0-127475864 127477031 0,0,255

Capturing output Screen capture View- >PDF/PS Export sequences to another program

UCSC Exercise #1 Pull up SAA2 in the genome browser Make sure you have at least "UCSC Genes" and "RefSeq Genes" turned on I like "Pack" Zoom out undl you can see SAA1, SAA2, and SAA4 (hint: they are close by) Knowing what we know about the similarity between SAA1 and SAA2 what inference could we make about the origins of SAA1 and SAA2? Turn on "Alt Events" (under "Gene and Gene PredicDon Tracks" aqer reading the descripdon) What does "retained Intron" annotadon from the "Alt Events" track mean? You can also turn on "SIB Alt- Splicing" under mrna and EST tracks to see alternadvely spliced ESTs that have been observed Finally see if you can find SAA3 and view it in the same window as SAA1, SAA2 and SAA4 What do you think about the origins of SAA3? What is SAA3? 29

UCSC Exercise #2 The primers Kischner reported in the paper to amplify this exon were: CTGTGATCCTAACGCAAGAC TAGAAATCACATCATAGCAC Can you find this exon? How? Be sure to turn on the "Alt Events" track in "Gene and Gene PredicDon Tracks 30

UCSC Exercise #3 Get all regions that have a simple repeated sequence *GCGGC* That overlap a gene View your track in the genome browers Download as a bed file