Exploring long-range genome interaction data using the WashU Epigenome Browser

Similar documents
epigenomegateway.wustl.edu

W ASHU E PI G ENOME B ROWSER

WASHU EPIGENOME BROWSER 2018 epigenomegateway.wustl.edu

W ASHU E PI G ENOME B ROWSER

Tutorial: Jump Start on the Human Epigenome Browser at Washington University

You will be re-directed to the following result page.

A short Introduction to UCSC Genome Browser

Genome Browsers Guide

Genome Browsers - The UCSC Genome Browser

Table of contents Genomatix AG 1

Browser Exercises - I. Alignments and Comparative genomics

ChIP-Seq Tutorial on Galaxy

Getting Started. April Strand Life Sciences, Inc All rights reserved.

mirnet Tutorial Starting with expression data

User's guide to ChIP-Seq applications: command-line usage and option summary

Supplementary Figure 1. Fast read-mapping algorithm of BrowserGenome.

CLC Server. End User USER MANUAL

Differential Expression Analysis at PATRIC

Chromatin signature discovery via histone modification profile alignments Jianrong Wang, Victoria V. Lunyak and I. King Jordan

m6aviewer Version Documentation

Tutorial. RNA-Seq Analysis of Breast Cancer Data. Sample to Insight. November 21, 2017

ChIP-seq hands-on practical using Galaxy

ChIP-seq practical: peak detection and peak annotation. Mali Salmon-Divon Remco Loos Myrto Kostadima

Analyzing ChIP- Seq Data in Galaxy

Tutorial 4 BLAST Searching the CHO Genome

Advanced UCSC Browser Functions

Genomic Analysis with Genome Browsers.

Genome Environment Browser (GEB) user guide

Tutorial 1: Exploring the UCSC Genome Browser

Easy visualization of the read coverage using the CoverageView package

Insight: Measurement Tool. User Guide

Analysis of ChIP-seq data

Annotating a single sequence

Protocol: peak-calling for ChIP-seq data / segmentation analysis for histone modification data

How To: Run the ENCODE histone ChIP- seq analysis pipeline on DNAnexus

Tutorial for Lane County Mapping Applications

Exercise 2: Browser-Based Annotation and RNA-Seq Data

Wilson Leung 01/03/2018 An Introduction to NCBI BLAST. Prerequisites: Detecting and Interpreting Genetic Homology: Lecture Notes on Alignment

Press the Plus + key to zoom in. Press the Minus - key to zoom out. Scroll the mouse wheel away from you to zoom in; towards you to zoom out.

Tutorial. De Novo Assembly of Paired Data. Sample to Insight. November 21, 2017

Tour Guide for Windows and Macintosh

Tutorial: De Novo Assembly of Paired Data

BovineMine Documentation

Nature Biotechnology: doi: /nbt Supplementary Figure 1

Tutorial: RNA-Seq analysis part I: Getting started

HIPPIE User Manual. (v0.0.2-beta, 2015/4/26, Yih-Chii Hwang, yihhwang [at] mail.med.upenn.edu)

Tutorial: RNA-Seq Analysis Part II (Tracks): Non-Specific Matches, Mapping Modes and Expression measures

Automated Bioinformatics Analysis System on Chip ABASOC. version 1.1

Using Syracuse Community Geography s MapSyracuse

Supplementary Material. Cell type-specific termination of transcription by transposable element sequences

ChIP-seq hands-on practical using Galaxy

BGGN-213: FOUNDATIONS OF BIOINFORMATICS (Lecture 14)

How to view details for your project and view the project map

ChIP-seq Analysis Practical

Click Trust to launch TableView.

Demo 1: Free text search of ENCODE data

2) NCBI BLAST tutorial This is a users guide written by the education department at NCBI.

Gegenees genome format...7. Gegenees comparisons...8 Creating a fragmented all-all comparison...9 The alignment The analysis...

LEGENDplex Data Analysis Software Version 8 User Guide

Integrated Genome browser (IGB) installation

Demo 1: Free text search of ENCODE data

ChIP-seq (NGS) Data Formats

Anima-LP. Version 2.1alpha. User's Manual. August 10, 1992

User Guide. v Released June Advaita Corporation 2016

Full Search Map Tab. This map is the result of selecting the Map tab within Full Search.

- 1 - Web page:

button in the lower-left corner of the panel if you have further questions throughout this tutorial.

STEM. Short Time-series Expression Miner (v1.1) User Manual

Bioinformatics in next generation sequencing projects

VisANT 4.0 User Manual. Contents

ChromHMM: automating chromatin-state discovery and characterization

Tutorial: Resequencing Analysis using Tracks

Useful software utilities for computational genomics. Shamith Samarajiwa CRUK Autumn School in Bioinformatics September 2017

Using Microsoft Word. Working With Objects

Edupen Pro User Manual

For Research Use Only. Not for use in diagnostic procedures.

Importing and processing a DGGE gel image

Appendix B: Creating and Analyzing a Simple Model in Abaqus/CAE

MetScape User Manual

Genetic Analysis. Page 1

Introduction to Genome Browsers

Wilson Leung 05/27/2008 A Simple Introduction to NCBI BLAST

SAS Visual Analytics 8.2: Working with Report Content

CPIB SUMMER SCHOOL 2011: INTRODUCTION TO BIOLOGICAL MODELLING

OnCOR Silverlight Viewer Guide

Overview of ArcGIS Online Applications. Champaign County

Pre-Lab Excel Problem

Vision Pointer Tools

Tutorial: Using the SFLD and Cytoscape to Make Hypotheses About Enzyme Function for an Isoprenoid Synthase Superfamily Sequence

City of Richmond Interactive Map (RIM) User Guide for the Public

Full Search Map Tab Overview

IMAGE STUDIO LITE. Tutorial Guide Featuring Image Studio Analysis Software Version 3.1

SPAR outputs and report page

All About PlexSet Technology Data Analysis in nsolver Software

UCSC Genome Browser ASHG 2014 Workshop

ArcView QuickStart Guide. Contents. The ArcView Screen. Elements of an ArcView Project. Creating an ArcView Project. Adding Themes to Views

MAKING MAPS WITH GOOGLE FUSION TABLES. (Data for this tutorial at

Panorama Sharing Skyline Documents

IntraMaps End User Manual

BLAST Exercise 2: Using mrna and EST Evidence in Annotation Adapted by W. Leung and SCR Elgin from Annotation Using mrna and ESTs by Dr. J.

Transcription:

Correspondence to Nature Methods Exploring long-range genome interaction data using the WashU Epigenome Browser Xin Zhou, Rebecca F. Lowdon, Daofeng Li, Heather A Lawson, Pamela A.F. Madden, Joseph Costello, Ting Wang Supplementary Figures and Legends Supplementary Methods Supplementary Notes References

Supplementary Figures and Legends Supplementary Figure 1. Supplementary Figure 1: The long-range interaction companion panel. A companion panel displays data over a locus (chr9:123603325-123608566, marked by light yellow background) that interacts with another locus displayed in the main Browser panel (chr9:132644414-132651364, marked by light blue background). Their connection was predicted in K562 cells by a ChIA-PET experiment (track name is ENCODE GIS ChIA-PET K562 POL2 rep1, shown in full mode). A schematic representation of the two loci on the chromosome can be found in the top part of the companion panel. The order and settings of the tracks in the interaction panel are identical with those of the main panel. Through this interaction, the promoters of PSMD5 and LOC253039 are in contact with the 3 end of USP20 and FNBP1. In the genome heatmap there are histone modification tracks and RNA-Seq tracks on the K562 cells. The histone tracks are the same as those displayed in Figure 1. In the metadata color map, histone mark types from top to bottom are: H3K4me3, H3K4me1, H3K9me3, H3K27me3. The RNA-Seq tracks contain strand-specific data. The top 6 tracks (in blue color) are transcription from the reverse strand, the bottom 6 tracks (in indigo color) are from forward strand, and they come from these GEO accessions: GSM765393, GSM767849, GSM765392, GSM765405, GSM765390, GSM758577. Comparison of track data patterns of the interaction panel with the main panel reveals that the two loci are both enriched in active marks and contain medium or low level of repressive marks. Consistent with their epigenetic status, genes that are involved in the interaction including PSMD5, LOC253039, and USP20 are actively transcribed (high RNA-Seq signal over the gene body regions). PSMD5 (proteasome 26S subunit) and USP20 (ubiquitin specific peptidase 20) are functionally related, while LOC253039 is uncharacterized.

Supplementary Figure 2. Supplementary Figure 2: The Circlet View. Circlet View shows the Hi-C data (K562 cells, Pearson correlation of 1 MB bins) 2 over five genes on the whole-genome scale. Arcs connect interacting loci. Arcs in magenta represent positively correlated loci (Pearson correlation coefficient >0), and those in blue represents negatively correlated loci (Pearson correlation coefficient <0). The gene names are added to the screen shot to mark the position of the genes. It shows that CYP4Z1 (chr1), CYP4V2 (chr4), CYP2C19 (chr10), and CYP26B1 (chr2) are each abundant with positive intra-chromosomal correlations, and are generally negatively correlated with external chromosomes. This pattern is not observed with CYP3A4 (chr7), which shows a distinct mixture of positive and negative correlation, both within and between chromosomes.

Supplementary Figure 3. Supplementary Figure 3: Thin glyph style for long-range interaction track display. This screen shot replicates Figure 1, with only one long-range interaction track (Figure 1b) to show the thin glyph style.

Supplementary Figure 4. Supplementary Figure 4: Full glyph style for long-range interaction track display. This screen shot replicates Supplementary Figure 3, and the long-range interaction track is shown with full glyphs.

Supplementary Figure 5. Supplementary Figure 5: Intra- and inter-chromosomal interaction pattern as revealed by a permuted arrangement of chromosomes. A Hi-C track 2 (K562 cells, Pearson correlation of 1 MB bins) is displayed as a heatmap. Chromosomes 9 and 22 are placed side by side. Both chromosomes show strong intra-chromosomal contact. The inter-molecular contact between them is generally weak. A heatmap cell stands out with strong positive correlation between chr9 and chr22 (pointed by the arrow, the two bins are chr9:134010179-135010178 and chr22:22670000-23669999, correlation coefficient is 0.7). However it might be due to the genomic translocation in K562 cells instead of actual inter-chromosomal interaction.

Supplementary Methods Here we provide additional information about Figure 1. The Tutorial provides a guide on how to recreate this figure. Following is the information on all the histone modification tracks in the genome heatmap. The displayed data is aligned to human reference genome hg19/grch37. Track name, track color, and GEO accession are given for all tracks: 1. H3K4me3 of IMR90, green, GSM469970 2. H3K4me3 of IMR90, green, GSM521901 3. H3K4me1 of IMR90, teal, GSM521895 4. H3K4me1 of IMR90, teal, GSM521897 5. H3K9me3 of IMR90, red, GSM521909 6. H3K9me3 of IMR90, red, GSM521913 7. H3K27me3 of IMR90, fuchsia, GSM469968 8. H3K27me3 of IMR90, fuchsia, GSM521889 9. H3K4me3 of K562, green, GSM608165 10. H3K4me3 of K562, green, GSM945297 11. H3K4me1 of K562, teal, GSM788085 12. H3K4me1 of K562, teal, GSM733692 13. H3K9me3 of K562, red, GSM607494 14. H3K9me3 of K562, red, GSM733776 15. H3K27me3 of K562, fuchsia, GSM945228 (replication 1) 16. H3K27me3 of K562, fuchsia, GSM945228 (replication 2) The gene model track is RefSeq genes, and the data were obtained from UCSC Genome Browser website (http://hgdownload.cse.ucsc.edu/goldenpath/hg19/database/). The ChIA-PET track was generated by ENCODE Consortium and can be downloaded from here: http://hgdownload.cse.ucsc.edu/goldenpath/hg19/encodedcc/wgencodegischiapet/. The name of the track is ENCODE GIS ChIA-PET on K562, CTCF, replicate 1. The Hi-C track is described by Dixon, J.R. et al. 5 and can be downloaded here: http://chromosome.sdsc.edu/mouse/hi-c/download.html. The bin size is 40 kilobyte, the restriction enzyme is HindIII, with two biological replicates combined. In the first region, the HOXA gene cluster at the boundary of the two domains shows the bipartite histone modification pattern (enclosed by a grey dotted box) in IMR90 cells: the left part of the cluster is covered by active marks (H3K4me3 and H3K4me1) and the right part by repressive marks (H3K9me3 and H3K27me3), consistent with the normal, co-linear expression pattern of this locus. In K562 cells this pattern is replaced by a homogeneous repressive pattern (depleted of active histone marks and enriched in repressive H3K27me3 mark). In the second region, IMR90 and K562 cells do not show significant differences in their histone modification patterns. However, a cell in the Hi-C track, pointed by an arrow, suggests an interaction event between the two regions (the spot is indicated by the cursor, the underlying interacting regions are indicated by the two semi-transparent columns).

Supplementary Note Availability To access the public site, please visit http://epigenomegateway.wustl.edu/. For general questions regarding the Browser, please contact user support via email at epigenome@lists.genetics.wustl.edu. For the latest development news from WashU Epigenome Browser, users can follow us on Twitter (@washugbrowser) and our blog (http://washugb.blogspot.com/). A video tutorial is available at: http://epigenomegateway.wustl.edu/info/video/long_range_video.avi Implementation and system requirements The WashU Epigenome Browser is best viewed with open source web browsers including Firefox and Chromium. At this stage Microsoft IE is not supported. WashU Epigenome Browser continues to be built upon open source technologies, including GNU C, GNU Linux, MySQL, Tabix, and NCBI-BLAST. The source code of WashU Epigenome Browser is freely available at http://epigenomegateway.wustl.edu/info/source/. Current computational systems for long-range genomic interaction data Visualizing long-range genome interactions has long been a challenge in the design and implementation of Genome Browsers. Public resources capable of providing such services are almost non-existent at the time of writing. A few research groups excelling in long-range interaction assays have developed their own computational tools, for both data analysis and visualization. The Dekker lab (http://my5c.umassmed.edu) developed the My5C 1 (http://my5c.umassmed.edu/my5cprimers/5c.php) which is a sophisticated web-based platform. My5C designs primers for the target genomic region of the 5C experimental assay. It allows a user to upload raw 5C assay results, conduct a series of analyses, and visualize the results as heatmaps. Later, the group published the first Hi-C assay, which interrogated chromatin interactions genome-wide 2. Along with it a simple web interface was created to visualize the Hi- C data as heatmaps (http://hic.umassmed.edu/heatmap/heatmap.php). The Wei/Ruan group developed the ChIA-PET assay, which captures spatially proximal protein binding sites at genome-wide scale that are indicative of chromatin looping. The group developed the ChIA-PET Tool 3 (http://chiapet.gis.a-star.edu.sg/downloads/chia-pet-tools), which is a Java package for analyzing ChIA-PET data. It visualizes assay results via a built-in function and GBrowse (http://gmod.org/wiki/gbrowse). The group has published studies on chromatin interaction using the ChIA-PET assay 4, and data were visualized by Circos (http://circos.ca/) and the UCSC Genome Browser (http://genome.ucsc.edu/). The Ren group (http://bioinformatics-renlab.ucsd.edu/rentrac/) has recently published a study using the Hi-C assay 5. Along with the publication, a web interface was created to visualize the data (http://chromosome.sdsc.edu/mouse/hi-c/database.php). It represents interaction data as triangular-shaped heatmaps, which are coupled with the UCSC Genome Browser so that both interaction data and genomic annotation data can be displayed simultaneously.

No existing tools provide general solutions for long-range interaction data visualization. They either display the long-range interaction data as a generic track in a genome browser, or are hard-coded for specific projects and cannot be expanded or adapted easily. The genome browser tools including GBrowse and the UCSC Genome Browser are excellent for visualizing genomic annotations, but by displaying only one genomic region at a time, it becomes difficult to display interacting loci over large chromosomal distances, and a linear genome representation cannot properly display the inter-chromosomal interaction data. My5C s 2-dimensional heatmap representation is good solution, as it can set arbitrary genomic coordinates to the axes of the heatmap, making the display unbiased and intuitive. But it is difficult for a heatmap display to accommodate genomic annotation data, thus it remains difficult to perceive the essential biological context for understanding the interaction events. The long-range chromatin interaction visualization function of WashU Epigenome Browser The WashU Epigenome Browser encodes long-range interaction data in the format of two separate records for the two loci in an interaction (see next section on file format). In this way, both intra- and inter-chromosomal interaction data can be properly recorded. A locus from an interaction can be displayed on the Browser as a genomic feature, but when both interacting loci are shown, they will be connected and represented by a glyph. Five types of glyph are available: Thin (Supplementary Figure 3): thin lines that extend from one locus to the other are used to represent the interaction. No name or score is printed. It is a compact way to represent interaction data. Full (Supplementary Figure 4): a thick box is used to indicate a locus that interacts with another locus. If its mate (interacting locus) is not visible, the coordinate of the mate will be printed on the left side of the box. If the mate is visible, the mate will also be represented as a box and a straight line will connect the two boxes. The interaction score will be printed on the left of the joined pair. Arc (Figure 1b and 1c): an arc can be drawn to connect the interacting loci when both of them are visible. All arcs have the same radian (π/2) and width (1 pixel), and the radii are determined by the on-screen distance between the two loci. Heatmap (Figure 1d): squares or rectangles are drawn at the tip of an isosceles right triangle whose longest side is the genomic distance between the two interacting loci (plus the loci themselves). If the loci have equal length, the shape of the heatmap cell will be a square, otherwise it will be a rectangle. Circlet (Supplementary Figure 2): the entire chromosomes curl into a circlet and arcs are drawn inside the circle to link pairs of interacting loci. The arc and heatmap are the most representative glyphs to visualize the interactions. The arc is suitable for sparse interactions and the heatmap is suitable for dense interactions, as is usually the case for the Hi-C assays. But the heatmap plot differs from the heatmap used by My5C, as the latter requires two axes. Circlet View is especially useful at showing long-range interaction data over whole chromosomes or entire genome. The Hi-C data can be extraordinarily dense depending on how the data is analyzed. The genome is typically divided into equally sized bins, and each bin can interact with thousands of bins across the genome. When a Hi-C track is shown using the Browser, an enormous amount

of data is transmitted from the server to the client-side. The user should take caution in viewing data over large genomic regions to avoid stalling the web browser. To switch between the glyph types, right click on the track image and press a glyph type button from the context menu. In all cases, the color of the boxes, arcs, and heatmap cells are determined by their scores with respect to a user defined maximum cutoff score. The user can interact with the track by clicking on the glyphs to get the details of the interactions. Please refer to the Tutorial for detailed instructions. As mentioned above, previous genome browsers are limited to displaying only one genomic region at a time, which seriously restricts their application in long-range interaction data display. The WashU Epigenome Browser distinguishes itself with the extraordinary versatility of genomic coordinate arrangement. It is able to show data across multiple chromosomes in arbitrary order, and juxtapose data to focus the view on a subset of the genome. This provides an ideal environment for displaying long-range interaction tracks. Figure 1 combines the long-range interaction tracks with Gene Set View to compare interaction pattern between different genomic regions. Supplementary Figure 5 shows a Hi-C track on human K562 cells 2. By placing chr22 next to chr9, the interaction pattern between the two chromosomes becomes visible. File format of the long-range genome interaction data The long-range genome interaction data is stored in a compressed tabular text file and is accessed by the server, in the same way as all other genomic annotation tracks for the WashU Epigenome Browser. Tabix 6 is used to compress, index, and query the file with fast speed and minimum system footprint. It can also be used to access remote files through the network. The user can convert his or her own long-range interaction data into a tabix-formatted file, place the file on a web server, and visualize it on the WashU Epigenome Browser. This procedure does not require file upload, and it is highly efficient and secure. The tabular text file has a format similar as the BED format, albeit with variations. It has following six required fields: 1. Chromosome name. 2. Start coordinate of the locus. 3. Stop coordinate of the locus. 4. The coordinate of this locus interacting mate and the score of the interaction, in the format of chr1:567-890,3.14, where chr1:567-890 is the mate s coordinate, and 3.14 is the score. They are joined by a comma. 5. Unique integer. 6. Strand indicator. If the mate is on a different chromosome, use dot.. If the mate is downstream of the locus, use +, else -. Two records are needed to represent a pair of interacting loci. A unique integer must be assigned for each record in the fifth field. This format is versatile enough to encode arbitrary pairwise interaction data (intra-chromosomal or inter-chromosomal). While the nature of longrange interactions usually involves many loci, current assay methods all yield pairwise interaction data so this encoding approach works fine. In future developments, the format could

be expanded by including additional loci in the fourth field, so that new generations of interaction data can be encoded. Another possible solution is the SAM format 7, which is widely used to store paired-end read data from DNA sequencing experiments. Pairs of interacting loci could be treated as paired reads and thus be encoded in a SAM format file. However, compared with the custom format used by WashU Epigenome Browser, the disadvantage of SAM format is that it identifies paired reads only by their shared identifier, and it does not store locations of both reads in either one of the reads. This potentially requires the program to scan through the entire file (in the worst case) to obtain the location of the mate locus, or else this information would be missing. Besides, it is not possible to extend the SAM format for reads that have more than two segments. We have produced a blog article describing the details of how to create a custom longrange interaction track and display it on Wash U Epigenome Browser. The link to the article is: http://washugb.blogspot.com/2012/09/prepare-custom-long-range-interaction.html This post also included a link to a sample Python script to convert a publicly available ChIA-PET data file into Wash U Epigenome Browser long-range interaction track format. Acknowledgement We would like to thank the laboratories of Bing Ren, Chia-Lin Wei, Yijun Ruan, Job Dekker, the UCSC Genome Browser team and the ENCODE Consortium for providing long-range interaction data sets, and the Roadmap Epigenomics Consortium for providing epigenomics data sets. We thank many collaborators in the Roadmap Epigenomics Consortium who participated in software testing and provided helpful comments. We acknowledge support from NIH Roadmap Epigenomics Program, sponsored by the National Institute on Drug Abuse (NIDA) and the National Institute of Environmental Health Sciences (NIEHS). J.F.C., M.H. and T.W. are supported by NIH grant 5U01ES017154. X.Z. is supported by NIDA s R25 program DA027995. T.W. is supported in part by the March of Dimes Foundation, the Edward Jr. Mallinckrodt Foundation, P50CA134254 and a grant from the Foundation for Barnes-Jewish Hospital. References 1. Lajoie, B.R. et al. Nat Methods 6(10), 690-619 (2009). 2. Aiden, E.L. et al. Science 326, 289-293 (2009). 3. Li, G. et al. Genome Biology 11:R22 (2010). 4. Handoko, L. et al. Nature Genetics 43, 630-638 (2011). 5. Dixon, J.R. et al. Nature 485, 376-380 (2012). 6. Li, H. Bioinformatics 27, 718-719 (2011) 7. Li, H. et al. Bioinformatics 25, 2078-9 (2009)

Correspondence to Nature Methods Exploring long-range genome interaction data using the WashU Epigenome Browser Xin Zhou, Rebecca F. Lowdon, Daofeng Li, Heather A Lawson, Pamela A.F. Madden, Joseph Costello, Ting Wang Supplementary Note 2 Tutorial: How to Explore Long-range Chromatin Interaction Data using the WashU Epigenome Browser A video tutorial is available at: http://epigenomegateway.wustl.edu/info/video/long_range_video.avi For most updated tutorial, please visit http://epigenomegateway.wustl.edu/info/ under User Manual and Video Demo. This brief guide illustrates the usage of the long-range genome interaction track display in WashU Epigenome Browser, and contains following sections: 1 Recreate Figure 1 of Main Text 2 Circlet View case 1: visualize heterogeneous contact patterns of the HOXA cluster in IMR90 cells 3 Circlet View case 2: explore inter-chromosomal interaction data 4 Circlet View case 3: draw quantitative tracks as wreath tracks 5 Examine remotely connected loci with the companion panel Open http://epigenomegateway.wustl.edu/ on a web browser. Click the indicated button to launch the WashU Epigenome Browser. This site is best viewed on Open-Source web browsers including Chromium, Google Chrome, and Firefox. Other web browsers may not be supported. Select Human hg19 at the prompt dialog. By default, a few dozen quantitative assay tracks are displayed in the genome heatmap and a gene model track is displayed below the genome heatmap. 1. Recreate Figure 1 of Main Text Add long-range interaction tracks to the Browser display. At the navigation bar on the bottom left, click Tracks, then Genomic annotation. The genomic annotation track management panel will be shown on the right:

Click the Add more» button to show a small panel with all available genomic annotation tracks. The left half of the panel contains group names, the right half contains the name of tracks in the selected group: On the left side of this panel click the last tab Long-range interaction, a table is shown on right: In this table, the rows refer to the names of the samples (in this case, human cell lines), the columns are the assay types. Numbers in the table cells indicate the number of tracks belonging to each category. Click the cell between Hi-C and IMR90 (as pointed by the arrow), and the names of the five tracks are shown beneath the table:

To view detailed information on a track, click the icon following the track name. Click to select the track UCSD IMR90 Hi-C (40kb HindIII combined) 1. Click another table cell between ChIA-PET and K562 (as pointed by the arrow), and select two tracks ENCODE GIS ChIA-PET K562 CTCF and ENCODE GIS ChIA-PET K562 POL2 rep1 : Hide this panel and go back to the Browser. The selected tracks are now added in the list:

Use the drop-down menus to turn on the tracks. Select Heatmap for the Hi-C track, and Arc for the two ChIA-PET tracks. These three tracks are now shown in the Browser panel. At this stage the user can explore by scrolling or zooming. In the following screen shot, the first longrange interaction track is ChIA-PET K562 POL2, the second is ChIA-PET K562 CTCF, and the last is the Hi-C IMR90. Actual order may vary:

The Hi-C heatmap track is truncated in the above screen shot as it contains many low-scoring interactions, shown at the far bottom of the track. To limit the display to high scoring interactions only, right click on the track to invoke the context menu and select Configure. The configuration options will be displayed in the toolbox panel: The two text fields with values 10 and 2 control the display of positively scored interactions. Replace the filter threshold value with 4 and then press the set filter threshold button. The heatmap cells with scores below 4 are removed and the track height decreases: Then set the color threshold to 30, and heatmap cells with scores lower than 30 are shown with attenuated color:

Changing the values of the negative score options will have no effect, as all the interactions in this track are positively scored. Move the cursor over the heatmap cells in the Hi-C track and two transparent columns are shown marking out the two interacting loci joined by the heatmap cell under the cursor:

Next, go to the navigation bar and click tabs Genomic view then Gene Set View to open the Gene Set View panel:

Clear the text area and enter two coordinates, chr7:26663835-28123541 and chr7:101818721-102732764, separated by a line break. Submit to run the Gene Set View. The following screen shot shows the appearance of the tracks with the Gene Set View at these coordinates:

The first region contains the HOXA gene cluster, which lies at the boundary of two chromatin domains. The second region 80 MB apart from the first region but associates with the first region as predicted by the Hi-C data. The Hi-C data shows clear evidence of the topological domains over the two regions in IMR90 cells. The ChIA-PET data on K562 cells also matches the Hi-C data, indicating the chromatin structure over these two regions is stable between the IMR90 and K562 cells. At this stage, the user can add the histone modification tracks used in Figure 1 and compare the histone mark patterns between the two cell types. Please refer to the online manual for instructions (http://epigenomegateway.wustl.edu/browser/manual/). By combining histone modification tracks from both cell types, it can be seen that the HOXA cluster is marked by a distinct combination of active and repressive histone marks in IMR90, and this pattern is replaced by the repressive histone mark in K562. On the other hand, the second region does not show strong differences of histone modification patterns between K562 and IMR90 cells. 2. Circlet View case 1: heterogeneous contact patterns of the HOXA cluster in IMR90 cells To further examine the dual-domain formation over the HOXA cluster, drag on the chromosome ideogram beneath the genome heatmap and zoom into the domain on the left of HOXA cluster:

The following screen shot shows the magnified tracks after the zoom. The two ChIA-PET tracks are switched to thin mode to reduce image height:

Right click on the Hi-C track and select Circlet view from the menu. The Circlet View will be shown in a new panel:

Hide this panel for the moment (by clicking the button on the top right corner). In the main Browser panel, move the view to the other domain on the right of the HOXA cluster (take caution not to include any of the second region). The following screen shot shows the Browser image over this domain (the ChIA-PET tracks are in thin mode to limit image height):

Again, right click on the Hi-C track and invoke the Circlet view. Two views are now displayed:

The first view refers to the left-side domain of the HOXA cluster. The second view refers to the right-side domain. In both views, chromosome 7 curls to form a circle. Arcs are drawn inside the circle to indicate the interaction events. As described in Figure 1, the two domains are epigenetically distinct in IMR90 cells (the left-side domain is marked by active histone marks on its right boundary, the right-side domain is marked by repressive histone marks on its left boundary). It can be seen here that the two domains also tend to contact different parts within chromosome 7 in the IMR90 cells. 3. Circlet View case 2: explore inter-chromosomal interaction data The Circlet View can be very useful to visualize the global contact pattern of densely connected regions. The following screen shot shows a Hi-C track from Aiden, E.L. et al. (K562 cells, Pearson correlation of 1 MB bins). 2 in the form of heatmaps over five genes (CYP4Z1, CYP3A4, CYP4V2, CYP2C19, CYP26B1):

Very few details of the Hi-C track can be seen in this way. Invoke the Circlet View on the Hi-C track, and the full details are displayed in Supplementary Figure 2. 4. Circlet View case 3: draw quantitative tracks as wreath tracks Following screen shot shows data of a protein ChIP-Seq track (GSM822275) in wiggle mode on the top, and a ChIA-PET track in arc mode on the bottom. Both tracks were generated from experiments on K562 cells with the RNA polymerase Pol2 protein. The data are shown over entire chromosome 17. (To quickly display an entire chromosome, use the Relocation app from the toolbox panel, click show entire scaffold and choose chr17.)

Invoke the Circlet View on the ChIA-PET track:

Click the tab labeled Chromosomes on the bottom:

Uncheck all the other chromosomes and leave only chr17 checked, the Circlet View will update into single chromosome: Click the tab Wreath tracks on the bottom. Click the button Add a track, then Genome heatmap track, and add the ChIP-Seq track:

This way, the ChIP-Seq track is shown as a wreath around the Circlet View. It is seen that the location of the arcs from the ChIA-PET track usually correlates with peaks from the ChIP-Seq track, as both tracks are generated from experiments using the Pol2 factor. 5. Examine remotely connected loci with the companion panel

Finally, the Browser supports examining data patterns across remote loci. When the user clicks on a glyph representing an interaction, a tooltip balloon is shown with details: Two buttons can be found in the tooltip balloon, each corresponds to one locus. By clicking one of the buttons, the new panel will be shown to display the track data over the locus:

The upper part of this panel is a schematic representation of the two interacting loci. The locus shown in this panel is highlighted with a yellow background. Tracks inside this small panel are in sync with the main panel. Turn the long-range interaction track to full mode to see:

The track in the main panel is marked by a thin gray column, pointed to by a black arrow (the arrow is added to the screen shot and is not part of the Browser). It marks the pairing locus with the one shown in the small panel. By zooming in to the region surrounding the interacting locus, the track data over these two loci can be compared in the same view in detail (Supplementary Figure 1). 6. Sticky notes Users may wish to create a sticky note or memo upon the genomic location where an interaction of interest occurs. To create a sticky note, simply right click on the chromosome ideogram beneath the genome heatmap. An option appears from the context menu. Click this option and an input panel is shown to make a new note. When a note is made, a small image appears on the chromosome ideogram where user has right-clicked. Each note adheres to a specific genomic coordinate, and can be edited, removed, and saved inside a Session. References 1 Dixon, J.R. et al. Nature 485, 376-380 (2012) 2 Lieberman, A.E. et al. Science 326, 289-293 (2009)