Blast2GO PRO Plugin for Geneious User Manual


Geneious 8.0
Version 1.0
October 2015
BioBam Bioinformatics S.L.
Valencia, Spain

Contents

Introduction
    Blast2GO methodology
    Blast2GO and BioBam
    Geneious and Biomatters
Quick-Start
User Interface Overview
    Toolbar
    Document Table
    Document Viewer
        Statistics Tab
        Graph Viewer Tab
    Sources Panel
Blast2GO functions
    Load Example Data
    Convert to Blast2GO
    Add Blast Hits
    CloudBlast
    Add InterProScan
    InterProScan
    Mapping
    Show GO Description
    Annotation
    Merge InterProScan
    ANNEX
    GO-Slim
    Remove Results
    Fisher's Exact Test
    Selection
    Activate Subscription Key
    Remove Subscription Key
    CloudBlast History
Geneious functions
    Import Data
    Export Data
    Workflows

Copyright BioBam Bioinformatics S.L.

Introduction

Support: Website:

1.1 Blast2GO methodology

Blast2GO (Conesa et al., 2005) is a methodology for the functional annotation and analysis of gene or protein sequences. The method uses local sequence alignments (BLAST) to find similar sequences (potential homologs) for one or several input sequences. The program extracts all GO terms associated with each of the obtained hits and returns an evaluated GO annotation for the query sequence(s). Enzyme codes are obtained by mapping from equivalent GOs, while InterPro motifs can be queried directly at the InterProScan web service.

A basic annotation process with Blast2GO consists of three steps: blasting, mapping and annotation. This manual describes these steps and gives further explanations and information on additional functions (Götz et al., 2008).

Figure 1.1: Table and Graph Viewer visualizing a small data-set

1.2 Blast2GO and BioBam

Blast2GO PRO Plugin for Geneious is developed, maintained and distributed by BioBam Bioinformatics S.L.

1.3 Geneious and Biomatters

Geneious is a powerful and comprehensive suite of molecular biology software tools. Geneious provides an easy-to-use interface with simple and intuitive data management capabilities within a customizable and extendable platform. This gives researchers working with sequencing data immediate access to a wide range of essential data analysis features. Blast2GO PRO is now part of this feature set in the form of the plugin described in this document.

Geneious is developed by Biomatters, a New Zealand based company founded in 2003, with a mission to create bioinformatics solutions for the analysis, interpretation, and application of molecular sequence data.

Quick-Start

This section provides an overview of a typical Blast2GO session. Detailed descriptions of the different steps and options of this plugin are given in the remaining sections of this manual.

1. Load data: To start an annotation process with Blast2GO, load a sequence file in FASTA format containing your nucleotide or amino acid sequences: File → Import → From File... Alternatively, you may use any sequence data already available in the Sources navigator on the left.

2. Convert your data to Blast2GO sequences: Select the sequence document list (or one or several loose sequence documents) and choose Right Click → Blast2GO → Convert to Blast2GO. This converts your sequences into Blast2GO sequences and activates three Blast2GO viewers in the bottom editor area (B2G Table, B2G Statistics, B2G Graph). You can now select B2G Table to view the sequence names in a list. Observe that all sequence rows are white and do not contain any information yet. Select the B2G Statistics tab and click on one of the white-colored statistic names in the right side panel (one of the first 4 options).

3. Blast sequences: Various options are available to add sequence alignments to your Blast2GO sequences in Geneious. Here we use the CloudBlast option; the other options are described in the sections Add Blast Hits and CloudBlast. Choose Right Click → Blast2GO → CloudBlast. In most cases the default settings can be applied. However, depending on the data-set, the following parameters might need adjusting:
   - Select a Blast DB that fits your data-set.
   - In the advanced section (More Options), choose where a copy of your Blast results should be created.
   The sequences are now sent to the CloudBlast resource. The progress can be observed in the Sources panel. While processing, the color of the rows in the B2G Table changes from white to red or orange. Right-clicking on an orange sequence offers Show Blast Result in the context menu. This option opens an extra window called the Blast2GO viewer component, an independent window containing a list of result tabs. Unlike the integrated Document Viewers, these tabs do not change their contents when the user switches to a different data-set. Once Blast results have been obtained, the corresponding statistics can be viewed in the B2G Statistics viewer.

4. Perform Gene Ontology Mapping: Orange sequences contain at least one blast hit (probably more than one) and are suitable for mapping (the number of hits is shown in the #hits column of the B2G Table). Select all sequences (regardless of whether they are white, red or orange) and choose Right Click → Blast2GO → Mapping. Orange sequences may now change their color to green. Green sequences contain candidate GO terms, which will be considered for annotation in the next step.

5. Annotation: Once sequences are GO-mapped (green), the annotation step can be applied: Right Click → Blast2GO → Annotation. We will use the default parameters. Some sequences will now change from green to blue and are then successfully annotated. The green and blue options of the B2G Statistics viewer can now be used to review the annotation process. Annotated sequences (blue) can also be visualized in the B2G Graph tab.

6. InterProScan: To complement the blast-based annotations with domain-based annotations, run an InterProScan search. Select your data-set, choose Right Click → Blast2GO → InterProScan, enter your email address and press OK. This adds information to the InterPro column in the B2G Table viewer, which must then be merged into the GOs already obtained with Right Click → Blast2GO → Merge InterProScan.

7. Export Results: Once the annotation process is finished, there are several options to export the results: File → Export → Selected Documents. The Files of Type combo-box shows the possible export types; the ones beginning with Blast2GO are the most suitable for our data-set.
   - annot-file: The annot file is the standard format to export GO annotations. It is a tab-separated text file in which each row contains one GO term.
   - dat-file: The standard Blast2GO project file. This file can also be opened with the standalone Blast2GO application.
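An exported annot file can be processed outside Geneious with a few lines of code. The following Python sketch groups GO terms by sequence. It relies only on what is stated above (tab-separated, one GO term per row); the column order (sequence name first, GO ID second, optional description third) and the sequence names are assumptions made for illustration and should be checked against a real export.

```python
import csv
import io
from collections import defaultdict


def read_annot(handle):
    """Collect GO IDs per sequence from a tab-separated .annot stream.

    Assumed layout (not confirmed by the manual): column 1 is the
    sequence name, column 2 the GO ID, further columns are ignored.
    """
    annotations = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and rows whose second column is not a GO ID.
        if len(row) < 2 or not row[1].startswith("GO:"):
            continue
        annotations[row[0]].append(row[1])
    return dict(annotations)


# Hypothetical file content; with a real export use open("my_data.annot").
example = io.StringIO(
    "seq1\tGO:0005524\tATP binding\n"
    "seq1\tGO:0016301\tkinase activity\n"
    "seq2\tGO:0005634\tnucleus\n"
)
result = read_annot(example)
print(result)
```

Keeping the parser tolerant of extra columns and non-GO rows means it does not fail if the export contains additional fields.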

User Interface Overview

This section provides an overview of the Blast2GO PRO Plugin for Geneious user interface.

3.1 Toolbar

By default, the main Geneious toolbar hosts an icon for all Blast2GO functions and a Blast2GO toolbox icon. The functions icon provides access to algorithms such as Blast, Mapping or Annotation. The toolbox contains options such as Activate Subscription or the CloudBlast History viewer. All menu options are also available via the context menu of the Document Table (right-click on a document).

Figure 3.1: An overview of the graphical elements the Blast2GO plugin makes available to the user

3.2 Document Table

The Document Table shows all available documents. The plugin works with sequence documents (nucleotide or amino acid) that have been converted into B2G documents. These sequences can be loose sequences or grouped into lists of B2G Sequences.

Note: At the moment, Blast2GO list documents cannot be distinguished by their icon from normal nucleotide or amino acid lists (Figure 3.3).

(a) Group sequences into a list document (b) Extract sequences from list
Figure 3.2: Managing sequences and lists of sequences

Note: The Blast2GO viewers work significantly faster with un-grouped sequences. However, un-grouped sequences require more RAM, so un-grouping is not feasible when working with large lists.

3.3 Document Viewer

The Document Viewer in the lower-right area visualizes the data selected in the Document Table. There are 3 different viewers to visualize grouped or un-grouped B2G Sequences: B2G Table, B2G Statistics, B2G Graph.

B2G Table Tab

The Blast2GO table shows the results obtained for each sequence, including the analysis status (blasted, mapped, annotated, etc.). When working on a list, sequences can also be selected via the Selection tool (toolbox).

B2G Statistics Tab

The statistics viewer can be used to create charts for all analysis steps and to export results in various file formats via the sidepanel. Statistics related to blasted sequences are colored orange, statistics related to the annotation step are blue, etc.

Note: A data-set that contains only green sequences, for example, cannot be used to generate blue statistics. A GO Distribution Level statistic does not make sense when applied to sequences that have only been mapped. The Statistics Tab section (3.3.1) gives an overview of the available charts and their meaning.

B2G Graph Tab

To visualize a combined GO graph, the viewer offers many different options in the sidepanel (Figure 3.3(b)). The first execution has to be started via the Make it so button. Subsequent changes to the Graph Options have a direct impact on the graph shown. Please see the Graph Viewer Tab section (3.3.2) for more information. Jump to Node searches for the GO term matching the search criteria and centers the graph view on it. The options available in the Charts area are explained in section 3.3.2; in essence, the graph's information is reduced to a bar or pie chart.
The graph information can be exported in four different file formats (svg, pdf, png, txt).

3.3.1 Statistics Tab

General
- Data Distribution - This bar chart shows the distribution of un-blasted, blasted, mapped and annotated sequences over the whole data-set.
- Data Distribution (pie) - The same as the former but pie-style.
- Sequence Length - Plots the sequence length for all sequences.
- Analysis Progress - Gives an overview of the current analysis progress of the data-set.

Blast
- E-Value Distribution - This chart plots the distribution of E-values for all selected BLAST hits. It is useful to evaluate the success of the alignment for a given sequence database and helps to adjust the E-Value cutoff in the annotation step.
- Similarity Distribution - This chart displays the distribution of all calculated sequence similarities (percentages), shows the overall performance of the alignments and helps to adjust the annotation score in the annotation step.
- Species Distribution - This chart lists the different species to which most sequences were aligned during the BLAST step.
- Top-Hit Distribution - Bar chart showing the species distribution of all top Blast hits.
- Hit Distribution - This chart shows the distribution of the number of hits for the blasted sequences in a data-set.
- Hsp Distribution - This bar chart shows the distribution of HSPs per hit.
- Hsp/Seq Distribution - This chart shows a distribution of percentages representing the coverage between the HSPs and their corresponding sequences.
- Hsp/Hit Distribution - Same as above but for hits instead of sequences.

Mapping
- EC Distribution for Sequences - This chart shows the distribution of GO evidence codes for the functional terms obtained during the mapping step. It gives an idea of how many annotations derive from automatic/computational annotations and how many from manually curated ones.
- EC Distribution for Blast Hits - Same as above but per Blast hit.
- DB Resources of Mapping - This chart gives the distribution of the number of annotations (GO terms) retrieved from the different source databases, e.g. UniProt, PDB, TAIR etc.

Annotation
- Annotation Distribution - This chart shows the number of GO terms assigned per sequence.
- GO Annotation Lvl Distribution - A bar chart which shows all GO terms for all 3 categories for a given GO level, taking into account the GO hierarchy (parent-child relationships).
- Annotation Score Distribution - A chart that shows the number of sequences per annotation score.
- Seq/Length Relative - This chart shows the relative correlation between the length of the sequences and the number of assigned annotations.
- Seq/Length Absolute - Same as above but absolute.
- GO Distribution Level - A bar chart which shows all GO terms for all 3 categories for GO level 2, taking into account the GO hierarchy.
- Direct GO Count MF - A chart for the Molecular Function GO category, which shows the most frequent GO terms within a data-set without taking into account the GO hierarchy.
- Direct GO Count BP - Same as above but for Biological Process.
- Direct GO Count CC - Same as above but for Cellular Component.

IPS
- InterProScan Overview - This chart reflects the effect of adding the GO terms retrieved through the InterProScan results.

Enzymes
- Main Enzyme Classes - Shows the distribution of the 6 main enzyme classes over all sequences.
- Second Level Classes - Same as above but for the corresponding subclasses.
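The Direct GO Count charts can be approximated outside the plugin from exported annotations. The following Python sketch counts, for each GO term, the number of sequences annotated with it and ignores the GO hierarchy, as described for the Direct GO Count charts. The input dictionary and the sequence names are hypothetical; this is an illustration of the idea, not the plugin's implementation.

```python
from collections import Counter


def direct_go_count(annotations, top=10):
    """Return the `top` most frequent GO terms as (term, n_sequences) pairs.

    `annotations` maps a sequence name to its list of GO IDs. Parent
    terms are NOT added, so the GO hierarchy is not taken into account.
    """
    counts = Counter()
    for go_ids in annotations.values():
        counts.update(set(go_ids))  # count each term once per sequence
    return counts.most_common(top)


# Hypothetical annotations for three sequences.
annotations = {
    "seq1": ["GO:0005524", "GO:0016301"],
    "seq2": ["GO:0005524"],
    "seq3": ["GO:0005524", "GO:0016301", "GO:0005634"],
}
top_terms = direct_go_count(annotations, top=2)
print(top_terms)
```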

3.3.2 Graph Viewer Tab

Once the graph has been created via the Make it so button, changes to the Graph Options have a direct impact on the graph shown. The sidepanel options are explained in detail in the next section.

Directed Acyclic Graphs

Blast2GO offers the possibility of visualizing the hierarchical structure of the Gene Ontology by means of directed acyclic graphs (DAGs). This functionality is available to visualize results at different stages of the application and, although configuration dialogs may vary, there are some shared features when generating graphs.

1. Software. Blast2GO integrates a viewer based on the ZVTM framework developed by Emmanuel Pietriga at INRIA (France) for graph visualization (Pietriga, 2005). This high-performance vector-based visualization framework allows fast navigation and zooming on the GO DAG. A graph overview is permanently shown in the upper right corner of the graphical tab to make it easy to keep track of the current position while exploring the DAG. Zooming in and out is done with the mouse wheel, and a double click on a DAG node zooms quickly to a readable level. Information about the current node is given in the lower application bar.

2. Parameters.
   - Node Filters. A potential drawback when drawing Gene Ontology DAGs involving numerous sequences is an excessive number of nodes, which makes the graph hard to read and demands large amounts of memory. Blast2GO allows the graph size to be controlled through node filters that depend on the type of graph considered. Additionally, there is a maximum number of nodes that can be displayed.
   - Coloring mode. Blast2GO highlights nodes in proportion to a parameter of the analysis whose result is visualized on the DAG. Through this variation in color intensity, relevant terms get more visual weight, which is a useful way to guide visual inspection of the results.
Sidepanel Graph Options

Blast2GO generates combined graphs in which the combined annotation of a group of sequences is visualized together. This can be used to study the joint biological meaning of a set of sequences. Combined graphs are a good alternative to an enrichment analysis when there is no reference set to be considered or when the number of sequences involved is low. For a better understanding of the different node shapes, please see the Graph Legend section.

1. Graph Options:

Ontology - Choose which Gene Ontology category should be visualized. In the case of All, the three graphs are visualized at once.

Graph Coloring:
- By Node-Score - A score is computed for each node according to the formula

\[ \mathrm{score} = \sum_{\mathrm{GOs}} \mathrm{seq} \cdot \alpha^{\mathrm{dist}} \tag{3.1} \]

  where seq is the number of different sequences annotated at a child GO term and dist is the distance to the child node. Coloring by Node-Score highlights areas of high annotation density.
- By Sequence Count - Node color intensity is proportional to the number of contributing sequences at each node.
- By Ontology - Each node takes the color of its ontology.
- By Sequence Percentage - Node color intensity is proportional to the percentage of sequences, relative to 100%. The root node of each graph is indirectly present in all GO nodes, but the more specific a node is, the lower its percentage. If the root node is present in all 10 sequences and one GO node is annotated to 6 sequences of your data-set, this GO node is colored with 60% intensity.
- Without Colors - Self-explanatory.

Sequence Filter - The minimum number of sequences a GO node must have assigned in order to be displayed. This filter is used to control the number of nodes present in the graph. It is recommended to start the analysis with a high number that, given the total number of sequences, is not expected to overload the graph. Depending on the result, adjust this value until the graph is satisfactory (10% of the total number of annotated sequences is a good start). Additionally, nodes can be filtered out by the Node Score Filter (see below).

Score Alpha - The value of the parameter alpha in equation 3.1. Only nodes with a Node-Score higher than the Node Score Filter are shown. Use this parameter to thin out the GO DAG and to remove nodes of little informative value.

Node Score Filter - By setting this value, graphs can be thinned out by deleting nodes with a score lower than the given value.

The following checkboxes allow the user to modify the graph's visual appearance.
- Edge Labels - Show edge labels such as "is a" or "part of".
- GO ID - If checked, the GO ID is included in the node.
- GO Name - The GO Name is shown in the node.
- GO Definition - When checked, the GO Definition is included in the node.
- GO Node Score - The Node-Score is shown in the node.
- Sequence Name - The names of the sequences annotated at each GO are included in the node. The maximum number of names displayed is 15.
- Sequence Count - The absolute number of sequences annotated with that particular GO is displayed in the node.
- Sequence Percentage - When checked, the percentage of sequences in the data-set annotated with that particular GO is included in the node.

2. Jump to Node: Tries to find the entered text and, if found, centers the graph view on the corresponding node.

3. Charts: The options available in the Charts area are explained in section 3.3.2 (Sidepanel Graph Charts); in essence, the graph's information is reduced to a bar or pie chart.

4. Export: The graphs can be exported in 4 different file formats.
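The Node-Score of equation 3.1 and the effect of the Node Score Filter can be sketched in a few lines of Python. The sketch makes several assumptions that the manual does not spell out: a node's own annotations contribute with distance 0, dist is the shortest path length in the DAG, and nodes with a score equal to the filter value are kept. The term names, the tiny hierarchy and the value alpha = 0.5 are purely illustrative.

```python
from collections import deque


def node_scores(parents, seq_counts, alpha):
    """Node-Score per term: sum over annotated terms of seq * alpha**dist.

    parents    -- maps a term to the list of its parent terms (toy DAG)
    seq_counts -- maps a term to the number of sequences annotated at it
    Assumption: a term's own annotations count with dist = 0.
    """
    scores = {}
    for term, n_seqs in seq_counts.items():
        # Breadth-first walk towards the root yields shortest distances.
        dist = {term: 0}
        queue = deque([term])
        while queue:
            current = queue.popleft()
            for parent in parents.get(current, ()):
                if parent not in dist:
                    dist[parent] = dist[current] + 1
                    queue.append(parent)
        for node, d in dist.items():
            scores[node] = scores.get(node, 0.0) + n_seqs * alpha ** d
    return scores


# Hypothetical hierarchy: root <- mid <- leaf_a, and root <- leaf_b.
parents = {"leaf_a": ["mid"], "mid": ["root"], "leaf_b": ["root"]}
seq_counts = {"leaf_a": 4, "leaf_b": 2}

scores = node_scores(parents, seq_counts, alpha=0.5)
# Node Score Filter: keep only nodes whose score reaches the filter value.
visible = {node for node, score in scores.items() if score >= 3.0}
print(scores, visible)
```

With these numbers the root receives 4 x 0.25 from leaf_a and 2 x 0.5 from leaf_b, which shows how a smaller alpha reduces the influence of distant, more specific terms on general nodes.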
Sidepanel Graph Charts

Analysis of GO term associations in a set of sequences can also be done with pie and bar charts. Once the graph is visible, the Charts area allows the creation of 4 different charts:
- The first chart cuts through the graph at a specific level and generates a pie representation of the number of sequences per GO node.
- The second chart lets the user select a minimum filter value so that only GO nodes with a higher Node-Score or sequence count are included in the resulting pie chart.
- The third chart is the same as the first one but in bar chart style.
- The fourth chart is a bar chart with the number of sequences that have been annotated with a specific GO term.

All charts open in the Geneious B2G Window and can be saved in different file formats.

Graph Legend

The nodes of the GO graphs are displayed in different shapes (Figure 3.4):
- octagon - Annotated GO Terms
- square - Intermediate GO Terms
- ellipse - GO Terms linked to a Blast Hit

(a) Blast2GO summary charts. (b) Configure GO graph visualization.
Figure 3.3: Sidepanel options of the Statistics and Graph Viewer

Figure 3.4: Graph Legend that shows the graph shapes

3.4 Sources Panel

The Sources Panel contains two Blast2GO services that define the Blast2GO Database and the GO DAG file. It is usually not necessary to modify these settings unless you want to connect to a local database installation or manually define the Gene Ontology hierarchy, e.g. for use during the annotation or GO-Slim step.

Blast2GO functions

The purpose of this chapter is to describe all options available in the Blast2GO PRO Plugin for Geneious. It is intended as a quick reference for finding information on a given menu entry.

4.1 Load Example Data

This menu option loads a small data-set of 100 plant nucleotide sequences, which can be used to experiment with the different Blast2GO functions. This data-set can also be used as a reference in case your own data-set returns dubious or no results (e.g. if an analysis step does not work with your own data, you may try the same function with the example data-set).

4.2 Convert to Blast2GO

This option converts sequences into B2G Sequence documents. Amino acid as well as nucleotide sequences can be converted. Converting sequences (or lists) into B2G Sequence documents is required in order to work with the Blast2GO Plugin.

4.3 Add Blast Hits

Add Blast Hits adds existing Blast results to your sequence data-set. Blast hits can be added from the Geneious sources or from external files. Blast hits can be obtained via the Geneious built-in Sequence Search and added by choosing the corresponding list of hits in the Source folders. To add external Blast results, one or multiple XML files can be selected. These XML files can be created anywhere but should comply with the NCBI Blast XML format (blast+ parameter: -outfmt 5); otherwise they may not be recognized correctly by the plugin. To load Blast hits from a sequence search, please select Hits and choose the protein hits from the Sources.

Note: Do not attempt to load a Blast XML file via File → Import → From File... in order to then add the resulting Blast hits to your data. It will not work!

4.4 CloudBlast

CloudBlast runs Blast searches on our cloud system (using the official NCBI Blast+ tools as well as databases from NCBI). To use this service the user needs CloudBlast Computation Units; the current balance can be viewed under Blast2GO → CloudBlast History (Section 4.18).
CloudBlast Computation Units are spent in proportion to the amount of computing time the user consumes; they directly represent the time needed to blast the data. We believe that this is a fair approach, because users only pay for what they consume and can reduce the consumption of units, for example by blasting against smaller databases. Imagine a user blasting tomato (Solanum lycopersicum) genes against the nr database although only results from plants (Viridiplantae) are of interest. We recommend blasting not against nr but against its subset Viridiplantae, in order to finish faster and to consume fewer CloudBlast Computation Units.

Available parameters:
- Blast Program - blastx for nucleotides and blastp for amino acids are available.

- Blast DB - Only protein databases are available. Note: consider looking for a suitable subset of nr in order to speed up your analysis and spend fewer Computation Units.
- Number of Blast Hits - The number of sequence alignments to retrieve per sequence.
- Blast Expectation Value - The statistical significance threshold for reporting matches against a sequence database. If the e-value of an alignment is greater than the e-value threshold, the hit is not reported. Lower e-value thresholds are more stringent and lead to fewer results. Increasing the threshold also shows less stringent matches.
- Blast Description Annotator - Finds the best possible description for a new sequence based on a given Blast result.
- Word Size - One of the important parameters governing the sensitivity of Blast searches is the length of the initial word of the local alignment.
- Low Complexity Filter - The Blast program employs the SEG algorithm to filter low-complexity regions from proteins before executing a database search.
- HSP Length Cutoff - Cutoff value for the minimum length of the first HSP of a Blast hit, used to exclude hits with only small local alignments from the Blast result. The given length corresponds to amino acids or nucleotides, depending on the type of Blast performed.
- Filter by Description - All Blast hits whose description line contains the text provided here are removed from the result list.
- XML - Additionally save the results to an XML file. This is recommended in order to be able to use the Blast results in other software or simply to have a copy of the data. The results are appended to the selected file.

Once done, the user can visualize the Blast result data via Right Click → Show Blast Result on a sequence in the B2G Table.

4.5 Add InterProScan

This function imports existing InterProScan results (version 4.8 or 5.0) that have been generated outside of this plugin and adds them to your data-set.
The function to run InterProScan from within the plugin is explained in section 4.6. Valid XML result files can be generated with the InterProScan executable and a local installation or directly via the EBI web-page. Multiple files can be selected.

4.6 InterProScan

What is InterPro? (Ref: EBI web-page)

InterPro is a resource that provides functional analysis of protein sequences by classifying them into families and predicting the presence of domains and important sites. To classify proteins in this way, InterPro uses predictive models, known as signatures, provided by several different databases (referred to as member databases) that make up the InterPro consortium.

Why is InterPro useful?

InterPro combines signatures from multiple, diverse databases into a single searchable resource, reducing redundancy and helping users interpret their sequence analysis results. By uniting the member databases, InterPro capitalizes on their individual strengths, producing a powerful diagnostic tool and integrated resource.

InterProScan makes it possible to find similar sequences annotated with GOs, to confirm GOs that are already annotated, or to add GOs that did not appear via the conventional Blast2GO annotation pipeline. The latter must be done explicitly; in other words, performing InterProScan does not automatically augment the annotations (neither does Add InterProScan). The transfer of GOs from the InterPro column to the GO IDs is called merging the InterProScan results into the data (Merge InterProScan, Section 4.10). To perform InterProScan via the public EBI service, a valid email address is required. Choose one or multiple algorithms or databases for the search. Once done, visualize the InterPro data via Right Click → Show InterProScan Result on a sequence in the B2G Table.

4.7 Mapping

Mapping is the process of retrieving GO terms associated with the hits obtained by the Blast search. Blast2GO performs four different mapping steps:
1. Blast result accessions are used to retrieve gene names or symbols, making use of two mapping files provided by the NCBI (gene info, gene2accession). Identified gene names are then searched in the species-specific entries of the gene-product table of the GO database.
2. GenBank identifiers (gi), the primary Blast hit IDs, are used to retrieve UniProt IDs, making use of a mapping file from PIR (Non-redundant Reference Protein Database) including PSD, UniProt, Swiss-Prot, TrEMBL, RefSeq, GenPept and PDB.
3. Accessions are searched directly in the dbxref table of the GO database.
4. Blast result accessions are searched directly in the gene-product table of the GO database.

4.8 Show GO Description

To review certain GOs in more detail, use Right Click → Show GO Description for any sequence in the sequence table. This opens the Geneious B2G Window with all GOs listed together with their names and descriptions, and with the option to obtain further information on AmiGO or to visualize a GO's graph within the Gene Ontology structure by clicking show.
4.9 Annotation

This is the process of selecting GO terms from the GO pool obtained in the Mapping step and assigning them to the query sequences. In the current Blast2GO version this is the core type of functional annotation. The GO annotation is carried out by applying an annotation rule (AR) to the ontology terms found (mapping). The rule seeks to find the most specific annotations with a certain level of reliability. This process is adjustable in specificity and stringency.

For each candidate GO an annotation score (AS) is computed. The AS is composed of two additive terms. The first, the direct term (DT), represents the highest hit similarity of this GO, weighted by a factor corresponding to its evidence code (EC). The second term (AT) of the AS provides the possibility of abstraction. Abstraction is defined as annotation to a parent node when several child nodes are present in the GO candidate collection. This term multiplies the number of total GOs unified at the node by a user-defined GO weight factor that controls the possibility and strength of abstraction. When the GO weight is set to 0, no abstraction is done. Finally, the AR selects the lowest term per branch that lies over a user-defined threshold. DT, AT and the AR are defined as given in Figure 4.1.

\[ DT = \max(\mathrm{similarity} \times EC_{\mathrm{weight}}) \]
\[ AT = (\#GO - 1) \times GO_{\mathrm{weight}} \]
\[ AR:\ \mathrm{lowest.node}\bigl(AS(DT + AT) \geq \mathrm{threshold}\bigr) \]

Figure 4.1: Annotation Rule

To better understand how the annotation score works, consider the following reasoning: when the EC-weight is set to 1 for all ECs (no EC influence) and the GO-weight equals zero (no abstraction), the annotation score equals the maximum similarity value of the hits that have that GO term, and the sequence is annotated with that GO term if the score is above the given threshold. EC-weights lower than 1 mean that higher similarities are required to reach the threshold. A GO-weight different from 0 means that a parent node may reach the threshold while its child nodes would not.

The annotation rule provides a general framework for annotation. The actual way annotation occurs depends on how the different parameters of the AS are set. These can be adjusted in the Annotation dialog.
- E-Value Hit Filter - This value can be understood as a pre-filter: only GO terms obtained from hits with a better (i.e. lower) e-value than the one given here are used for annotation.
- Annotation Cut-Off (threshold) - The annotation rule selects the lowest term per branch that lies over this threshold.
- GO-Weight - The weight given to the contribution of mapped child terms to the annotation of a parent term.
- Hsp-Hit Coverage Cutoff - Sets the minimum required coverage between a hit and its HSP. For example, a value of 80 means that the aligned HSP must cover at least 80% of the length of its hit. Only annotations from hits fulfilling this criterion are considered for annotation transfer.
- Filter GO by taxonomy - This filter removes the Gene Ontology terms known not to occur in the given taxonomy, using the restrictions defined by the Gene Ontology.
- EC-Weight - EC code weights can be modified between 0.0 and 1.0. If no influence of evidence codes is wanted, they can all be set to 1. Alternatively, to exclude GO annotations of a certain EC (for example IEAs), set its EC weight to 0.

4.10 Merge InterProScan

The Merge InterProScan function is used to transfer and merge the Gene Ontology terms identified via InterPro into the Blast-based GO annotations (GO ID column). Figure 4.2 shows an example of how to examine this process.
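As a numerical aside on the annotation rule of section 4.9 (Figure 4.1), the following Python sketch computes the annotation score for a toy hierarchy and selects the lowest passing term per branch. It is a simplified illustration, not the plugin's implementation. Assumptions: a parent node takes the best direct term of the candidate terms at or below it, #GO counts the candidate terms unified at the node, and the term names, similarities, EC weight, GO weight of 5 and threshold of 55 are invented for the example.

```python
def descendants(term, children):
    """All terms below `term` in the toy hierarchy."""
    found, stack = set(), list(children.get(term, ()))
    while stack:
        node = stack.pop()
        if node not in found:
            found.add(node)
            stack.extend(children.get(node, ()))
    return found


def annotate(hit_evidence, children, go_weight, threshold):
    """Return (scores, selected) following DT, AT and AR of Figure 4.1.

    hit_evidence -- candidate GO -> list of (similarity, ec_weight) pairs
    children     -- GO -> list of child GOs
    """
    nodes = set(hit_evidence) | set(children)
    for kids in children.values():
        nodes.update(kids)

    scores = {}
    for node in nodes:
        unified = [t for t in descendants(node, children) | {node}
                   if t in hit_evidence]
        if not unified:
            continue
        # DT: best similarity weighted by its evidence-code weight.
        direct = max(sim * weight
                     for t in unified for sim, weight in hit_evidence[t])
        # AT: abstraction term, zero when only one GO is unified.
        abstraction = (len(unified) - 1) * go_weight
        scores[node] = direct + abstraction

    passing = {n for n, s in scores.items() if s >= threshold}
    # AR: keep the lowest term per branch (no passing term below it).
    selected = {n for n in passing
                if not (descendants(n, children) & passing)}
    return scores, selected


children = {"parent": ["child_a", "child_b"]}
hit_evidence = {"child_a": [(60, 0.8)], "child_b": [(65, 0.8)]}

scores, selected = annotate(hit_evidence, children, go_weight=5, threshold=55)
_, without_abstraction = annotate(hit_evidence, children, go_weight=0,
                                  threshold=55)
print(scores, selected, without_abstraction)
```

In this example neither child reaches the threshold on its own (48 and 52), but the parent does (52 + 5 = 57) because two candidate terms are unified at it. With a GO weight of 0 nothing is annotated, which matches the statement that no abstraction takes place in that case.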
In order to better understand what is going on, we can configure the B2G Table as shown in the figure: Switch GO View disabled Switch IPS View disabled This way we can directly see which GOs are already annotated (or candidates) and which ones would be added after the merge ANNEX Blast2GO integrates the Second Layer Concept developed by the Norwegian University of Science and Technology Myhre et al. (2006) for augmenting GO annotation. Basically, this approach uses uni-vocal relationships between GO terms from the different GO categories to add implicit annotation GO-Slim What is a GO Slim? (Ref: Gene Ontology website, Copyright BioBam Bioinformatics S.L. 16

(a) Before (b) After
Figure 4.2: Notice that sequence C02008E09 does show functional annotation after the merge. Sequence C02008E05 also profits from merging by adding GO:

GO slims are cut-down versions of the Gene Ontology containing a subset of the terms in the whole GO. They give a broad overview of the ontology content without the detail of the specific fine-grained terms. GO slims are particularly useful for giving a summary of the results of GO annotation of a genome, microarray, or cDNA collection when broad classification of gene product function is required. GO slims are created by users according to their needs, and may be specific to species or to particular areas of the ontologies. GO provides a generic GO slim which, like the GO itself, is not species specific, and which should be suitable for most purposes. Alternatively, users can create their own GO slims or use one of the model organism-specific slims integrated into the GO flat file. Please contact the GO helpdesk for more information about creating and submitting your GO slim.

To get a better understanding of what GO Slim does in practice and how it works, Figure 4.3 gives a small visual example. Imagine figure 4.3(a) to be the subset of GO terms called GO Slim; figure 4.3(b) shows a data-set with GOs 6, 9 and 10 annotated. The GO Slim methodology pulls the 3 annotated GOs up to terms of the GO Slim subset. The result is shown in figure 4.3(c). Keep in mind that this is a data-set containing various sequences: a single sequence annotated with both GO 1 and GO 4 would keep only GO 4 because of the true-path rule.

In the application, the GO Slim subset is represented by a file with the extension .obo; this file contains all GO nodes and their hierarchical structure. The Gene Ontology Consortium provides various GO Slims that can be accessed and used directly from within the application.
To select a predefined GO Slim, choose Obo file from GO-Website and pick the preferred file; it will then be used in combination with the obo file currently selected under Sources B2G GO Dag. The latter file contains the whole set of Gene Ontology terms. Users who want to experiment can instead choose Custom Obo files and select the two obo files by hand. Keep in mind that the GO Slim file must contain a true subset of the GO terms in the full file; otherwise the result is undefined.
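The pulling-up step and the true-path rule described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the plugin's implementation: the toy ontology (terms T1 to T7), the slim subset and all function names are hypothetical, and the ontology is not the one drawn in Figure 4.3.

```python
# Hypothetical toy ontology: each term maps to its direct parents.
PARENTS = {
    "T1": [],
    "T2": ["T1"],
    "T3": ["T1"],
    "T4": ["T2"],
    "T5": ["T2"],
    "T6": ["T4"],
    "T7": ["T3"],
}
# Hypothetical GO Slim: a subset of the terms above.
SLIM = {"T1", "T2", "T4"}


def ancestors(term):
    """Return all ancestors of a term (the term itself is excluded)."""
    seen = set()
    stack = list(PARENTS[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def nearest_slim_terms(term):
    """Walk upwards from a term and stop on each path at the first slim term."""
    if term in SLIM:
        return {term}
    found = set()
    for parent in PARENTS[term]:
        found |= nearest_slim_terms(parent)
    return found


def go_slim(annotated):
    """Map one sequence's annotated terms onto the slim subset."""
    mapped = set()
    for term in annotated:
        mapped |= nearest_slim_terms(term)
    # True-path rule: a mapped term already implies all of its ancestors,
    # so ancestors of other mapped terms are redundant and are dropped.
    implied = set()
    for term in mapped:
        implied |= ancestors(term)
    return mapped - implied


dataset = {"seq1": {"T6"}, "seq2": {"T5"}, "seq3": {"T7"}, "seq4": {"T6", "T7"}}
slimmed = {seq: go_slim(terms) for seq, terms in dataset.items()}
print(slimmed)
```

Here seq4 carries annotations that map to both T1 and T4, but keeps only T4, which corresponds to the single-sequence case mentioned above.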

(a) GO Slim subset. (b) Whole data-set with GOs 6, 9 and 10 annotated. (c) The final annotation of the data-set after applying the GO Slim.
Figure 4.3: An example of GO Slim in practice; each node represents one GO. White stands for normal, yellow for GO Slim and blue for directly annotated.

4.13 Remove Results

During a Blast2GO analysis (Blast, GO mapping, Annotation, InterPro, etc.), each step adds pieces of information to the data-set. To redo a particular analysis step, the existing results have to be removed first: Right click → Remove Results.

In Blast2GO, most analysis steps are only applied to sequences which have not yet been processed for that particular step. On the one hand, this means that if you want to run the Blast step, for example, only white sequences will be considered; any step can therefore be started and stopped at any time without redoing already processed sequences. On the other hand, it means that if you want to re-blast a sequence or a whole data-set, the existing Blast results have to be removed first.

Please note that removing, for example, the mapping results automatically removes the annotation and GO-Slim results as well. This is because the data in Blast2GO is hierarchically structured: a sequence cannot possess GO-Slim results without having been mapped and annotated in the first place.

4.14 Fisher's Exact Test

Enrichment analysis can be performed in Blast2GO with a Fisher's Exact Test (FET). Blast2GO implements the FatiGO package (Al-Shahrour et al., 2004) for the statistical assessment of annotation differences between 2 sets of sequences. FatiGO includes multiple testing correction. For this analysis, all annotated sequences have to be selected. After choosing Right click → Fisher's Exact Test, a test set and a reference set can be selected. Blast2GO will perform the FET for the test set against the reference set. If no reference is chosen explicitly, the whole data-set automatically becomes the reference set. Both files need to be .txt files containing one sequence ID per line. Additional options are:

Name - Choose a name for the resulting FET result.
Remove Double IDs - Automatically removes all sequence IDs which are present in both the test set and the reference set. By default, double/common IDs are only removed from the reference set.
Two Sided - Performs a two-sided test, i.e. tests for both over- and under-representation (the test set is tested against the reference set and vice versa).
P-Value Filter Mode - Choose the type of value used for filtering.
  P-Value - Single-test p-value: the p-value without multiple testing correction.
  FDR - The p-value corrected by False Discovery Rate control.
P-Value Filter Value - GO terms with a value higher than the given one are not shown.

For further details please refer to the FatiGO publication (Al-Shahrour et al., 2004). Once an enrichment result has been calculated, two result viewers are available:

B2G FET Table - A table showing information about the enriched GOs.
B2G FET Graph - A graph viewer to visualize the hierarchy of the enriched GOs; the nodes are colored proportionally to their significance value. The user can choose which type of calculated p-value to use for highlighting and the threshold for filtering out nodes. Additionally, the Thinned out Graph Node Filter will hide nodes with a significance value higher than the given value.

4.15 Selection

The Selection function of the Blast2GO Plugin selects a group of sequences based on different search criteria. Selections can be made based on the analysis step, functional annotation, etc. As described earlier, in Geneious we can work with loose, ungrouped sequence documents as well as sequence lists. The selection functions can only be applied to list documents.
Available options:

Select Type - General Selection - New first clears the selection and then selects, whereas Add and Remove work with the existing selection.
Select - Indicates the main search criterion.
Match Type
  Contains - Matches any string within another.
  Exact Match - Searches for the entire string of characters, including spaces, in the same order. A result is only obtained if the query matches it exactly and completely.
  Whole Word - An exact match applied to each word.
Case Sensitive - Distinguishes between lowercase and uppercase letters.
Rename Sequence Name - When filtering for Sequence Description, the user may decide to rename all filtered sequences after their description.
Include GO Parents - When filtering for GO Id or GO Name, this option will also consider all child terms. Let GO:2 be the child of GO:1. If we filter for GO:1, we will also select sequences that have GO:2 annotated, because they are implicitly annotated with GO:1 as well.

Select Type - Color Selection - New first clears the selection and then selects, whereas Add and Remove work with the existing selection.
Color - Select sequences based on their color in the sequence table.
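The difference between the match types and between the New, Add and Remove modes can be made concrete with a short Python sketch. It only mirrors the behaviour described in the option list above; the function names, parameter names and sequence descriptions are hypothetical, and the plugin's own matching may differ in details such as how words are delimited (here: whitespace).

```python
def matches(query, text, match_type="contains", case_sensitive=False):
    """Apply one of the three described match types to a single text field."""
    if not case_sensitive:
        query, text = query.lower(), text.lower()
    if match_type == "contains":
        return query in text              # any substring match
    if match_type == "exact":
        return query == text              # whole string, including spaces
    if match_type == "whole_word":
        return query in text.split()      # exact match applied to each word
    raise ValueError(f"unknown match type: {match_type}")


def select(descriptions, query, current=frozenset(), mode="new", **options):
    """Return the new selection after a New, Add or Remove operation."""
    hits = {
        seq for seq, text in descriptions.items()
        if matches(query, text, **options)
    }
    if mode == "new":
        return hits                       # discard the previous selection
    if mode == "add":
        return set(current) | hits
    if mode == "remove":
        return set(current) - hits
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical sequence descriptions.
descriptions = {
    "seq1": "ATP binding protein",
    "seq2": "putative kinase",
    "seq3": "Kinase-like domain",
}

by_contains = select(descriptions, "kinase")
by_word = select(descriptions, "kinase", match_type="whole_word")
extended = select(descriptions, "ATP", current=by_word, mode="add")
print(sorted(by_contains), sorted(by_word), sorted(extended))
```

With Contains, the query "kinase" also hits "Kinase-like domain", whereas Whole Word only hits descriptions in which "kinase" stands alone as a word.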

4.16 Activate Subscription Key

Brings up the initial Activate Blast2GO dialog in order to activate the plugin. It is only available when the plugin is not currently active (e.g. after a fresh install or after calling Remove Subscription Key).

4.17 Remove Subscription Key

Deactivates the plugin.

4.18 CloudBlast History

Provides information regarding the CloudBlast usage and consumed ComputationUnits.

Figure 4.4: Provides information regarding the CloudBlast usage and consumed ComputationUnits. A link directs to a recharge option.

5 Geneious functions

5.1 Import Data

To import data in Geneious we can use the Import dialog (File → Import → From File). The dialog offers an auto-detect function which makes it possible to load annot, dat or fasta files without specifying the exact import data type.

Figure 5.1: Geneious offers many different possibilities to load data. All Blast2GO-related options are also listed, including project files (.dat) and annotation files (.annot).

5.2 Export Data

The Export function (File → Export → Selected Documents) exports Blast2GO projects as .dat files and saves all generated functional information in .annot or .gff format.

5.3 Workflows

The Geneious Workflow Manager allows analysis pipelines to be predefined.

Figure 5.2: Results can be exported with the data export dialog.

Figure 5.3: A basic workflow to convert, blast, map and annotate a list of sequences.

Bibliography

Al-Shahrour, F., Díaz-Uriarte, R., and Dopazo, J. (2004). FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes. Bioinformatics, 20(4).

Conesa, A., Götz, S., García-Gómez, J. M., Terol, J., Talón, M., and Robles, M. (2005). Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics, 21(18).

Götz, S., García-Gómez, J. M., Terol, J., Williams, T. D., Nagaraj, S. H., Nueda, M. J., Robles, M., Talón, M., Dopazo, J., and Conesa, A. (2008). High-throughput functional annotation and data mining with the Blast2GO suite. Nucl. Acids Res., pages gkn176+.

Myhre, S., Tveit, H., Mollestad, T., and Laegreid, A. (2006). Additional Gene Ontology structure for improved biological reasoning. Bioinformatics, 22(16).

Pietriga, E. (2005). A toolkit for addressing HCI issues in visual language environments. In Visual Languages and Human-Centric Computing, 2005 IEEE Symposium on.

Blast2GO Command Line User Manual

Blast2GO Command Line User Manual Blast2GO Command Line User Manual Version 1.1 October 2015 BioBam Bioinformatics S.L. Valencia, Spain Contents 1 Introduction....................................... 1 1.1 Main characteristics..............................

More information

Blast2GO Teaching Exercises

Blast2GO Teaching Exercises Blast2GO Teaching Exercises Ana Conesa and Stefan Götz 2012 BioBam Bioinformatics S.L. Valencia, Spain Contents 1 Annotate 10 sequences with Blast2GO 2 2 Perform a complete annotation process with Blast2GO

More information

MDA Blast2GO Exercises

MDA Blast2GO Exercises MDA 2011 - Blast2GO Exercises Ana Conesa and Stefan Götz March 2011 Bioinformatics and Genomics Department Prince Felipe Research Center Valencia, Spain Contents 1 Annotate 10 sequences with Blast2GO 2

More information

Blast2GO Teaching Exercises SOLUTIONS

Blast2GO Teaching Exercises SOLUTIONS Blast2GO Teaching Exerces SOLUTIONS Ana Conesa and Stefan Götz 2012 BioBam Bioinformatics S.L. Valencia, Spain Contents 1 Annotate 10 sequences with Blast2GO 2 2 Perform a complete annotation with Blast2GO

More information

Blast2GO PRO Plug-in User Manual

Blast2GO PRO Plug-in User Manual Blast2GO PRO Plug-in User Manual CLC bio Genomics Workbench and Main Workbench Version 1.4, September 2015 BioBam Bioinformatics S.L. Valencia, Spain Contents Introduction 1 Quick-Start 3 User Manual 5

More information

Lecture 5. Functional Analysis with Blast2GO Enriched functions. Kegg Pathway Analysis Functional Similarities B2G-Far. FatiGO Babelomics.

Lecture 5. Functional Analysis with Blast2GO Enriched functions. Kegg Pathway Analysis Functional Similarities B2G-Far. FatiGO Babelomics. Lecture 5 Functional Analysis with Blast2GO Enriched functions FatiGO Babelomics FatiScan Kegg Pathway Analysis Functional Similarities B2G-Far 1 Fisher's Exact Test One Gene List (A) The other list (B)

More information

Blast2GO User Manual. Blast2GO Ortholog Group Annotation May, BioBam Bioinformatics S.L. Valencia, Spain

Blast2GO User Manual. Blast2GO Ortholog Group Annotation May, BioBam Bioinformatics S.L. Valencia, Spain Blast2GO User Manual Blast2GO Ortholog Group Annotation May, 2016 BioBam Bioinformatics S.L. Valencia, Spain Contents 1 Clusters of Orthologs 2 2 Orthologous Group Annotation Tool 2 3 Statistics for NOG

More information

Geneious 5.6 Quickstart Manual. Biomatters Ltd

Geneious 5.6 Quickstart Manual. Biomatters Ltd Geneious 5.6 Quickstart Manual Biomatters Ltd October 15, 2012 2 Introduction This quickstart manual will guide you through the features of Geneious 5.6 s interface and help you orient yourself. You should

More information

Tutorial 4 BLAST Searching the CHO Genome

Tutorial 4 BLAST Searching the CHO Genome Tutorial 4 BLAST Searching the CHO Genome Accessing the CHO Genome BLAST Tool The CHO BLAST server can be accessed by clicking on the BLAST button on the home page or by selecting BLAST from the menu bar

More information

CLC Server. End User USER MANUAL

CLC Server. End User USER MANUAL CLC Server End User USER MANUAL Manual for CLC Server 10.0.1 Windows, macos and Linux March 8, 2018 This software is for research purposes only. QIAGEN Aarhus Silkeborgvej 2 Prismet DK-8000 Aarhus C Denmark

More information

STEM. Short Time-series Expression Miner (v1.1) User Manual

STEM. Short Time-series Expression Miner (v1.1) User Manual STEM Short Time-series Expression Miner (v1.1) User Manual Jason Ernst (jernst@cs.cmu.edu) Ziv Bar-Joseph Center for Automated Learning and Discovery School of Computer Science Carnegie Mellon University

More information

Tutorial: De Novo Assembly of Paired Data

Tutorial: De Novo Assembly of Paired Data : De Novo Assembly of Paired Data September 20, 2013 CLC bio Silkeborgvej 2 Prismet 8000 Aarhus C Denmark Telephone: +45 70 22 32 44 Fax: +45 86 20 12 22 www.clcbio.com support@clcbio.com : De Novo Assembly

More information

EBI patent related services

EBI patent related services EBI patent related services 4 th Annual Forum for SMEs October 18-19 th 2010 Jennifer McDowall Senior Scientist, EMBL-EBI EBI is an Outstation of the European Molecular Biology Laboratory. Overview Patent

More information

Annotating a single sequence

Annotating a single sequence BioNumerics Tutorial: Annotating a single sequence 1 Aim The annotation application in BioNumerics has been designed for the annotation of coding regions on sequences. In this tutorial you will learn how

More information

Database Searching Using BLAST

Database Searching Using BLAST Mahidol University Objectives SCMI512 Molecular Sequence Analysis Database Searching Using BLAST Lecture 2B After class, students should be able to: explain the FASTA algorithm for database searching explain

More information

QDA Miner. Addendum v2.0

QDA Miner. Addendum v2.0 QDA Miner Addendum v2.0 QDA Miner is an easy-to-use qualitative analysis software for coding, annotating, retrieving and reviewing coded data and documents such as open-ended responses, customer comments,

More information

Supplementary Materials for. A gene ontology inferred from molecular networks

Supplementary Materials for. A gene ontology inferred from molecular networks Supplementary Materials for A gene ontology inferred from molecular networks Janusz Dutkowski, Michael Kramer, Michal A Surma, Rama Balakrishnan, J Michael Cherry, Nevan J Krogan & Trey Ideker 1. Supplementary

More information

2) NCBI BLAST tutorial This is a users guide written by the education department at NCBI.

2) NCBI BLAST tutorial   This is a users guide written by the education department at NCBI. Web resources -- Tour. page 1 of 8 This is a guided tour. Any homework is separate. In fact, this exercise is used for multiple classes and is publicly available to everyone. The entire tour will take

More information

Basic Local Alignment Search Tool (BLAST)

Basic Local Alignment Search Tool (BLAST) BLAST 26.04.2018 Basic Local Alignment Search Tool (BLAST) BLAST (Altshul-1990) is an heuristic Pairwise Alignment composed by six-steps that search for local similarities. The most used access point to

More information

Bioinformatics Hubs on the Web

Bioinformatics Hubs on the Web Bioinformatics Hubs on the Web Take a class The Galter Library teaches a related class called Bioinformatics Hubs on the Web. See our Classes schedule for the next available offering. If this class is

More information

Tutorial: Using the SFLD and Cytoscape to Make Hypotheses About Enzyme Function for an Isoprenoid Synthase Superfamily Sequence

Tutorial: Using the SFLD and Cytoscape to Make Hypotheses About Enzyme Function for an Isoprenoid Synthase Superfamily Sequence Tutorial: Using the SFLD and Cytoscape to Make Hypotheses About Enzyme Function for an Isoprenoid Synthase Superfamily Sequence Requirements: 1. A web browser 2. The cytoscape program (available for download

More information

Daniel H. Huson and Stephan C. Schuster with contributions from Alexander F. Auch, Daniel C. Richter, Suparna Mitra and Qi Ji.

Daniel H. Huson and Stephan C. Schuster with contributions from Alexander F. Auch, Daniel C. Richter, Suparna Mitra and Qi Ji. User Manual for MEGAN V3.9 Daniel H. Huson and Stephan C. Schuster with contributions from Alexander F. Auch, Daniel C. Richter, Suparna Mitra and Qi Ji March 30, 2010 Contents Contents 1 1 Introduction

More information

COPYRIGHTED MATERIAL. Making Excel More Efficient

COPYRIGHTED MATERIAL. Making Excel More Efficient Making Excel More Efficient If you find yourself spending a major part of your day working with Excel, you can make those chores go faster and so make your overall work life more productive by making Excel

More information

Tutorial. Variant Detection. Sample to Insight. November 21, 2017

Tutorial. Variant Detection. Sample to Insight. November 21, 2017 Resequencing: Variant Detection November 21, 2017 Map Reads to Reference and Sample to Insight QIAGEN Aarhus Silkeborgvej 2 Prismet 8000 Aarhus C Denmark Telephone: +45 70 22 32 44 www.qiagenbioinformatics.com

More information

Wilson Leung 01/03/2018 An Introduction to NCBI BLAST. Prerequisites: Detecting and Interpreting Genetic Homology: Lecture Notes on Alignment

Wilson Leung 01/03/2018 An Introduction to NCBI BLAST. Prerequisites: Detecting and Interpreting Genetic Homology: Lecture Notes on Alignment An Introduction to NCBI BLAST Prerequisites: Detecting and Interpreting Genetic Homology: Lecture Notes on Alignment Resources: The BLAST web server is available at https://blast.ncbi.nlm.nih.gov/blast.cgi

More information

BovineMine Documentation

BovineMine Documentation BovineMine Documentation Release 1.0 Deepak Unni, Aditi Tayal, Colin Diesh, Christine Elsik, Darren Hag Oct 06, 2017 Contents 1 Tutorial 3 1.1 Overview.................................................

More information

Intro to NGS Tutorial

Intro to NGS Tutorial Intro to NGS Tutorial Release 8.6.0 Golden Helix, Inc. October 31, 2016 Contents 1. Overview 2 2. Import Variants and Quality Fields 3 3. Quality Filters 10 Generate Alternate Read Ratio.........................................

More information

Performing a resequencing assembly

Performing a resequencing assembly BioNumerics Tutorial: Performing a resequencing assembly 1 Aim In this tutorial, we will discuss the different options to obtain statistics about the sequence read set data and assess the quality, and

More information

Tutorial. De Novo Assembly of Paired Data. Sample to Insight. November 21, 2017

Tutorial. De Novo Assembly of Paired Data. Sample to Insight. November 21, 2017 De Novo Assembly of Paired Data November 21, 2017 Sample to Insight QIAGEN Aarhus Silkeborgvej 2 Prismet 8000 Aarhus C Denmark Telephone: +45 70 22 32 44 www.qiagenbioinformatics.com AdvancedGenomicsSupport@qiagen.com

More information

SciMiner User s Manual

SciMiner User s Manual SciMiner User s Manual Copyright 2008 Junguk Hur. All rights reserved. Bioinformatics Program University of Michigan Ann Arbor, MI 48109, USA Email: juhur@umich.edu Homepage: http://jdrf.neurology.med.umich.edu/sciminer/

More information

Genome Browsers - The UCSC Genome Browser

Genome Browsers - The UCSC Genome Browser Genome Browsers - The UCSC Genome Browser Background The UCSC Genome Browser is a well-curated site that provides users with a view of gene or sequence information in genomic context for a specific species,

More information

CRITERION Vantage 3 Admin Training Manual Contents Introduction 5

CRITERION Vantage 3 Admin Training Manual Contents Introduction 5 CRITERION Vantage 3 Admin Training Manual Contents Introduction 5 Running Admin 6 Understanding the Admin Display 7 Using the System Viewer 11 Variables Characteristic Setup Window 19 Using the List Viewer

More information

Tutorial: RNA-Seq Analysis Part II (Tracks): Non-Specific Matches, Mapping Modes and Expression measures

Tutorial: RNA-Seq Analysis Part II (Tracks): Non-Specific Matches, Mapping Modes and Expression measures : RNA-Seq Analysis Part II (Tracks): Non-Specific Matches, Mapping Modes and February 24, 2014 Sample to Insight : RNA-Seq Analysis Part II (Tracks): Non-Specific Matches, Mapping Modes and : RNA-Seq Analysis

More information

High-throughput functional annotation and data mining with the Blast2GO suite

High-throughput functional annotation and data mining with the Blast2GO suite Nucleic Acids Research Advance Access published April 29, 2008 Nucleic Acids Research, 2008, 1 16 doi:10.1093/nar/gkn176 High-throughput functional annotation and data mining with the Blast2GO suite Stefan

More information

24 Grundlagen der Bioinformatik, SS 10, D. Huson, April 26, This lecture is based on the following papers, which are all recommended reading:

24 Grundlagen der Bioinformatik, SS 10, D. Huson, April 26, This lecture is based on the following papers, which are all recommended reading: 24 Grundlagen der Bioinformatik, SS 10, D. Huson, April 26, 2010 3 BLAST and FASTA This lecture is based on the following papers, which are all recommended reading: D.J. Lipman and W.R. Pearson, Rapid

More information

Mascot Insight is a new application designed to help you to organise and manage your Mascot search and quantitation results. Mascot Insight provides

Mascot Insight is a new application designed to help you to organise and manage your Mascot search and quantitation results. Mascot Insight provides 1 Mascot Insight is a new application designed to help you to organise and manage your Mascot search and quantitation results. Mascot Insight provides ways to flexibly merge your Mascot search and quantitation

More information

INTRODUCTION TO BIOINFORMATICS

INTRODUCTION TO BIOINFORMATICS Molecular Biology-2019 1 INTRODUCTION TO BIOINFORMATICS In this section, we want to provide a simple introduction to using the web site of the National Center for Biotechnology Information NCBI) to obtain

More information

INTRODUCTION TO BIOINFORMATICS

INTRODUCTION TO BIOINFORMATICS Molecular Biology-2017 1 INTRODUCTION TO BIOINFORMATICS In this section, we want to provide a simple introduction to using the web site of the National Center for Biotechnology Information NCBI) to obtain

More information

CS313 Exercise 4 Cover Page Fall 2017

CS313 Exercise 4 Cover Page Fall 2017 CS313 Exercise 4 Cover Page Fall 2017 Due by the start of class on Thursday, October 12, 2017. Name(s): In the TIME column, please estimate the time you spent on the parts of this exercise. Please try

More information

Genome Browsers Guide

Genome Browsers Guide Genome Browsers Guide Take a Class This guide supports the Galter Library class called Genome Browsers. See our Classes schedule for the next available offering. If this class is not on our upcoming schedule,

More information

DREM. Dynamic Regulatory Events Miner (v1.0.9b) User Manual

DREM. Dynamic Regulatory Events Miner (v1.0.9b) User Manual DREM Dynamic Regulatory Events Miner (v1.0.9b) User Manual Jason Ernst (jernst@cs.cmu.edu) Ziv Bar-Joseph Machine Learning Department School of Computer Science Carnegie Mellon University Contents 1 Introduction

More information

De novo genome assembly

De novo genome assembly BioNumerics Tutorial: De novo genome assembly 1 Aims This tutorial describes a de novo assembly of a Staphylococcus aureus genome, using single-end and pairedend reads generated by an Illumina R Genome

More information

COMOS. Lifecycle 3D Integration Operation. COMOS PDMS Integration 1. Material management 2. COMOS 3D viewing 3. References 4.

COMOS. Lifecycle 3D Integration Operation. COMOS PDMS Integration 1. Material management 2. COMOS 3D viewing 3. References 4. 1 Material management 2 COMOS Lifecycle COMOS 3D viewing 3 References 4 Operating Manual 03/2017 V 10.2.1 A5E37098336-AB Legal information Warning notice system This manual contains notices you have to

More information

Lecture 5 Advanced BLAST

Lecture 5 Advanced BLAST Introduction to Bioinformatics for Medical Research Gideon Greenspan gdg@cs.technion.ac.il Lecture 5 Advanced BLAST BLAST Recap Sequence Alignment Complexity and indexing BLASTN and BLASTP Basic parameters

More information

SUM - This says to add together cells F28 through F35. Notice that it will show your result is

SUM - This says to add together cells F28 through F35. Notice that it will show your result is COUNTA - The COUNTA function will examine a set of cells and tell you how many cells are not empty. In this example, Excel analyzed 19 cells and found that only 18 were not empty. COUNTBLANK - The COUNTBLANK

More information

Step-by-Step Guide to Relatedness and Association Mapping Contents

Step-by-Step Guide to Relatedness and Association Mapping Contents Step-by-Step Guide to Relatedness and Association Mapping Contents OBJECTIVES... 2 INTRODUCTION... 2 RELATEDNESS MEASURES... 2 POPULATION STRUCTURE... 6 Q-K ASSOCIATION ANALYSIS... 10 K MATRIX COMPRESSION...

More information

ClueGO - CluePedia Frequently asked questions

ClueGO - CluePedia Frequently asked questions ClueGO - CluePedia Frequently asked questions Gabriela Bindea, Bernhard Mlecnik Laboratory of Integrative Cancer Immunology INSERM U872 Cordeliers Research Center Paris, France Contents License...............................................................

More information

Performing whole genome SNP analysis with mapping performed locally

Performing whole genome SNP analysis with mapping performed locally BioNumerics Tutorial: Performing whole genome SNP analysis with mapping performed locally 1 Introduction 1.1 An introduction to whole genome SNP analysis A Single Nucleotide Polymorphism (SNP) is a variation

More information

ViTraM: VIsualization of TRAnscriptional Modules

ViTraM: VIsualization of TRAnscriptional Modules ViTraM: VIsualization of TRAnscriptional Modules Version 1.0 June 1st, 2009 Hong Sun, Karen Lemmens, Tim Van den Bulcke, Kristof Engelen, Bart De Moor and Kathleen Marchal KULeuven, Belgium 1 Contents

More information

Differential Expression Analysis at PATRIC

Differential Expression Analysis at PATRIC Differential Expression Analysis at PATRIC The following step- by- step workflow is intended to help users learn how to upload their differential gene expression data to their private workspace using Expression

More information

Topaz Workbench Data Visualizer User Guide

Topaz Workbench Data Visualizer User Guide Topaz Workbench Data Visualizer User Guide Table of Contents Displaying Properties... 1 Entering Java Regular Expressions in Filter Fields... 3 Related Topics... 3 Exporting the Extract Trace Events View...

More information

MetaPhyler Usage Manual

MetaPhyler Usage Manual MetaPhyler Usage Manual Bo Liu boliu@umiacs.umd.edu March 13, 2012 Contents 1 What is MetaPhyler 1 2 Installation 1 3 Quick Start 2 3.1 Taxonomic profiling for metagenomic sequences.............. 2 3.2

More information

Gegenees genome format...7. Gegenees comparisons...8 Creating a fragmented all-all comparison...9 The alignment The analysis...

Gegenees genome format...7. Gegenees comparisons...8 Creating a fragmented all-all comparison...9 The alignment The analysis... User Manual: Gegenees V 1.1.0 What is Gegenees?...1 Version system:...2 What's new...2 Installation:...2 Perspectives...4 The workspace...4 The local database...6 Populate the local database...7 Gegenees

More information

CLC Sequence Viewer USER MANUAL

CLC Sequence Viewer USER MANUAL CLC Sequence Viewer USER MANUAL Manual for CLC Sequence Viewer 8.0.0 Windows, macos and Linux June 1, 2018 This software is for research purposes only. QIAGEN Aarhus Silkeborgvej 2 Prismet DK-8000 Aarhus

More information

Tutorial: How to use the Wheat TILLING database

Tutorial: How to use the Wheat TILLING database Tutorial: How to use the Wheat TILLING database Last Updated: 9/7/16 1. Visit http://dubcovskylab.ucdavis.edu/wheat_blast to go to the BLAST page or click on the Wheat BLAST button on the homepage. 2.

More information

01 Launching Delft FEWS Delft-FEWS User Guide

01 Launching Delft FEWS Delft-FEWS User Guide 02 FEWS Explorer 01 Launching Delft FEWS Delft-FEWS User Guide 03 Dropdown Menu FEWS Explorer Overview Map display Filters Drop down menus File - Export timeseries Button bar Log Panel Status Bar Map Display

More information

Getting to know Blast2GO. Functional annotation: from sequences to functional labels

Getting to know Blast2GO. Functional annotation: from sequences to functional labels Getting to know Blast2GO Functional annotation: from sequences to functional labels Outline Concepts on Functional Annotation: Biological Databases Blast2GO annotation strategy ------------------------------------------------------------------The

More information

QIAseq DNA V3 Panel Analysis Plugin USER MANUAL
