BHSAI Biotechnology HPC Software Applications Institute
QuartetS-DB An Orthology Database for Species
User's Guide
May 0
The QuartetS database (QuartetS-DB) contains orthology predictions for species ( bacterial, 9 archaeal, and eukaryotic) distributed across phyla, which cover more than seven million proteins and four million pairwise orthologs. The Web interface of QuartetS-DB provides features for browsing, querying, and downloading orthology information that together may not be readily available elsewhere. These include: 1) a user-specified cutoff parameter to tailor its application by balancing prediction accuracy and coverage (the user can choose to obtain fewer, more accurate ortholog predictions, or more, less accurate ortholog predictions); 2) the ability to retrieve a list of all orthologs across multiple, user-specified genomes (a convenient feature for comparative studies); and 3) the ability to browse more than 000 gene trees of the corresponding orthologous groups, including large trees covering over 900 taxa, a desirable feature in evolutionary studies of protein families across species.

This brief guide provides step-by-step instructions for four types of applications: 1) identify orthologs between two species; 2) identify all orthologs among multiple species; 3) identify orthologous groups that contain proteins which meet a user-specified criterion; and 4) identify inparalogs within one species.
Application 1: Identify orthologs between two species

Description: The user selects two species and the system returns a list of the corresponding pairwise orthologs.

1. Select the Pairwise Orthologs link.
2. Select two species from the two drop-down lists.
3. Set a QuartetS cutoff value (the default is 0, and setting a smaller value will result in fewer, but more accurate, pairwise orthologs).
4. Adjust the number of pairwise orthologs to be displayed on one page and use the page-navigation buttons to display the selected information.
5. Follow the links to access additional information for each protein from external resources, such as the National Center for Biotechnology Information and UniProt.
6. If a protein has inparalogs, follow the link to view them.
7. Press the Download link to export pairwise orthologs with or without inparalog information.

[Figure: Application 1]
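The effect of the cutoff in Step 3 can be illustrated with a minimal sketch. This guide does not describe how QuartetS scores a pair or what the download file looks like, so everything here is an assumption made for illustration: the `OrthologPair` record, its `score` field, the example values, and the rule that a pair is reported when its score does not exceed the cutoff (chosen only because it reproduces the stated behavior that a smaller cutoff returns fewer pairs).

```python
from typing import List, NamedTuple


class OrthologPair(NamedTuple):
    """Hypothetical record for one predicted pairwise ortholog."""
    protein_a: str
    protein_b: str
    score: float  # hypothetical per-pair value; not the real QuartetS score


def filter_pairs(pairs: List[OrthologPair], cutoff: float) -> List[OrthologPair]:
    """Keep the pairs whose hypothetical score does not exceed the cutoff.

    Assumption: lowering the cutoff can only remove pairs, which mirrors
    the guide's statement that a smaller value gives fewer predictions.
    """
    return [pair for pair in pairs if pair.score <= cutoff]


# Made-up example data for two species, A and B.
pairs = [
    OrthologPair("A_prot1", "B_prot1", 0.1),
    OrthologPair("A_prot2", "B_prot2", 0.4),
    OrthologPair("A_prot3", "B_prot3", 0.9),
]

strict = filter_pairs(pairs, cutoff=0.2)   # fewer pairs
relaxed = filter_pairs(pairs, cutoff=0.5)  # more pairs

print(len(strict), len(relaxed))
```

Every pair returned under the stricter cutoff is also returned under the more relaxed one, which is the accuracy-versus-coverage trade-off the cutoff is meant to expose.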
Application 2: Identify all orthologs among multiple species

Description: The user selects a set of species and the system returns a list of orthologous groups, where each group contains at least one protein from the selected species.

1. Select the Orthologous Groups link.
2. Select multiple entries from the three species lists for bacteria, archaea, and eukaryotes, respectively (to select non-contiguous species in a list, press and hold the Ctrl key on your keyboard).
3. Press the Search ALL / Search ANY button while leaving the Criterion box empty to retrieve groups of orthologs in ALL/ANY of the selected species.
4. Adjust the number of orthologous groups to be displayed on one page and use the page-navigation buttons to display the selected information.
5. Press Tabular View or List View to switch between the two ways to view multiple orthologous groups (the Tabular View displays a maximum of 0 species).
6. Follow the Group ID link in either Tabular View or List View to view detailed information about each orthologous group in a separate, single-group page.
7. Follow the View Gene Tree link in the List View (or on the single-group page) to view the corresponding gene tree (each group has two links: one for viewing the entire gene tree, and one for viewing a portion of the tree containing the user-selected species).
8. Follow other links (in the Tabular View or on the single-group page) to access information from external resources.
9. Download the list of orthologous groups in a list format or a table format via links in the List View or Tabular View.
10. Download the list of orthologs with functional descriptions for a specific group via the link on that group's page.
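The difference between Search ALL and Search ANY in Step 3 can be sketched with set operations. This is an illustration of the selection logic only, not the database's implementation; the `search_groups` function, the group identifiers, and the species and protein names are all hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical in-memory layout: group ID -> list of (species, protein) members.
Groups = Dict[str, List[Tuple[str, str]]]


def search_groups(groups: Groups, selected_species: Iterable[str],
                  mode: str = "ALL") -> List[str]:
    """Return IDs of groups with members in ALL or ANY of the selected species."""
    if mode not in ("ALL", "ANY"):
        raise ValueError("mode must be 'ALL' or 'ANY'")
    selected = set(selected_species)
    if not selected:
        return []  # nothing selected, nothing to report
    hits = []
    for group_id, members in groups.items():
        # Selected species that are represented in this group.
        present = {species for species, _protein in members} & selected
        if mode == "ALL":
            matched = present == selected   # every selected species is present
        else:
            matched = bool(present)         # at least one selected species is present
        if matched:
            hits.append(group_id)
    return hits


# Made-up example data.
groups = {
    "G1": [("species_a", "p1"), ("species_b", "p2"), ("species_c", "p3")],
    "G2": [("species_a", "p4")],
    "G3": [("species_d", "p5")],
}
selected = ["species_a", "species_b"]

all_hits = search_groups(groups, selected, mode="ALL")
any_hits = search_groups(groups, selected, mode="ANY")
print(all_hits, any_hits)
```

In this sketch the ALL result is always a subset of the ANY result, so Search ANY is the broader of the two queries.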
[Figure: Applications 2 and 3]
Application 3: Identify orthologous groups that contain specific proteins which meet a user-specified criterion

Description: The user selects a set of species and provides a search criterion (for either proteins or orthologous groups) and the system returns a list of orthologous groups, where each group contains at least one protein from the selected species that satisfies the search criterion.

1. Select the Orthologous Groups link.
2. Select multiple entries from the three species lists (see Application 2).
3. Select the type of search (Search by Protein/Group; see the figure) and enter a search criterion to identify orthologous groups that satisfy the search criterion. For example:

   Search by Protein/Group         Criterion
   Protein GI                      889
   Protein RefSeq Accession        YP_00888
   Protein Gene ID                 79999
   Protein Gene Symbol             prfa
   Protein Locus Tag               APA0_000
   Protein UniProt Accession       AFZ9
   Group ID                        QTS_
   Group Symbol                    prfa
   Functional Description          0S ribosomal protein
   Group GO Description            ATP binding
   Group GO Accession              000

4. Press the Search ALL / Search ANY button to retrieve the groups of orthologs that satisfy the search criterion and contain orthologs in ALL/ANY of the selected species.
5. Refer to Application 2 (Steps 4 to 10) to browse and download the query results.
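The protein-level search described above can be sketched as a filter over groups: a group is reported when at least one of its proteins belongs to a selected species and matches the criterion. The record layout, the field names (`species`, `gene_symbol`), the case-insensitive exact match, and the example data are assumptions for illustration; the guide does not specify how the database matches a criterion.

```python
from typing import Dict, Iterable, List

# Hypothetical layout: group ID -> list of protein records (plain dicts).
Groups = Dict[str, List[Dict[str, str]]]


def groups_matching(groups: Groups, selected_species: Iterable[str],
                    field: str, value: str) -> List[str]:
    """Return IDs of groups with a protein from a selected species matching the criterion."""
    selected = set(selected_species)
    wanted = value.lower()  # assumption: matching ignores letter case
    hits = []
    for group_id, proteins in groups.items():
        for protein in proteins:
            in_selection = protein.get("species") in selected
            satisfies = protein.get(field, "").lower() == wanted
            if in_selection and satisfies:
                hits.append(group_id)
                break  # one matching protein is enough for this group
    return hits


# Made-up example data.
groups = {
    "G1": [
        {"species": "species_a", "gene_symbol": "prfA"},
        {"species": "species_b", "gene_symbol": "prfA"},
    ],
    "G2": [{"species": "species_a", "gene_symbol": "rpsL"}],
    "G3": [{"species": "species_c", "gene_symbol": "prfA"}],
}

hits = groups_matching(groups, ["species_a", "species_b"], "gene_symbol", "prfa")
print(hits)
```

Group G3 is not reported here even though it contains a matching gene symbol, because its only protein comes from a species outside the selection.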
Application 4: Identify inparalog groups within one species

Description: The user selects one species and the system returns a list of inparalog groups, where each group contains two or more proteins that are inparalogs in the selected species.

1. Select the Inparalog Groups link.
2. Select one species from the drop-down list.
3. Adjust the number of inparalog groups to be displayed on one page and use the page-navigation buttons to display the selected information.
4. Follow the links to access additional information for each protein from external resources, such as the National Center for Biotechnology Information and UniProt.
5. Press expand to view all inparalogs in a group that has more inparalogs than are displayed by default.
6. Press the Download link to export the inparalog groups.

[Figure: Application 4]