Software review. Biomolecular Interaction Network Database


Keywords: protein interactions, visualisation, biology data integration, web access

Abstract

This software review looks at the utility of the Biomolecular Interaction Network Database (BIND) as a web database. BIND offers methods common to related biology databases and specialisations for its protein interaction data. Searching and browsing this database is easy and well integrated with the underlying data and the needs of scientists. Interaction networks are visualised with software that offers many useful options. The innovative ontoglyphs are used throughout to provide visual cues to protein functions, localisation and other aspects one needs to know for this data set. One can expect to get useful results that may be well integrated with one's research needs.

INTRODUCTION

Web-accessible databases are increasingly the primary sources of data and information for many biologists in their research. These cover a gamut of bioinformation from literature (PubMed, journals), sequences (NCBI, EBI), genomes (model organism and other databases), gene expression, molecular dynamics and proteomics, taxonomic and phylogenetics databases and others. To support access and use of these databases, the database maintainers develop or deploy web software ("webware") to let you search, retrieve and analyse their data. Currently much of this database webware is developed for each individual project. There are efforts to produce and use among bio-databases a set of common methods, database tools and webware, such as the Generic Model Organism Database [1] and Ensembl/BioMart [2].

The Biomolecular Interaction Network Database (BIND) [3] is one web database with important information used by many bioscientists. It contains a wealth of protein interaction data with supporting curated literature and molecular function information.
This includes data automatically captured from high-throughput projects, human-curated information from the scientific literature, as well as data integrated from other biology databases. Started in 1998, BIND is a project of the Blueprint Initiative [4] for public biomolecular data. This resource includes over 175,000 interactions, nearly 3,000 protein complexes and several pathways, drawn from publications and high-throughput experiments.

How well does BIND support access to its data via web functions? Does a biologist find common access methods when using BIND and other web databases? Does BIND offer methods uniquely suited to its data?

WEB DATABASE COMPONENTS

Biology web databases share common components that are needed to make them most useful to scientists. The usefulness of a database depends as much on how fully its webware implements such components as on the value of the underlying data. Table 1 indicates the major components of BIND's web database. This web database was updated significantly in 2005, and offers improved browsing and search features, a new downloading interface, taxonomy identifier searching and expanded use of ontoglyphs.

194 & HENRY STEWART PUBLICATIONS BRIEFINGS IN BIOINFORMATICS. VOL 6. NO JUNE 2005

Table 1: BIND database web components

Component            Options                                  Comments
Browsing             All or limited by types, taxonomy        Usable access to all interactions for discovery
Searches             Any text, fields, IDs, BLAST             Extensive, appropriate choices
Results              View formats, export formats             Filter results; missing refine-search
Interaction reports  Pairs, complexes, graphs                 Highly interactive and customisable
Submit data                                                   Not reviewed
Data exports         All data contents, multiple formats      Easy to get any or all data in useful formats
Documents            Tutorials, FAQ, publications             Good selection, more complete user manual would help
Related data (BIND)  PreBIND, SMID                            Well integrated, supports main data
External data        Imports from sequence, molecular and     Strong external integration
                     genome databases, gene ontologies,
                     external IDs

Searching and browsing

BIND offers several search options: simple text searches, searches using IDs from several databases, and searches by specific data fields. BLAST searches of proteins in the database are also a choice. One can also start without search questions, by browsing through all data. This is handy for new customers who want to see what is available. The searches allow you to focus on all aspects of this data set: literature information, molecule structure, gene information including functions, IDs and sequences, and taxonomy.

A search can return hundreds or possibly many thousands of interaction results. There are methods to limit or filter these to a smaller interesting set, or one can repeat a search with other terms. A method to refine a search by adding new limiting terms, as one finds in other web databases, would be a useful addition.

A specialised PreBIND data set is available for searching. It uses a supervised learning algorithm (a support vector machine, SVM) to search for interactions described in the literature. This will find papers that may not contain obvious, simple references to interactions.
This is an example of applying newer data-mining techniques to biological data, yielding more useful results than the simpler database and text search methods in common use.

Visualising molecular interactions

The BIND Interaction viewer shows interaction networks for complexes. The basic display is a network graph of relations between molecules, with glyphs for each molecule designed from the common ontoglyph components. This viewer offers numerous options for customising the view based on protein binding, function and localisation, and for manipulating the network graph. Such customisability allows one to better dissect and extract useful information from complex networks. This viewer comes in two versions, both Java driven. The newer version 3 is a Java application that will run on most computers with a Java runtime system installed. Version 2 is designed as an applet whose function is dependent on quirks of web browser versions.

Ontoglyphs

Ontoglyphs are a special feature of BIND that provide pictorial representations of gene ontology and product information, as shown in Figure 1. These can be used in filtering results, data summaries and other parts of the web access. Some 83 primitive symbols (glyphs) are used to represent function, binding and cellular localisation. Several of these glyphs are combined to represent a molecule's overall function in a pictorial way that is rather intuitive. This allows one to select and see related molecular interactions more readily than by reading text descriptions.

Figure 1: Ontoglyph example. The example protein (CD28) ontoglyph at bottom has functions, binding and cell location properties as identified by several component glyphs at top. Glyph labels: Protein Binding; Protein Synthesis, Processing; Viral life cycle; Signal transduction; Death and Regulation; Defence/immune response; Cell multiplication; Cell periphery.

Overall usability

The many components of this web database system are well integrated with each other. These include common user-interface components, such as paged results and listings, common export-this-view sections, drop-down expansion of information, and related information links. The overall web page design is clean, not cluttered with extraneous information, and focuses on the presentation of interactions, which can be rather complex. The graphic ontoglyphs are a common feature that, once learned, is an invaluable aid to reduce complexity and to focus on those components a scientist is most interested in.

Integration with other databases

BIND imports data from several other databases with protein information, including model organism genome projects for mouse, fruitfly and yeast. High-throughput experiments with hundreds or thousands of protein interactions are targeted by BIND engineers, who developed the necessary tools to import each data set. Primary databases with protein-related information that are drawn on include NCBI sequences and structure, PubMed literature, Gene Ontology and others. Molecular structures from the MMDB [5] protein structure database are imported. Identifiers from several chemical and molecular registries are included, such as CAS and Beilstein.

The Small Molecule Interaction Database (SMID) is a related database at Blueprint that focuses on small molecule components of proteins.
The recently released SMID-Genomes section provides a highly useful integration of genome sequence data and small molecules from the PDB [6]. With this web database, a researcher can find molecules unique to a given organism genome, or molecules shared among various species. One can select up to five organisms, and find all the small molecules that are either unique to them, or shared in common among them. The ability to screen molecules by species through this web database has numerous uses. One important use is identifying pharmaceutical and agricultural target molecules. For instance, insect pesticides benign to a crop plant may be identified by finding molecules used in Drosophila that are not found in Arabidopsis.

Documentation and help

The web site provides a range of documentation on using BIND, how BIND data are collected and curated, lists of data sources and how these are integrated. This includes tutorial documents for getting started, and getting the most use out of this database. Descriptions and schema for the data structures, database and software tools are available. Publications on this database and answers to frequently asked questions are available. The tutorials provided did not appear to cover all current features, results, viewers and functions of this web database. Though many aspects can be learned by use, a more complete user manual would be a welcome addition.

DISCUSSION

The BIND web database offers a well-integrated, fully featured portal into an important resource for understanding biomolecular interactions. Its innovative features and common components for finding, using and exporting its data provide a useful example for bioinformatics web databases. A scientist can expect to find what they are looking for if it exists in this database, and also to get data out of it into a spreadsheet or other analysis tools with little problem. There are features that make it possible to start from other databases and get the results one wants, or to take results from this database to another database. Support for such cross-database travel is essential to many researchers today.

Web databases that centre on interactions, whether of molecules, genes or other factors, need good methods to visualise the interactions, as well as options for selecting, filtering and focusing on those portions of an interaction network that the scientist is most interested in. BIND works towards that need with interaction visualisation software, and many customising options. The Interaction visualiser works well (version 2 failed for this reviewer's web browser, but version 3 worked properly), and allows one to view and select from individual interactions.
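
The selecting, filtering and focusing operations mentioned above can be pictured with a small sketch. It is only an illustration of the general idea and does not reflect how the BIND viewer is implemented: the molecule names, the localisation labels and the functions `build_graph`, `neighbourhood` and `filter_by_localisation` are all hypothetical, and the data are invented rather than taken from BIND.

```python
from collections import defaultdict, deque

# Hypothetical interaction pairs and annotations, invented for illustration;
# they are not records taken from BIND.
INTERACTIONS = [
    ("A", "B"),
    ("B", "C"),
    ("C", "D"),
    ("A", "E"),
    ("F", "G"),
]
LOCALISATION = {
    "A": "cell periphery",
    "B": "cell periphery",
    "C": "nucleus",
    "D": "nucleus",
    "E": "cell periphery",
    "F": "nucleus",
    "G": "nucleus",
}


def build_graph(pairs):
    """Return an undirected adjacency map from a list of interaction pairs."""
    graph = defaultdict(set)
    for left, right in pairs:
        graph[left].add(right)
        graph[right].add(left)
    return graph


def neighbourhood(graph, seed, max_steps):
    """Collect molecules reachable from seed within max_steps interactions."""
    seen = {seed}
    queue = deque([(seed, 0)])
    while queue:
        molecule, steps = queue.popleft()
        if steps == max_steps:
            continue  # do not expand beyond the requested distance
        for partner in graph.get(molecule, ()):
            if partner not in seen:
                seen.add(partner)
                queue.append((partner, steps + 1))
    return seen


def filter_by_localisation(pairs, annotations, wanted):
    """Keep only interactions whose two molecules both carry the wanted label."""
    return [
        (left, right)
        for left, right in pairs
        if annotations.get(left) == wanted and annotations.get(right) == wanted
    ]
```

Focusing on one molecule then amounts to a call such as `neighbourhood(build_graph(INTERACTIONS), "A", 1)`, while `filter_by_localisation(INTERACTIONS, LOCALISATION, "nucleus")` narrows the network by an annotation, much as a glyph-based filter narrows a result list.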
The web page results and reporting methods provide good support for finding and understanding interaction information. This project's innovation with ontoglyphs, for indicating components of molecules and interactions, is one that makes this service especially useful. This innovation could be adopted by other biology web databases to their advantage. One aspect that is missing or still in development at BIND is a method for searching graphically based on molecular structures in a more extensive way than the ontoglyphs and related methods allow. BIND's related small molecule SMID-Genomes database offers exceptional promise for species-oriented molecule and interaction discovery, and one can expect to see future integrations with it.

Related web databases for protein information include the Protein Data Bank (PDB) [6], NCBI's MMDB [5] and related protein databases, along with numerous others. KEGG [7] is a widely used database that integrates genes and protein pathways, along with chemical compound structures. The BRITE subsidiary of KEGG is a useful protein interaction data set that is integrated with other web search and protein data-reporting functions at KEGG.

IntAct [8] is a protein interaction database with a similar basic goal to BIND, with interactions derived from literature curation and user submissions. This is a newer and smaller database, with some 50,000 pairs of interactions, and operates as a collaboration of several European database groups, including Max-Planck-Institut, Swiss Institute of Bioinformatics and the EBI. The web access to IntAct centres on database searches by several attributes: gene names, InterPro, SwissProt, Gene Ontology and PubMed identifiers. Results include tabular lists of paired interactions, along with descriptions, curated annotations and links to related protein interactions. Interaction networks are visualised with graphs that are similar in design to BIND.
This service uses plain web images (GIF, JPEG) instead of a Java application as at BIND. This offers usability to a broader range of customers, while sacrificing the user interactivity and customisability that the BIND viewer provides. Both of these services provide basic data output in XML formats. The BIND service provides a broader range of result listings and data export options, including ID lists, tabular forms and FASTA sequences. Software of both projects is available as open source, although possibly not in full. Both projects provide all data they curate and collate for public reuse without restrictions.

The Blueprint Initiative and BIND have grown out of work by principal investigator Christopher Hogue, which includes related databases and software tools for biomolecular data. SeqHound is one of the popular adjunct programs from this group. It is a database of common public biological sequences and structures, along with software for efficient updating and rapid access to this collection. NBLAST, for network/cluster BLAST analyses, and a Distributed Folding Project are some of the other useful bioinformatics tools from this group.

Acknowledgments

This work is supported in part by NIH grant 1R01HG to the author.

Don Gilbert
Biology Department, Indiana University, Bloomington, Indiana 47405, USA
Tel:
gilbertd@indiana.edu

References

1. Stein, L. D., Mungall, C., Shu, S. et al. (2002), The generic genome browser: A building block for a model organism system database, Genome Res., Vol. 12, pp (URL:
2. Kasprzyk, A., Keefe, D., Smedley, D. et al. (2004), EnsMart: A generic system for fast and flexible access to biological data, Genome Res., Vol. 14, pp (URL:
3. Alfarano, C., Andrade, C. E., Anthony, K. et al. (2005), The Biomolecular Interaction Network Database and related tools 2005 update, Nucleic Acids Res., Vol. 33, pp. D418-D424 (Database issue) (URL:
4. URL:
5. Chen, J., Anderson, J. B., DeWeese-Scott, C. et al. (2003), MMDB: Entrez's 3D-structure database, Nucleic Acids Res., Vol. 31, pp (URL: Structure/MMDB/mmdb.shtml).
6. Berman, H. M., Westbrook, J., Feng, Z. et al. (2000), The Protein Data Bank, Nucleic Acids Res., Vol. 28, pp (URL:
7. Kanehisa, M. and Goto, S. (2000), KEGG: Kyoto Encyclopedia of Genes and Genomes, Nucleic Acids Res., Vol. 28, pp (URL:
8. Hermjakob, H., Montecchi-Palazzi, L., Lewington, C. et al. (2004), IntAct: an open source molecular interaction database, Nucleic Acids Res., Vol. 32, pp. D452-D455 (URL:

Software review. Shopping in the genome market with EnsMart

Software review. Shopping in the genome market with EnsMart Shopping in the genome market with EnsMart Keywords: genome databases, human genome, comparative genomics, data mining, open source software Abstract Life scientists who work with the supermarket of genome

More information

Bioinformatics Hubs on the Web

Bioinformatics Hubs on the Web Bioinformatics Hubs on the Web Take a class The Galter Library teaches a related class called Bioinformatics Hubs on the Web. See our Classes schedule for the next available offering. If this class is

More information

Topics of the talk. Biodatabases. Data types. Some sequence terminology...

Topics of the talk. Biodatabases. Data types. Some sequence terminology... Topics of the talk Biodatabases Jarno Tuimala / Eija Korpelainen CSC What data are stored in biological databases? What constitutes a good database? Nucleic acid sequence databases Amino acid sequence

More information

mpmorfsdb: A database of Molecular Recognition Features (MoRFs) in membrane proteins. Introduction

mpmorfsdb: A database of Molecular Recognition Features (MoRFs) in membrane proteins. Introduction mpmorfsdb: A database of Molecular Recognition Features (MoRFs) in membrane proteins. Introduction Molecular Recognition Features (MoRFs) are short, intrinsically disordered regions in proteins that undergo

More information

INTRODUCTION TO BIOINFORMATICS

INTRODUCTION TO BIOINFORMATICS Molecular Biology-2019 1 INTRODUCTION TO BIOINFORMATICS In this section, we want to provide a simple introduction to using the web site of the National Center for Biotechnology Information NCBI) to obtain

More information

INTRODUCTION TO BIOINFORMATICS

INTRODUCTION TO BIOINFORMATICS Molecular Biology-2017 1 INTRODUCTION TO BIOINFORMATICS In this section, we want to provide a simple introduction to using the web site of the National Center for Biotechnology Information NCBI) to obtain

More information

NCBI News, November 2009

NCBI News, November 2009 Peter Cooper, Ph.D. NCBI cooper@ncbi.nlm.nh.gov Dawn Lipshultz, M.S. NCBI lipshult@ncbi.nlm.nih.gov Featured Resource: New Discovery-oriented PubMed and NCBI Homepage The NCBI Site Guide A new and improved

More information

Structural Bioinformatics

Structural Bioinformatics Structural Bioinformatics Elucidation of the 3D structures of biomolecules. Analysis and comparison of biomolecular structures. Prediction of biomolecular recognition. Handles three-dimensional (3-D) structures.

More information

HymenopteraMine Documentation

HymenopteraMine Documentation HymenopteraMine Documentation Release 1.0 Aditi Tayal, Deepak Unni, Colin Diesh, Chris Elsik, Darren Hagen Apr 06, 2017 Contents 1 Welcome to HymenopteraMine 3 1.1 Overview of HymenopteraMine.....................................

More information

Discovery Net : A UK e-science Pilot Project for Grid-based Knowledge Discovery Services. Patrick Wendel Imperial College, London

Discovery Net : A UK e-science Pilot Project for Grid-based Knowledge Discovery Services. Patrick Wendel Imperial College, London Discovery Net : A UK e-science Pilot Project for Grid-based Knowledge Discovery Services Patrick Wendel Imperial College, London Data Mining and Exploration Middleware for Distributed and Grid Computing,

More information

BovineMine Documentation

BovineMine Documentation BovineMine Documentation Release 1.0 Deepak Unni, Aditi Tayal, Colin Diesh, Christine Elsik, Darren Hag Oct 06, 2017 Contents 1 Tutorial 3 1.1 Overview.................................................

More information

3DProIN: Protein-Protein Interaction Networks and Structure Visualization

3DProIN: Protein-Protein Interaction Networks and Structure Visualization Columbia International Publishing American Journal of Bioinformatics and Computational Biology doi:10.7726/ajbcb.2014.1003 Research Article 3DProIN: Protein-Protein Interaction Networks and Structure Visualization

More information

BIOINFORMATICS A PRACTICAL GUIDE TO THE ANALYSIS OF GENES AND PROTEINS

BIOINFORMATICS A PRACTICAL GUIDE TO THE ANALYSIS OF GENES AND PROTEINS BIOINFORMATICS A PRACTICAL GUIDE TO THE ANALYSIS OF GENES AND PROTEINS EDITED BY Genome Technology Branch National Human Genome Research Institute National Institutes of Health Bethesda, Maryland B. F.

More information

Lecture 5. Functional Analysis with Blast2GO Enriched functions. Kegg Pathway Analysis Functional Similarities B2G-Far. FatiGO Babelomics.

Lecture 5. Functional Analysis with Blast2GO Enriched functions. Kegg Pathway Analysis Functional Similarities B2G-Far. FatiGO Babelomics. Lecture 5 Functional Analysis with Blast2GO Enriched functions FatiGO Babelomics FatiScan Kegg Pathway Analysis Functional Similarities B2G-Far 1 Fisher's Exact Test One Gene List (A) The other list (B)

More information

Viewing Molecular Structures

Viewing Molecular Structures Viewing Molecular Structures Proteins fulfill a wide range of biological functions which depend upon their three dimensional structures. Therefore, deciphering the structure of proteins has been the quest

More information

About the Edinburgh Pathway Editor:

About the Edinburgh Pathway Editor: About the Edinburgh Pathway Editor: EPE is a visual editor designed for annotation, visualisation and presentation of wide variety of biological networks, including metabolic, genetic and signal transduction

More information

TBtools, a Toolkit for Biologists integrating various HTS-data

TBtools, a Toolkit for Biologists integrating various HTS-data 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 TBtools, a Toolkit for Biologists integrating various HTS-data handling tools with a user-friendly interface Chengjie Chen 1,2,3*, Rui Xia 1,2,3, Hao Chen 4, Yehua

More information

Using WebGBrowse to Visualize Genome Annotation on GBrowse

Using WebGBrowse to Visualize Genome Annotation on GBrowse Protocol Using WebGBrowse to Visualize Genome Annotation on GBrowse Ram Podicheti and Qunfeng Dong 1 Center for Genomics and Bioinformatics, Indiana University, Bloomington, IN 47405, USA INTRODUCTION

More information

The LAILAPS Search Engine - A Feature Model for Relevance Ranking in Life Science Databases

The LAILAPS Search Engine - A Feature Model for Relevance Ranking in Life Science Databases International Symposium on Integrative Bioinformatics 2010 The LAILAPS Search Engine - A Feature Model for Relevance Ranking in Life Science Databases M Lange, K Spies, C Colmsee, S Flemming, M Klapperstück,

More information

Complex Query Formulation Over Diverse Information Sources Using an Ontology

Complex Query Formulation Over Diverse Information Sources Using an Ontology Complex Query Formulation Over Diverse Information Sources Using an Ontology Robert Stevens, Carole Goble, Norman Paton, Sean Bechhofer, Gary Ng, Patricia Baker and Andy Brass Department of Computer Science,

More information

The beginning of this guide offers a brief introduction to the Protein Data Bank, where users can download structure files.

The beginning of this guide offers a brief introduction to the Protein Data Bank, where users can download structure files. Structure Viewers Take a Class This guide supports the Galter Library class called Structure Viewers. See our Classes schedule for the next available offering. If this class is not on our upcoming schedule,

More information

SELF-SERVICE SEMANTIC DATA FEDERATION

SELF-SERVICE SEMANTIC DATA FEDERATION SELF-SERVICE SEMANTIC DATA FEDERATION WE LL MAKE YOU A DATA SCIENTIST Contact: IPSNP Computing Inc. Chris Baker, CEO Chris.Baker@ipsnp.com (506) 721 8241 BIG VISION: SELF-SERVICE DATA FEDERATION Biomedical

More information

Geneious 5.6 Quickstart Manual. Biomatters Ltd

Geneious 5.6 Quickstart Manual. Biomatters Ltd Geneious 5.6 Quickstart Manual Biomatters Ltd October 15, 2012 2 Introduction This quickstart manual will guide you through the features of Geneious 5.6 s interface and help you orient yourself. You should

More information

Genome Browsers Guide

Genome Browsers Guide Genome Browsers Guide Take a Class This guide supports the Galter Library class called Genome Browsers. See our Classes schedule for the next available offering. If this class is not on our upcoming schedule,

More information

2) NCBI BLAST tutorial This is a users guide written by the education department at NCBI.

2) NCBI BLAST tutorial   This is a users guide written by the education department at NCBI. Web resources -- Tour. page 1 of 8 This is a guided tour. Any homework is separate. In fact, this exercise is used for multiple classes and is publicly available to everyone. The entire tour will take

More information

IPA: networks generation algorithm

IPA: networks generation algorithm IPA: networks generation algorithm Dr. Michael Shmoish Bioinformatics Knowledge Unit, Head The Lorry I. Lokey Interdisciplinary Center for Life Sciences and Engineering Technion Israel Institute of Technology

More information

Exploring and Exploiting the Biological Maze. Presented By Vidyadhari Edupuganti Advisor Dr. Zoe Lacroix

Exploring and Exploiting the Biological Maze. Presented By Vidyadhari Edupuganti Advisor Dr. Zoe Lacroix Exploring and Exploiting the Biological Maze Presented By Vidyadhari Edupuganti Advisor Dr. Zoe Lacroix Motivation An abundance of biological data sources contain data about scientific entities, such as

More information

Bioinformatics approach for exploring MS/MS proteomics data

Bioinformatics approach for exploring MS/MS proteomics data Bioinformatics approach for exploring MS/MS proteomics data Mudita Singhal, Kyle Klicker, George Chin, Lynn Trease, Eric Stephan, Deborah Gracio Computational Sciences and Mathematics, Pacific Northwest

More information

Genomic pathways database and biological data management

Genomic pathways database and biological data management SHORT COMMUNICATION Genomic pathways database and biological data management Z. M. Ozsoyoglu*,, G. Ozsoyoglu*, and J. Nadeau*, *Center for Computational Genomics, Case Western Reserve University (CWRU),

More information

Enabling Open Science: Data Discoverability, Access and Use. Jo McEntyre Head of Literature Services

Enabling Open Science: Data Discoverability, Access and Use. Jo McEntyre Head of Literature Services Enabling Open Science: Data Discoverability, Access and Use Jo McEntyre Head of Literature Services www.ebi.ac.uk About EMBL-EBI Part of the European Molecular Biology Laboratory International, non-profit

More information

Genome Browsers - The UCSC Genome Browser

Genome Browsers - The UCSC Genome Browser Genome Browsers - The UCSC Genome Browser Background The UCSC Genome Browser is a well-curated site that provides users with a view of gene or sequence information in genomic context for a specific species,

More information

e-scider: A tool to retrieve, prioritize and analyze the articles from PubMed database Sujit R. Tangadpalliwar 1, Rakesh Nimbalkar 2, Prabha Garg* 3

e-scider: A tool to retrieve, prioritize and analyze the articles from PubMed database Sujit R. Tangadpalliwar 1, Rakesh Nimbalkar 2, Prabha Garg* 3 e-scider: A tool to retrieve, prioritize and analyze the articles from PubMed database Sujit R. Tangadpalliwar 1, Rakesh Nimbalkar 2, Prabha Garg* 3 1 National Institute of Pharmaceutical Education and

More information

Data Mining Technologies for Bioinformatics Sequences

Data Mining Technologies for Bioinformatics Sequences Data Mining Technologies for Bioinformatics Sequences Deepak Garg Computer Science and Engineering Department Thapar Institute of Engineering & Tecnology, Patiala Abstract Main tool used for sequence alignment

More information

SciVerse ScienceDirect. User Guide. October SciVerse ScienceDirect. Open to accelerate science

SciVerse ScienceDirect. User Guide. October SciVerse ScienceDirect. Open to accelerate science SciVerse ScienceDirect User Guide October 2010 SciVerse ScienceDirect Open to accelerate science Welcome to SciVerse ScienceDirect: How to get the most from your subscription SciVerse ScienceDirect is

More information

Integrated Access to Biological Data. A use case

Integrated Access to Biological Data. A use case Integrated Access to Biological Data. A use case Marta González Fundación ROBOTIKER, Parque Tecnológico Edif 202 48970 Zamudio, Vizcaya Spain marta@robotiker.es Abstract. This use case reflects the research

More information

Biostatistics and Bioinformatics Molecular Sequence Databases

Biostatistics and Bioinformatics Molecular Sequence Databases . 1 Description of Module Subject Name Paper Name Module Name/Title 13 03 Dr. Vijaya Khader Dr. MC Varadaraj 2 1. Objectives: In the present module, the students will learn about 1. Encoding linear sequences

More information

Finding and Exporting Data. BioMart

Finding and Exporting Data. BioMart September 2017 Finding and Exporting Data Not sure what tool to use to find and export data? BioMart is used to retrieve data for complex queries, involving a few or many genes or even complete genomes.

More information

CAP BIOINFORMATICS Su-Shing Chen CISE. 8/19/2005 Su-Shing Chen, CISE 1

CAP BIOINFORMATICS Su-Shing Chen CISE. 8/19/2005 Su-Shing Chen, CISE 1 CAP 5510-2 BIOINFORMATICS Su-Shing Chen CISE 8/19/2005 Su-Shing Chen, CISE 1 Building Local Genomic Databases Genomic research integrates sequence data with gene function knowledge. Gene ontology to represent

More information

Facilitating Semantic Alignment of EBI Resources

Facilitating Semantic Alignment of EBI Resources Facilitating Semantic Alignment of EBI Resources 17 th March, 2017 Tony Burdett Technical Co-ordinator Samples, Phenotypes and Ontologies Team www.ebi.ac.uk What is EMBL-EBI? Europe s home for biological

More information

BioExtract Server User Manual

BioExtract Server User Manual BioExtract Server User Manual University of South Dakota About Us The BioExtract Server harnesses the power of online informatics tools for creating and customizing workflows. Users can query online sequence

More information

Protein Data Bank Japan
