EBI is an Outstation of the European Molecular Biology Laboratory.
- Scot Murphy
Transcription
1 EBI is an Outstation of the European Molecular Biology Laboratory.
2 InterPro is a database that groups predictive protein signatures together. It combines 11 member databases into a single searchable resource, provides functional analysis of proteins by classifying them into families and predicting domains and important sites, and enables whole-genome analysis.
3 InterPro Consortium Consortium of 11 major signature databases
4 Protein signatures: more sensitive homology searches. Each member database creates signatures using different methods and methodologies: manually created sequence alignments; automatic processes with some human input and correction; or entirely automatic processes.
5 Why do we need predictive annotation tools?
6 What are protein signatures? [Diagram: protein family/domain; multiple sequence alignment; build model; search UniProt; significant match; mature model; protein analysis. Example sequences shown: ITWKGPVCGLDGKTYRNECALL, AVPRSPVCGSDDVTYANECELK.]
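The "alignment, then model, then search" idea on this slide can be illustrated with a toy sketch. It assumes the two sequences shown on the slide are already aligned column by column (they have equal length), and it uses a plain per-column residue count as the "model". This is only a stand-in: the member databases use HMMs, profiles, patterns and fingerprints. The function names `build_profile`, `conserved_columns` and `score` are hypothetical.

```python
from collections import Counter

# The two example sequences from the slide, assumed here to be aligned (no gaps).
ALIGNMENT = [
    "ITWKGPVCGLDGKTYRNECALL",
    "AVPRSPVCGSDDVTYANECELK",
]


def build_profile(alignment):
    """Return one Counter of residue counts per alignment column."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must all have the same length")
    return [Counter(column) for column in zip(*alignment)]


def conserved_columns(profile, n_sequences):
    """Return (0-based position, residue) for columns where all sequences agree."""
    conserved = []
    for pos, counts in enumerate(profile):
        residue, count = counts.most_common(1)[0]
        if count == n_sequences:
            conserved.append((pos, residue))
    return conserved


def score(profile, n_sequences, candidate):
    """Toy score: mean observed frequency of the candidate's residue per column."""
    if len(candidate) != len(profile):
        raise ValueError("candidate must have the same length as the profile")
    total = sum(counts[res] / n_sequences for counts, res in zip(profile, candidate))
    return total / len(profile)


profile = build_profile(ALIGNMENT)
conserved = conserved_columns(profile, len(ALIGNMENT))
```

A candidate sequence that scores close to 1.0 agrees with the alignment in most columns; a real signature method would add background frequencies, gaps and a significance threshold.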
7 Member databases. Methods: hidden Markov models, fingerprints, profiles, patterns, sequence clusters. Structural domains; protein features (active sites); functional annotation of families/domains; prediction of conserved domains.
8 InterPro entry
9 InterPro entry
10 The InterPro entry: types. Family: proteins that share a common evolutionary origin, as reflected in their related functions, sequences or structure. Domain: distinct functional, structural or sequence units that may exist in a variety of biological contexts. Repeats: short sequences typically repeated within a protein. Sites: PTM, active site, binding site, conserved site.
11 InterPro entry: groups similar signatures together; adds extensive annotation; links to other databases; structural information and viewers. Quality control: removes redundancy.
12 InterPro entry: groups similar signatures together; adds extensive annotation; links to other databases; structural information and viewers. Hierarchical classification.
13 InterPro hierarchies: Families. Families can have parent/child relationships with other families. Parent/child relationships are based on: comparison of protein hits (a child should be a subset of its parent; siblings should not have matches in common); existing hierarchies in member databases; biological knowledge of curators.
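The two hit-comparison rules on this slide (a child is a subset of its parent; siblings share no matches) can be expressed as set operations. The following is a minimal sketch, not InterPro's curation tooling; the function name `check_family_hierarchy`, the family names and the protein accessions are all hypothetical.

```python
def check_family_hierarchy(parent_hits, children_hits):
    """Check the two hit-based rules for a parent family and its children.

    parent_hits   -- set of protein accessions matched by the parent entry
    children_hits -- dict mapping child entry name to its set of accessions
    Returns a list of human-readable problems (empty if both rules hold).
    """
    problems = []

    # Rule 1: every child should be a subset of the parent.
    for name, hits in children_hits.items():
        extra = hits - parent_hits
        if extra:
            problems.append(f"{name}: {len(extra)} hit(s) not matched by parent")

    # Rule 2: siblings should not have matches in common.
    names = sorted(children_hits)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            shared = children_hits[first] & children_hits[second]
            if shared:
                problems.append(f"{first} and {second}: {len(shared)} shared hit(s)")

    return problems


# Hypothetical accessions, for illustration only.
parent = {"P1", "P2", "P3", "P4"}
good = {"child_a": {"P1", "P2"}, "child_b": {"P3"}}
bad = {"child_a": {"P1", "P2", "P9"}, "child_b": {"P2", "P3"}}
```

The remaining criteria on the slide (existing member-database hierarchies and curator knowledge) are judgement calls that a check like this cannot capture.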
14 InterPro hierarchies: Domains. Domains can have parent/child relationships with other domains.
15 Domains and families may be linked through the domain organisation hierarchy.
16 InterPro entry: groups similar signatures together; adds extensive annotation; links to other databases; structural information and viewers.
17 InterPro entry: groups similar signatures together; adds extensive annotation; links to other databases; structural information and viewers. The Gene Ontology project provides a controlled vocabulary of terms for describing gene product characteristics.
18 InterPro entry: groups similar signatures together; adds extensive annotation; links to other databases; structural information and viewers. UniProt, KEGG..., Reactome..., IntAct..., UniProt taxonomy, PANDIT..., MEROPS..., Pfam clans..., PubMed.
19 InterPro entry: groups similar signatures together; adds extensive annotation; links to other databases; structural information and viewers. PDB: 3-D structures. SCOP: structural domains. CATH: structural domain classification.
20 InterProScan pipeline: protein sequence and predictive models; analysis algorithm; raw matches; filtering algorithm; reported matches.
21 InterProScan access Interactive: Webservice (SOAP and REST): Downloadable: ftp://ftp.ebi.ac.uk/pub/software/unix/iprscan/
22 Why redesign InterProScan? InterProScan 4: complicated installation; complicated update; limited queuing system (only guaranteed with LSF); limited configurability; reliability.
23 InterProScan 5.0 aims: easy install and configuration; modular; expandable; easily integrated into existing pipelines; incorporate new data model / XML exchange format; easy to port to different architectures (desktop machine, simple LAN, LSF, PBS, Sun Grid Engine... cloud? GRID?); reliability.
24 InterProScan 5 Technology
25 Architecture. Cluster platform, web services, Java API, InterPro website. JMS: monitoring queues. Job management: scheduling analyses. Business logic: performing analyses. Database access: Oracle, PostgreSQL, HSQLDB. XML data model. File I/O: file system. One-way dependencies + replaceable layers = low coupling + maintainable.
26 Java Message Service. Master: schedules tasks and sub-tasks, and places them on the queue. Broker: manages queues and topics; starts workers on demand. Workers: take tasks off queues; each worker performs a task / sub-task and reports back to the Broker. Monitoring & management application: web or stand-alone app to monitor and manage InterProScan. Simple and robust programming model. Mature and stable standard (current JMS version released in 2002). Guaranteed message delivery to a single worker. Easy to monitor. Flexible: easy to implement on multiple platforms.
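The master / queue / worker model described on this slide can be sketched with Python's standard library. This is not JMS and not InterProScan code: `queue.Queue` stands in for the broker's task queue, threads stand in for workers, and the "task" (measuring a sequence length) and the function name `run_master` are placeholders chosen for illustration.

```python
import queue
import threading


def run_master(tasks, n_workers=3):
    """Place tasks on a queue, let workers process them, and collect reports."""
    task_queue = queue.Queue()    # stand-in for the broker-managed task queue
    report_queue = queue.Queue()  # stand-in for reports sent back via the broker

    def worker(worker_id):
        while True:
            task = task_queue.get()      # each task is handed to exactly one worker
            if task is None:             # sentinel: no more work for this worker
                task_queue.task_done()
                break
            # "Perform" the task; here the placeholder work is measuring its length.
            report_queue.put((task, worker_id, len(task)))
            task_queue.task_done()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for thread in threads:
        thread.start()

    for task in tasks:                   # the master schedules the tasks
        task_queue.put(task)
    for _ in threads:                    # one stop sentinel per worker
        task_queue.put(None)

    task_queue.join()                    # wait until every task has been processed
    for thread in threads:
        thread.join()

    reports = []
    while not report_queue.empty():
        reports.append(report_queue.get())
    return reports


tasks = ["MAAEEG", "AEKYNVE", "GATAASASA"]
reports = run_master(tasks)
```

Because a queue hands each item to a single consumer, every task appears in exactly one report, which mirrors the "guaranteed message delivery to a single worker" point on the slide.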
27 Beta release functionality
28 Installation. Requirements: Java 1.6, Linux, Perl. Installation process: wget ftp://ftp.ebi.ac.uk/pub/software/unix/iprscan/i5 dist.tar.gz; tar xzf i5 dist.tar.gz; ready to use.
29 ./interproscan.sh -i test_proteins.fasta -o test_proteins.tsv -goterms
Default tab-separated values output:
A2YIW7 f927b0d241297dcc9a1c5990b58bf3c4 122 Pfam PF00085 Thioredoxin E 28 T IPR Thioredoxin domain Biological Process:cell redox homeostasis (GO: )
A2YIW7 f927b0d241297dcc9a1c5990b58bf3c4 122 ProSitePatterns PS00194 Thioredoxin family active site T IPR Thioredoxin, conserved site Biological Process:cell redox homeostasis (GO: )
A2YIW7 f927b0d241297dcc9a1c5990b58bf3c4 122 PIRSF PIRSF null E 27 T IPR Thioredoxin Molecular Function:protein disulfide oxidoreductase activity (GO: ), Biological Process:glycerol ether metabolic process (GO: ), Biological Process:cell redox homeostasis (GO: ), Molecular Function:electron carrier activity (GO: )
A2YIW7 f927b0d241297dcc9a1c5990b58bf3c4 122 PRINTS PR00421 Thioredoxin family signature T IPR Thioredoxin Molecular Function:protein disulfide oxidoreductase activity (GO: ), Biological Process:glycerol ether metabolic process (GO: ), Biological Process:cell redox homeostasis (GO: ), Molecular Function:electron carrier activity (GO: )
A2YIW7 f927b0d241297dcc9a1c5990b58bf3c4 122 PRINTS PR00421 Thioredoxin family signature T IPR Thioredoxin Molecular Function:protein disulfide oxidoreductase activity (GO: ), Biological Process:glycerol ether metabolic process (GO: ), Biological Process:cell redox homeostasis (GO: ), Molecular Function:electron carrier activity (GO: )
A2YIW7 f927b0d241297dcc9a1c5990b58bf3c4 122 PRINTS PR00421 Thioredoxin family signature T IPR Thioredoxin Molecular Function:protein disulfide oxidoreductase activity (GO: ), Biological Process:glycerol ether metabolic process (GO: ), Biological Process:cell redox homeostasis (GO: ), Molecular Function:electron carrier activity (GO: )
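Downstream pipelines usually read the tab-separated output one row at a time. The sketch below assumes the column order of the InterProScan 5 TSV format (accession, MD5, length, analysis, signature accession, signature description, start, stop, score, status, date, InterPro accession, InterPro description, GO terms); the beta output on this slide may differ. The `Match` class and `parse_tsv` function are hypothetical names, and the sample row reuses the Pfam values from the slide with clearly fake placeholders for the fields that are missing from the transcript.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Match:
    protein: str
    md5: str
    length: int
    analysis: str
    signature_ac: str
    signature_desc: str
    start: int
    stop: int
    score: str          # kept as text: may be an e-value or "-"
    status: str
    date: str
    interpro_ac: str    # optional trailing columns default to ""
    interpro_desc: str
    go_terms: str


def parse_tsv(handle):
    """Parse tab-separated match rows from a file-like object (assumed layout)."""
    matches = []
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row:
            continue
        row = row + [""] * (14 - len(row))  # pad missing optional columns
        matches.append(Match(
            row[0], row[1], int(row[2]), row[3], row[4], row[5],
            int(row[6]), int(row[7]), row[8], row[9], row[10],
            row[11], row[12], row[13],
        ))
    return matches


# Placeholder values (start, stop, score, date, IPR and GO identifiers) are NOT
# real data; they only fill the fields missing from the transcribed slide.
SAMPLE = "\t".join([
    "A2YIW7", "f927b0d241297dcc9a1c5990b58bf3c4", "122", "Pfam", "PF00085",
    "Thioredoxin", "1", "100", "1.0E-28", "T", "01-01-2011",
    "IPR000000", "Thioredoxin domain",
    "Biological Process:cell redox homeostasis (GO:0000000)",
]) + "\n"

matches = parse_tsv(io.StringIO(SAMPLE))
```

Keeping the score as text avoids guessing how each member database reports it; convert it only for analyses where it is known to be numeric.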
30 ./interproscan.sh -i test_proteins.fasta -o test_proteins.xml -goterms -F xml
XML output:
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<protein-matches xmlns="...">
  <protein>
    <sequence md5="f927b0d241297dcc9a1c5990b58bf3c4">MAAEEGVVIACHNKDEFDAQMTKAKEAGKVVIIDFTASWCGPCRFIAPVFAEYAKKFPGAVFLKVDVDELKEVAEKYNVEAMPTFLFIKDGAEADKVVGARKDDLQNTIVKHVGATAASASA</sequence>
    <xref id="a2yiw7"/>
    <matches>
      <fingerprints-match graphscan="iii" evalue=" e 7">
        <signature name="thioredoxin" desc="thioredoxin family signature" ac="pr00421">
          <models>
            <model name="thioredoxin" desc="thioredoxin family signature" ac="pr00421"/>
          </models>
          <signature-library-release version="41.1" library="prints"/>
        </signature>
        <locations>
          <fingerprints-location score="0.0" pvalue="0.0" motifnumber="3" end="48" start="39"/>
          <fingerprints-location score="0.0" pvalue="0.0" motifnumber="2" end="89" start="78"/>
          <fingerprints-location score="0.0" pvalue="0.0" motifnumber="1" end="39" start="31"/>
        </locations>
      </fingerprints-match>
      <hmmer2-match score="100.5" evalue=" INF">
        <signature name="thioredoxin" ac="pirsf000077">
          <models>
            <model name="thioredoxin" ac="pirsf000077"/>
          </models>
          <signature-library-release version="2.74" library="pirsf"/>
        </signature>
        <locations>
          <hmmer2-location hmm-length="0" hmm-end="108" hmm-start="1" evalue=" e 27" score="0.0" end="113" start="4"/>
        </locations>
      </hmmer2-match>
...etc
31 Pre-calculated match lookup: BerkeleyDB-backed REST web service; includes matches for all of UniParc (27 million sequences); 250 million matches; fast response; integrated into i5.
32 Other functionality: increased reliability; precalculated match lookup; configuration via a simple properties file; nucleotide sequences (getorf; map matches to nucleotide coordinates); pathway mapping (KEGG, Reactome, MetaCyc, UniPathway).
33 Future functionality: web service; interact directly with architecture (LAN, LSF, PBS, Sun Grid Engine); database persistence (Oracle, MySQL, Postgres, etc.); graphical output; other functionality: ask!
34 InterProScan 5 timeline. Beta release: August 2011 (InterProScan 4 still maintained). Full release: early 2012 (InterProScan 4 deprecated). interproscan-5-dev@googlegroups.com
35 Acknowledgements Team leader Developers Bioinformaticians Curators Sarah Hunter Matthew Fraser Anthony Quinn Phil Jones Craig McAnulla Alex Mitchell Sebastien Pesseat Maxim Scheremetjew Siew-Yit Yong Amaia Sangrador Any Questions Stand 302
36 Come and see us at booths 9 and 10! Job opportunities; PhD and postdoc positions; training in person and online; services; industry programme.
More information20.453J / 2.771J / HST.958J Biomedical Information Technology Fall 2008
MIT OpenCourseWare http://ocw.mit.edu 20.453J / 2.771J / HST.958J Biomedical Information Technology Fall 2008 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
More information8/19/13. Computational problems. Introduction to Algorithm
I519, Introduction to Introduction to Algorithm Yuzhen Ye (yye@indiana.edu) School of Informatics and Computing, IUB Computational problems A computational problem specifies an input-output relationship
More informationEditing Pathway/Genome Databases
Editing Pathway/Genome Databases By Ron Caspi ron.caspi@sri.com This presentation can be found at http://bioinformatics.ai.sri.com/ptools/tutorial/sessions/ curation/curation of genes, enzymes and Pathways/
More informationReactome Error! Bookmark not defined. Reactome Tools
Reactome This document introduces Reactome, the user interface and the database content. Further information can be found in the online Reactome user guide at http://www.reactome.org/userguide/usersguide.html.
More informationBiobtree: A tool to search, map and visualize bioinformatics identifiers and special keywords [version 1; referees: awaiting peer review]
SOFTWARE TOOL ARTICLE Biobtree: A tool to search, map and visualize bioinformatics identifiers and special keywords [version 1; referees: awaiting peer review] Tamer Gur European Bioinformatics Institute,
More informationThe genexplain platform. Workshop SW2: Pathway Analysis in Transcriptomics, Proteomics and Metabolomics
The genexplain platform Workshop SW2: Pathway Analysis in Transcriptomics, Proteomics and Metabolomics Saturday, March 17, 2012 2 genexplain GmbH Am Exer 10b D-38302 Wolfenbüttel Germany E-mail: olga.kel-margoulis@genexplain.com,
More informationBlast2GO Teaching Exercises SOLUTIONS
Blast2GO Teaching Exerces SOLUTIONS Ana Conesa and Stefan Götz 2012 BioBam Bioinformatics S.L. Valencia, Spain Contents 1 Annotate 10 sequences with Blast2GO 2 2 Perform a complete annotation with Blast2GO
More informationUsing the Distributed Annotation System
Using the Distributed Annotation System http://www.ebi.ac.uk Introduction This half day course is designed for those with a biological background that are relatively new to the use of the Distributed Annotation
More informationA distributed computation of Interpro Pfam, PROSITE and ProDom for protein annotation
E.O. Ribeiro et al. 590 A distributed computation of Interpro Pfam, PROSITE and ProDom for protein annotation Edward de O. Ribeiro¹, Gustavo G. Zerlotini¹, Irving R.M. Lopes¹, Victor B.R. Ribeiro¹, Alba
More informationGenome 559. Hidden Markov Models
Genome 559 Hidden Markov Models A simple HMM Eddy, Nat. Biotech, 2004 Notes Probability of a given a state path and output sequence is just product of emission/transition probabilities If state path is
More informationDeliverable D4.3 Release of pilot version of data warehouse
Deliverable D4.3 Release of pilot version of data warehouse Date: 10.05.17 HORIZON 2020 - INFRADEV Implementation and operation of cross-cutting services and solutions for clusters of ESFRI Grant Agreement
More informationMetaPhyler Usage Manual
MetaPhyler Usage Manual Bo Liu boliu@umiacs.umd.edu March 13, 2012 Contents 1 What is MetaPhyler 1 2 Installation 1 3 Quick Start 2 3.1 Taxonomic profiling for metagenomic sequences.............. 2 3.2
More informationBIOZON: a system for unification, management and analysis of heterogeneous biological data
BIOZON: a system for unification, management and analysis of heterogeneous biological data Aaron Birkland and Golan Yona Department of Computer Science Cornell University, Ithaca, NY 14853 Abstract Integration
More informationA Semantic Model for Federated Queries Over a Normalized Corpus
A Semantic Model for Federated Queries Over a Normalized Corpus Samuel Croset, Christoph Grabmüller, Dietrich Rebholz-Schuhmann 17 th March 2010, Hinxton EBI is an Outstation of the European Molecular
More information