12. Key features involved in building biological databases


Central to the discipline of bioinformatics is the need to store biological information systematically in structured databases. The first databases were really just simple formatted text files, organised so that particular records within them could be easily identified and linked. However, as the complexity and types of available biological data grew, and biologists wanted to ask more complex questions, database architectures also became more sophisticated. This topic guide looks at a range of aspects involved in creating biological databases, and at how these may have to evolve to meet future needs as new technologies give rise to new types of data.

On successful completion of this topic you will:
- understand key features involved in building biological databases (LO3).

To achieve a Pass in this unit you need to show that you can:
- summarise the design and management of biological databases (3.1)
- identify a range of records within a file (3.2)
- discuss the need for biological databases to store, organise and index basic biological processes (3.3)
- discuss the nature of new data available and the types of database and resources that might be used (3.4).

Key terms
Coding region: The portion of an mRNA sequence that is translatable into a polypeptide.
Flat-file database: A plain-text file containing a number of data entries that lack structured interrelationships.

1 Building biological databases

Complete genome sequencing was a major achievement. However, just amassing more data does not instantly make us more knowledgeable or provide miraculous understanding of the information we are collecting. Gaining biological and biomedical insights from raw genomic data is a huge undertaking. For example, as part of the process:
- genes must be located and their structures properly assembled
- coding regions must be translated
- functions must be assigned to genes and their products
- disease associations must be discovered, etc.

These tasks depend on using the right computational tools and finding the right balance between human and machine input.

Online access to databases gave scientists the ability to use public data in their own private research projects. Databases thus became invaluable as repositories of biological information. In addition, they are important because they allow logical connections to be made to related information in different resources via their annotations. Annotations are the intelligence or clues we attach to raw data to make them meaningful to, and reusable by, other researchers. For example, linear strings of nucleotide bases or amino acid residues are virtually useless on their own, but allied with information about their evolutionary relationships, biological functions, roles, interactions, disease associations, etc., they become elements of knowledge.

The evolution of biological flat-file databases

Database annotations add value to raw data, in principle allowing them to be reused quickly and conveniently. The more annotations there are, the richer the database content. The problem is that the more information is added, the greater the need for disciplined approaches to data archiving: if computers are to be able to access particular annotations reliably, the annotations must be stored in a structured way. This raises two questions: (i) what kinds of annotation are crucial, and (ii) how should they be organised?

We already saw that, for sequence data, adding notes about potential biological relationships, functions, roles, etc., is useful. To add further value, it is also helpful to add database-specific details, such as when the sequence was submitted and when the database entry was last updated. Links to relevant scientific literature (for example, to an article that describes the biological function of a sequence) and cross-references to information in related databases are also informative.

Structuring such information sensibly and systematically to facilitate computer access is challenging. Plain text, like the page you are reading now, is accessible to humans, but means nothing to computers. Nevertheless, the earliest biological databases were created as plain-text files, or flat-files. This meant that particular pieces of information had to be pinpointed with specific tags to help computers identify the types of data being stored in those parts of the file. Inevitably, several different flat-file formats evolved to store different types of biological data. Of these, one particular format became popular (because its structure was relatively simple) and was adapted for a variety of different data-types by a number of databases that are still in use today (e.g., Swiss-Prot, TrEMBL, PROSITE). This simple flat-file format was the one originally devised to store nucleotide sequences in the EMBL database. The following figure gives a flavour of how a flat-file database is constructed.

Figure: Creating a flat-file database. Numerous plain-text files, or flat-files, are appended to create a flat-file database. Each file, or database record, contains different data fields, each of which is identified by a specific tag. The zoomed-in section shows a variety of tags found in the EMBL flat-file format, exemplified with a range of fields typical of UniProtKB entries:

ID   A human-readable identifier
AC   A computer-readable code
DT   Date of creation of database record
DE   A descriptive title for the entry
GN   Gene name
OS   Organism source details
OC   Organism classification
RN   Cross-references to publications
CC   Description of the function, etc.
DR   Cross-links to related databases
KW   Keywords
FT   Table listing sequence features
SQ   Sequence details

Key terms
Rhodopsin: A light-sensitive biological pigment, found in the rod-shaped photoreceptor cells of the retinas of most vertebrates, that mediates vision in dim light; rhodopsin belongs to the superfamily of G protein-coupled receptors to which it gives its name.
PubMed: An online interface to millions of biomedical literature citations from the MEDLINE database, from life science journals, online books, etc.; PubMed is a service of the National Center for Biotechnology Information (NCBI).

Database fields and tags

The EMBL flat-file format uses a series of two-letter tags to describe the data stored on each line of the file, as shown in the figure illustrating a UniProtKB/Swiss-Prot entry. The file begins with an identifying (ID) code (here, OPSD_HUMAN) and an accession (AC) number (here, P08100). The AC number is designed for computers to read; the ID code is more meaningful to humans: in this case, OPSD_HUMAN denotes human rhodopsin. Together, the AC and ID codes specify a given database entry. In principle, the AC number is invariant, so that this sequence can always be tracked in any version of the database.

Other important pieces of information within the flat-file include:
- DT, the date a sequence entered the database and when changes were last made to its entry
- DE, the description or title of the stored entity (here, the protein rhodopsin)
- GN, the source gene name (here, RHO)
- OS, the source organism species (here, Homo sapiens)
- OC, a more precise specification of the organism classification (Eukaryota, Metazoa, Chordata, etc.).

In addition, the file includes bibliographic citations:
- RN is the reference number
- RP gives the subject
- RM is the literature database (PubMed) cross-reference
- RA lists the authors
- RL gives the place of publication.
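The tag-per-line layout described above lends itself to very simple parsing. The following is a minimal sketch, not the method used by any particular database: it assumes that every line starts with a two-letter tag followed by whitespace, and the function name parse_tagged_lines and the sample text are illustrative choices based on the rhodopsin entry.

```python
from collections import defaultdict

# A few lines modelled on the OPSD_HUMAN entry (spacing is an assumption).
SAMPLE_RECORD = """\
ID   OPSD_HUMAN     STANDARD;      PRT;   348 AA.
AC   P08100;
DE   RHODOPSIN.
GN   RHO.
OS   HOMO SAPIENS (HUMAN).
OC   EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;
OC   EUTHERIA; PRIMATES.
"""


def parse_tagged_lines(text):
    """Group the lines of one record by their two-letter tag.

    Hypothetical helper: returns a dict mapping each tag to the list of
    values found on lines carrying that tag, in file order.
    """
    fields = defaultdict(list)
    for line in text.splitlines():
        # Skip blank lines and the '//' record terminator.
        if len(line) < 2 or line.startswith("//"):
            continue
        tag, value = line[:2], line[2:].strip()
        fields[tag].append(value)
    return dict(fields)


record = parse_tagged_lines(SAMPLE_RECORD)
entry_id = record["ID"][0].split()[0]      # first token of the ID line
accession = record["AC"][0].rstrip(";")    # AC value without the trailing ';'
classification = " ".join(record["OC"])    # OC continues over several lines
print(entry_id, accession)
print(classification)
```

Keeping a list per tag matters because tags such as OC, DR and CC routinely span several lines of the same record.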

Figure: Illustration of the flat-file format of a UniProtKB/Swiss-Prot entry.

ID   OPSD_HUMAN STANDARD; PRT; 348 AA.
AC   P08100;
DT   01-AUG-1988 (REL. 08, CREATED)
DT   01-AUG-1988 (REL. 08, LAST SEQUENCE UPDATE)
DT   01-MAR-1992 (REL. 21, LAST ANNOTATION UPDATE)
DE   RHODOPSIN.
GN   RHO.
OS   HOMO SAPIENS (HUMAN).
OC   EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;
OC   EUTHERIA; PRIMATES.
RN   [1]
RP   SEQUENCE FROM N.A.
RM
RA   NATHANS J., HOGNESS D.S.;
RL   PROC. NATL. ACAD. SCI. U.S.A. 81: (1984).
RN   [3]
RP   VARIANTS RETINITIS PIGMENTOSA.
RM
RA   DRYJA T.P., HAHN L.B., COWLEY G.S., MCGEE T.L., BERSON E.L.;
RL   PROC. NATL. ACAD. SCI. U.S.A. 88: (1991).
CC   -!- FUNCTION: VISUAL PIGMENTS ARE THE LIGHT-ABSORBING MOLECULES THAT
CC       MEDIATE VISION. THEY CONSIST OF AN APOPROTEIN, OPSIN, COVALENTLY
CC       LINKED TO CIS-RETINAL.
CC   -!- TISSUE SPECIFICITY: ROD SHAPED PHOTORECEPTOR CELLS WHICH MEDIATES
CC       VISION IN DIM LIGHT.
CC   -!- DISEASE: AUTOSOMAL DOMINANT RETINITIS PIGMENTOSA CAN BE DUE TO A
CC       DEFECT IN RHO. PATIENTS TYPICALLY HAVE NIGHT VISION BLINDNESS AND
CC       LOSS OF MIDPERIPHERAL VISUAL FIELD; AS THEIR CONDITION PROGRESSES,
CC       THEY LOSE THEIR FAR PERIPHERAL VISUAL FIELD AND EVENTUALLY CENTRAL
CC       VISION AS WELL.
CC   -!- SIMILARITY: TO ALL OTHER G-PROTEIN COUPLED RECEPTORS. STRONGEST TO
CC       ALL OTHER OPSINS.
DR   EMBL; K02281; HSOPS.
DR   MIM; ; NINTH EDITION.
DR   PROSITE; PS00237; G_PROTEIN_RECEPTOR.
DR   PROSITE; PS00238; OPSIN.
KW   PHOTORECEPTOR; RETINAL PROTEIN; TRANSMEMBRANE; GLYCOPROTEIN; VISION;
KW   PHOSPHORYLATION; LIPOPROTEIN; G-PROTEIN COUPLED RECEPTOR; ACETYLATION;
KW   RETINITIS PIGMENTOSA.
FT   DOMAIN 1 36 EXTRACELLULAR.
FT   TRANSMEM
FT   DOMAIN CYTOPLASMIC.
FT   DOMAIN EXTRACELLULAR.
FT   TRANSMEM
FT   DOMAIN CYTOPLASMIC.
FT   MOD_RES 1 1 ACETYLATION (BY SIMILARITY).
FT   CARBOHYD 2 2 BY SIMILARITY.
FT   BINDING RETINAL CHROMOPHORE.
FT   LIPID PALMITATE (BY SIMILARITY).
FT   DISULFID BY SIMILARITY.
FT   VARIANT T -> M (IN RETINITIS PIGMENTOSA).
FT   VARIANT P -> S (IN RETINITIS PIGMENTOSA).
SQ   SEQUENCE 348 AA; MW; CN;
     MNGTEGPNFY VPFSNATGVV RSPFEYPQYY LAEPWQFSML AAYMFLLIVL GFPINFLTLY
     VTVQHKKLRT PLNYILLNLA VADLFMVLGG FTSTLYTSLH GYFVFGPTGC NLEGFFATLG
     GEIALWSLVV LAIERYVVVC KPMSNFRFGE NHAIMGVAFT WVMALACAAP PLAGWSRYIP
     EGLQCSCGID YYTLKPEVNN ESFVIYMFVV HFTIPMIIIF FCYGQLVFTV KEAAAQQQES
     ATTQKAEKEV TRMVIIMVIA FLICWVPYAS VAFYIFTHQG SNFGPIFMTI PAFFAKSAAI
     YNPVIYIMMN KQFRNCMLTT ICCGKNPLGD DEASATVSKT ETSQVAPA
//
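Because the entry states its own length (348 AA) alongside the sequence, a simple consistency check is possible. The sketch below is an illustration only: it assumes the length appears on the ID line as a number followed by "AA.", and the helper names declared_length and join_sequence are hypothetical.

```python
import re

ID_LINE = "ID   OPSD_HUMAN STANDARD; PRT; 348 AA."

# Sequence block copied from the entry, in space-separated groups of ten.
SEQUENCE_BLOCK = """
MNGTEGPNFY VPFSNATGVV RSPFEYPQYY LAEPWQFSML AAYMFLLIVL GFPINFLTLY
VTVQHKKLRT PLNYILLNLA VADLFMVLGG FTSTLYTSLH GYFVFGPTGC NLEGFFATLG
GEIALWSLVV LAIERYVVVC KPMSNFRFGE NHAIMGVAFT WVMALACAAP PLAGWSRYIP
EGLQCSCGID YYTLKPEVNN ESFVIYMFVV HFTIPMIIIF FCYGQLVFTV KEAAAQQQES
ATTQKAEKEV TRMVIIMVIA FLICWVPYAS VAFYIFTHQG SNFGPIFMTI PAFFAKSAAI
YNPVIYIMMN KQFRNCMLTT ICCGKNPLGD DEASATVSKT ETSQVAPA
"""

# The twenty standard amino acids in single-letter code.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def declared_length(id_line):
    """Extract the residue count announced on the ID line."""
    match = re.search(r"(\d+)\s+AA\.", id_line)
    if match is None:
        raise ValueError("no length found on ID line")
    return int(match.group(1))


def join_sequence(block):
    """Remove all whitespace so the sequence becomes one string."""
    return "".join(block.split())


sequence = join_sequence(SEQUENCE_BLOCK)
length_matches = len(sequence) == declared_length(ID_LINE)
unexpected = set(sequence) - AMINO_ACIDS
print(len(sequence), length_matches, sorted(unexpected))
```

Checks of this kind are one way in which structured fields help software detect truncated or corrupted records.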

Key terms
MIM or OMIM (Online Mendelian Inheritance in Man): A comprehensive database of human genes and genetic disorders.
Transmembrane domain: A hydrophobic segment of amino acids within a protein that crosses a membrane.
Single-letter code: Letters of the alphabet used to denote the amino acids (A for alanine, P for proline, V for valine, etc.).

A rich seam of annotation is stored in the comment (CC) field. In this example, we learn about the protein's function, tissue-specificity, disease associations, family relationships, etc. To facilitate swift computational processing of the file, many of the terms used here are also included as keywords (KW). Links to related information in other databases are made in the DR lines (here, to EMBL, MIM and PROSITE). Further enriching the entry, various characteristics of the sequence itself are documented in the Feature Table (FT): for example, here we learn about the locations of the protein's transmembrane (TM) domains and functional sites (lipid and carbohydrate attachment sites, other binding sites, etc.), and about its sequence variants. Finally, the sequence is stored in the SQ field using the single-letter code, together with attributes such as its length and molecular weight. The entry terminates with the // symbol.

Activity
The link shows the history of the sequence of human insulin from its earliest Swiss-Prot entry. Scroll down and click on the fifth version (5.txt).
- What is the function of the protein?
- With what disease is the protein associated?
- How many amino acid residues are there in the active hormone?

Database interoperability: indexing biological data

As we have already seen, storing information in databases is not very useful unless computers can access the data and help humans to interrogate and analyse the knowledge they contain. Achieving this requires adherence to standard data formats. For example, for protein and nucleotide sequences stored in the EMBL format, use of a common Feature Table format helps to improve data consistency and reliability, and facilitates database interoperation. Regulating the content, and the vocabulary and syntax used to describe the documented features, helps to ensure that the data can be readily accessed and manipulated by computer software.

The principal means by which computers access database information is via the entries' unique AC numbers and ID codes. This allows data from very different resources to be connected, whether from a nucleotide or protein sequence database, a protein family or structure database, a literature database, and so on. The more internal cross-references a database stores, the greater the web of connectivity that is possible from it.
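As a rough illustration of how AC numbers and DR lines can act as points of connectivity, the sketch below builds two in-memory indexes from a small flat-file. It is a simplified example under stated assumptions, not how SRS or any production system works: the semicolon-separated DR layout follows the rhodopsin entry, the second record reuses the illustrative ABCD_YEAST identifiers from the indexing figure, and the function names split_records and build_indexes are hypothetical.

```python
FLAT_FILE = """\
ID   OPSD_HUMAN
AC   P08100;
DR   EMBL; K02281; HSOPS.
DR   PROSITE; PS00237; G_PROTEIN_RECEPTOR.
DR   PROSITE; PS00238; OPSIN.
//
ID   ABCD_YEAST
AC   P02700;
DR   PDB; 1TIM.
DR   PROSITE; PS00500.
//
"""


def split_records(text):
    """Yield each '//'-terminated record as a list of its lines."""
    current = []
    for line in text.splitlines():
        if line.startswith("//"):
            if current:
                yield current
            current = []
        elif line.strip():
            current.append(line)


def build_indexes(text):
    """Return (AC -> record lines, (database, foreign AC) -> set of local ACs)."""
    by_accession = {}
    cross_refs = {}
    for lines in split_records(text):
        accession = None
        refs = []
        for line in lines:
            tag, value = line[:2], line[2:].strip()
            if tag == "AC" and accession is None:
                accession = value.split(";")[0].strip()
            elif tag == "DR":
                parts = [p.strip() for p in value.rstrip(".").split(";")]
                if len(parts) >= 2 and parts[1]:
                    refs.append((parts[0], parts[1]))
        if accession is None:
            continue  # a record without an AC cannot be indexed
        by_accession[accession] = lines
        for ref in refs:
            cross_refs.setdefault(ref, set()).add(accession)
    return by_accession, cross_refs


by_accession, cross_refs = build_indexes(FLAT_FILE)
# Which local entries point at PROSITE pattern PS00238?
print(cross_refs[("PROSITE", "PS00238")])
```

A query for a foreign accession then becomes a dictionary lookup rather than a scan of the whole file, which is the essence of indexing.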

Figure: Illustration of flat-file indexing. Data fields in flat-file databases can be linked via their two-letter tags. The main points of connectivity are the accession number (AC) or identifier (ID) tags. Here, literature (RP) and database cross-references (DR) in UniProtKB/Swiss-Prot are linked to MEDLINE via a PubMed ID (PMID), to the PDB via the PDB ID, to EMBL via the EMBL AC tag and to PROSITE via the PROSITE AC tag. Reciprocal links from EMBL and PROSITE link back to UniProtKB/Swiss-Prot via the Swiss-Prot AC tag.
[Figure panels: EMBL; UniProtKB/Swiss-Prot (ID ABCD_YEAST, AC P02700); PROSITE (AC PS00500, ATP-binding domain); PDB (ID 1TIM); MEDLINE]

Key terms
Flat-file index: An address or set of coordinates that allows query software to access specific parts of a flat-file database by means of designated tags.
Interoperability: The ability of software systems or databases to communicate or exchange information seamlessly (to interoperate) without restriction.
Relational database: A database in which data and their attributes are structured and stored in non-redundant tables in such a way as to facilitate information retrieval.
Next-generation sequencing: Low-cost, parallelised, high-throughput technology capable of producing thousands or millions of sequences simultaneously (for example, 454 pyrosequencing, Illumina (Solexa) and SOLiD sequencing).
Third-generation sequencing: Low-cost, single-molecule sequencing technology that aims to reduce the cost of sequencing a single human genome to US $1000 or less.

The first tool to exploit this fact was SRS, the Sequence Retrieval System. SRS is an information-indexing tool that allows any flat-file database to be indexed to any other, permitting highly specific queries across different databases via a single interface, irrespective of their underlying data-types. The figure on flat-file indexing illustrates how integrated access to diverse information across different flat-file databases is made possible via links to and from their entries' respective AC numbers and ID codes.

The need for relational database management systems

We have seen that flat-file databases can interoperate if they have been indexed or cross-referenced. This makes database queries fast and efficient, because they can be directed to specific parts of the file, rather than to the whole database(s). However, although this is effective for data integration, the approach is very brittle. Consider the role of AC numbers. If an AC number changes, its associated database entry suddenly becomes invisible to all resources to which it was formerly connected; to remain visible, all connected databases must incorporate the new AC number.

Owing to its ease of use, the flat-file format was popular for many years. In time, as the pace of data acquisition increased and the accompanying body of scientific literature grew, keeping database data and annotations up to date became more time-consuming (and error-prone, because much of the work was manual). This prompted the use of relational database systems to help structure data more formally: here, data are managed in tables in such a way that changes in one table can be readily propagated to others, easing data-management burdens. For more complex resources too, like data warehouses, removing redundancy between databases and ensuring data consistency are easier to achieve using relational systems.

The challenge now is not so much what we want to do with such systems today, but how they will need to adapt to future needs. The quantity of data that next- and third-generation sequencing technologies will produce is unprecedented, and will likely have a major impact on future database design.
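The idea that a change in one table can be propagated to others can be sketched with Python's built-in sqlite3 module. This is a toy schema under stated assumptions, not the design of any real biological database: the table and column names are invented for illustration, the replacement accession NEW_AC_1 is hypothetical, and propagation relies on SQLite's ON UPDATE CASCADE foreign-key action.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE entry (
    accession   TEXT PRIMARY KEY,
    identifier  TEXT NOT NULL,
    description TEXT
);
CREATE TABLE cross_reference (
    accession   TEXT NOT NULL
                REFERENCES entry(accession) ON UPDATE CASCADE,
    target_db   TEXT NOT NULL,
    target_id   TEXT NOT NULL
);
""")

conn.execute(
    "INSERT INTO entry VALUES (?, ?, ?)",
    ("P08100", "OPSD_HUMAN", "RHODOPSIN"),
)
conn.executemany(
    "INSERT INTO cross_reference VALUES (?, ?, ?)",
    [
        ("P08100", "EMBL", "K02281"),
        ("P08100", "PROSITE", "PS00237"),
        ("P08100", "PROSITE", "PS00238"),
    ],
)

# Change the accession once, in the parent table only.
conn.execute(
    "UPDATE entry SET accession = ? WHERE accession = ?",
    ("NEW_AC_1", "P08100"),  # NEW_AC_1 is a made-up value
)

# Every dependent row now carries the new accession automatically.
linked = conn.execute(
    "SELECT DISTINCT accession FROM cross_reference"
).fetchall()
row_count = conn.execute("SELECT COUNT(*) FROM cross_reference").fetchone()[0]
print(linked, row_count)
```

In a flat-file setting the same change would require editing every file that mentions the old accession, which is the brittleness described above.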

Take it further
More detailed information about flat-file database formats can be found in Chapter 3 of Introduction to Bioinformatics (Attwood and Parry-Smith, 1999), Prentice Hall. More detailed information on building biological databases, and the use of MySQL, can be found in Building Bioinformatics Solutions: with Perl, R and MySQL (Bessant, Shadforth and Oakley, 2008), OUP.

Link
Find out more about DNA and RNA structure and coding regions in Unit 7: Molecular biology and genetics.

Activity
Read the following news article describing the road to the US $1000 human genome from the National Human Genome Research Institute website. What is one type of innovation that is being explored in order to allow revolutionary 3Gen technologies to deliver the $1000 genome?

Further reading
Attwood, T. and Parry-Smith, D. (1999) Introduction to Bioinformatics, Prentice Hall. Chapter 3 contains more information about flat-file database formats.
Higgs, P. and Attwood, T. (2005) Bioinformatics and Molecular Evolution, Wiley-Blackwell. Refer to Chapter 5 for further details of biological databases.
Bessant, C., Shadforth, I. and Oakley, D. (2008) Building Bioinformatics Solutions: with Perl, R and MySQL, OUP. Contains more detailed information on building biological databases, and the use of MySQL.
Find out more about the road to the $1000 genome as described in the following news article from the National Human Genome Research Institute website.

Checklist
At the end of this topic guide you should be familiar with the following ideas about bioinformatics:
- the flat-file database (essentially a plain-text file) was the original means of managing raw sequence data
- the EMBL flat-file format was adopted by different databases because its structure made it easy to use and to adapt
- the EMBL format uses a series of two-letter tags to denote different database fields (ID and AC tags for the entry identifier and accession number, DE and GN tags for the descriptive title and gene name, CC for comments and FT for the Feature Table, etc.)
- tags allow the data in different parts of flat-file databases to be indexed, which allows them to be cross-linked to information in related databases
- flat-file databases are simple to understand but are brittle to changes in the data structure
- more sophisticated relational database management systems were devised to store and manage bioinformatics data more efficiently and more robustly.

Acknowledgements
The publisher would like to thank the following for their kind permission to reproduce their photographs: PhotoDisc: Lawrence Lawry. All other images © Pearson Education.
We are grateful to the following for permission to reproduce copyright material: Realia showing coding of a flat-file format of a UniProtKB/Swiss-Prot entry. Produced by UniProt Consortium. Used by permission.
In some instances we have been unable to trace the owners of copyright material, and we would appreciate any information that would enable us to do so.


More information

EBP. Accessing the Biomedical Literature for the Best Evidence

EBP. Accessing the Biomedical Literature for the Best Evidence Accessing the Biomedical Literature for the Best Evidence Structuring the search for information and evidence Basic search resources Starting the search EBP Lab / Practice: Simple searches Using PubMed

More information

Finding homologous sequences in databases

Finding homologous sequences in databases Finding homologous sequences in databases There are multiple algorithms to search sequences databases BLAST (EMBL, NCBI, DDBJ, local) FASTA (EMBL, local) For protein only databases scan via Smith-Waterman

More information

Taxonomically Clustering Organisms Based on the Profiles of Gene Sequences Using PCA

Taxonomically Clustering Organisms Based on the Profiles of Gene Sequences Using PCA Journal of Computer Science 2 (3): 292-296, 2006 ISSN 1549-3636 2006 Science Publications Taxonomically Clustering Organisms Based on the Profiles of Gene Sequences Using PCA 1 E.Ramaraj and 2 M.Punithavalli

More information

Enabling Open Science: Data Discoverability, Access and Use. Jo McEntyre Head of Literature Services

Enabling Open Science: Data Discoverability, Access and Use. Jo McEntyre Head of Literature Services Enabling Open Science: Data Discoverability, Access and Use Jo McEntyre Head of Literature Services www.ebi.ac.uk About EMBL-EBI Part of the European Molecular Biology Laboratory International, non-profit

More information

Automatic annotation in UniProtKB using UniRule, and Complete Proteomes. Wei Mun Chan

Automatic annotation in UniProtKB using UniRule, and Complete Proteomes. Wei Mun Chan Automatic annotation in UniProtKB using UniRule, and Complete Proteomes Wei Mun Chan Talk outline Introduction to UniProt UniProtKB annotation and propagation Data increase and the need for Automatic Annotation

More information

NGS NEXT GENERATION SEQUENCING

NGS NEXT GENERATION SEQUENCING NGS NEXT GENERATION SEQUENCING Paestum (Sa) 15-16 -17 maggio 2014 Relatore Dr Cataldo Senatore Dr.ssa Emilia Vaccaro Sanger Sequencing Reactions For given template DNA, it s like PCR except: Uses only

More information

Geneious 5.6 Quickstart Manual. Biomatters Ltd

Geneious 5.6 Quickstart Manual. Biomatters Ltd Geneious 5.6 Quickstart Manual Biomatters Ltd October 15, 2012 2 Introduction This quickstart manual will guide you through the features of Geneious 5.6 s interface and help you orient yourself. You should

More information

PFstats User Guide. Aspartate/ornithine carbamoyltransferase Case Study. Neli Fonseca

PFstats User Guide. Aspartate/ornithine carbamoyltransferase Case Study. Neli Fonseca PFstats User Guide Aspartate/ornithine carbamoyltransferase Case Study 1 Contents Overview 3 Obtaining An Alignment 3 Methods 4 Alignment Filtering............................................ 4 Reference

More information

AMNH Gerstner Scholars in Bioinformatics & Computational Biology Application Instructions

AMNH Gerstner Scholars in Bioinformatics & Computational Biology Application Instructions PURPOSE AMNH Gerstner Scholars in Bioinformatics & Computational Biology Application Instructions The seeks highly qualified applicants for its Gerstner postdoctoral fellowship program in Bioinformatics

More information

Information Retrieval, Information Extraction, and Text Mining Applications for Biology. Slides by Suleyman Cetintas & Luo Si

Information Retrieval, Information Extraction, and Text Mining Applications for Biology. Slides by Suleyman Cetintas & Luo Si Information Retrieval, Information Extraction, and Text Mining Applications for Biology Slides by Suleyman Cetintas & Luo Si 1 Outline Introduction Overview of Literature Data Sources PubMed, HighWire

More information

Structural Bioinformatics

Structural Bioinformatics Structural Bioinformatics Elucidation of the 3D structures of biomolecules. Analysis and comparison of biomolecular structures. Prediction of biomolecular recognition. Handles three-dimensional (3-D) structures.

More information

Proceedings of the Postgraduate Annual Research Seminar

Proceedings of the Postgraduate Annual Research Seminar Proceedings of the Postgraduate Annual Research Seminar 2006 202 Database Integration Approaches for Heterogeneous Biological Data Sources: An overview Iskandar Ishak, Naomie Salim Faculty of Computer

More information

CAP BIOINFORMATICS Su-Shing Chen CISE. 8/19/2005 Su-Shing Chen, CISE 1

CAP BIOINFORMATICS Su-Shing Chen CISE. 8/19/2005 Su-Shing Chen, CISE 1 CAP 5510-2 BIOINFORMATICS Su-Shing Chen CISE 8/19/2005 Su-Shing Chen, CISE 1 Building Local Genomic Databases Genomic research integrates sequence data with gene function knowledge. Gene ontology to represent

More information

Deliverable D4.3 Release of pilot version of data warehouse

Deliverable D4.3 Release of pilot version of data warehouse Deliverable D4.3 Release of pilot version of data warehouse Date: 10.05.17 HORIZON 2020 - INFRADEV Implementation and operation of cross-cutting services and solutions for clusters of ESFRI Grant Agreement

More information

Supplementary Note 1: Considerations About Data Integration

Supplementary Note 1: Considerations About Data Integration Supplementary Note 1: Considerations About Data Integration Considerations about curated data integration and inferred data integration mentha integrates high confidence interaction information curated

More information

The beginning of this guide offers a brief introduction to the Protein Data Bank, where users can download structure files.

The beginning of this guide offers a brief introduction to the Protein Data Bank, where users can download structure files. Structure Viewers Take a Class This guide supports the Galter Library class called Structure Viewers. See our Classes schedule for the next available offering. If this class is not on our upcoming schedule,

More information

Text mining tools for semantically enriching the scientific literature

Text mining tools for semantically enriching the scientific literature Text mining tools for semantically enriching the scientific literature Sophia Ananiadou Director National Centre for Text Mining School of Computer Science University of Manchester Need for enriching the

More information

Semi-Supervised Abstraction-Augmented String Kernel for bio-relationship Extraction

Semi-Supervised Abstraction-Augmented String Kernel for bio-relationship Extraction Semi-Supervised Abstraction-Augmented String Kernel for bio-relationship Extraction Pavel P. Kuksa, Rutgers University Yanjun Qi, Bing Bai, Ronan Collobert, NEC Labs Jason Weston, Google Research NY Vladimir

More information

Finding and Exporting Data. BioMart

Finding and Exporting Data. BioMart September 2017 Finding and Exporting Data Not sure what tool to use to find and export data? BioMart is used to retrieve data for complex queries, involving a few or many genes or even complete genomes.

More information

Protein Sequence Database

Protein Sequence Database Protein Sequence Database A protein is a large molecule manufactured in the cell of a living organism to carry out essential functions within the cell. The primary structure of a protein is a sequence

More information

HsAgilentDesign db

HsAgilentDesign db HsAgilentDesign026652.db January 16, 2019 HsAgilentDesign026652ACCNUM Map Manufacturer identifiers to Accession Numbers HsAgilentDesign026652ACCNUM is an R object that contains mappings between a manufacturer

More information

New generation of patent sequence databases Information Sources in Biotechnology Japan

New generation of patent sequence databases Information Sources in Biotechnology Japan New generation of patent sequence databases Information Sources in Biotechnology Japan EBI is an Outstation of the European Molecular Biology Laboratory. Patent-related resources Patents Patent Resources

More information

Dynamic Programming User Manual v1.0 Anton E. Weisstein, Truman State University Aug. 19, 2014

Dynamic Programming User Manual v1.0 Anton E. Weisstein, Truman State University Aug. 19, 2014 Dynamic Programming User Manual v1.0 Anton E. Weisstein, Truman State University Aug. 19, 2014 Dynamic programming is a group of mathematical methods used to sequentially split a complicated problem into

More information

Human Disease Models Tutorial

Human Disease Models Tutorial Mouse Genome Informatics www.informatics.jax.org The fundamental mission of the Mouse Genome Informatics resource is to facilitate the use of mouse as a model system for understanding human biology and

More information

What is Internet COMPUTER NETWORKS AND NETWORK-BASED BIOINFORMATICS RESOURCES

What is Internet COMPUTER NETWORKS AND NETWORK-BASED BIOINFORMATICS RESOURCES What is Internet COMPUTER NETWORKS AND NETWORK-BASED BIOINFORMATICS RESOURCES Global Internet DNS Internet IP Internet Domain Name System Domain Name System The Domain Name System (DNS) is a hierarchical,

More information

Introduction to the Protein Data Bank Master Chimie Info Roland Stote Page #

Introduction to the Protein Data Bank Master Chimie Info Roland Stote Page # Introduction to the Protein Data Bank Master Chimie Info - 2009 Roland Stote The purpose of the Protein Data Bank is to collect and organize 3D structures of proteins, nucleic acids, protein-nucleic acid

More information

hgu133plus2.db December 11, 2017

hgu133plus2.db December 11, 2017 hgu133plus2.db December 11, 2017 hgu133plus2accnum Map Manufacturer identifiers to Accession Numbers hgu133plus2accnum is an R object that contains mappings between a manufacturer s identifiers and manufacturers

More information

Managing Your Biological Data with Python

Managing Your Biological Data with Python Chapman & Hall/CRC Mathematical and Computational Biology Series Managing Your Biological Data with Python Ailegra Via Kristian Rother Anna Tramontano CRC Press Taylor & Francis Group Boca Raton London

More information

Medical Informatics Databases Databases Databases Databases

Medical Informatics Databases Databases Databases Databases Medical Informatics Prof. Dr. Nizamettin AYDIN naydin@yildiz.edu.tr http://www.yildiz.edu.tr/~naydin 1 2 Computers serve four interdependent functions in biomedical informatics: communications, computation,

More information

Maximizing the Value of STM Content through Semantic Enrichment. Frank Stumpf December 1, 2009

Maximizing the Value of STM Content through Semantic Enrichment. Frank Stumpf December 1, 2009 Maximizing the Value of STM Content through Semantic Enrichment Frank Stumpf December 1, 2009 What is Semantics and Semantic Processing? Content Knowledge Framework Technology Framework Search Text Images

More information

Yutaka Ueno Neuroscience, AIST Tsukuba, Japan

Yutaka Ueno Neuroscience, AIST Tsukuba, Japan Yutaka Ueno Neuroscience, AIST Tsukuba, Japan Lua is good in Molecular biology for: 1. programming tasks 2. database management tasks 3. development of algorithms Current Projects 1. sequence annotation

More information

An Algebra for Protein Structure Data

An Algebra for Protein Structure Data An Algebra for Protein Structure Data Yanchao Wang, and Rajshekhar Sunderraman Abstract This paper presents an algebraic approach to optimize queries in domain-specific database management system for protein

More information

efip online Help Document

efip online Help Document efip online Help Document University of Delaware Computer and Information Sciences & Center for Bioinformatics and Computational Biology Newark, DE, USA December 2013 K K S I K K Table of Contents INTRODUCTION...

More information

Software review. Biomolecular Interaction Network Database

Software review. Biomolecular Interaction Network Database Biomolecular Interaction Network Database Keywords: protein interactions, visualisation, biology data integration, web access Abstract This software review looks at the utility of the Biomolecular Interaction

More information

Medical Center Library & Archives

Medical Center Library & Archives Medical Center Library & Archives October 1, 2016 The Medical Center Library welcomes you to the Duke community! We would like to take a moment to tell you about some of the tremendous number of services

More information

Wilson Leung 01/03/2018 An Introduction to NCBI BLAST. Prerequisites: Detecting and Interpreting Genetic Homology: Lecture Notes on Alignment

Wilson Leung 01/03/2018 An Introduction to NCBI BLAST. Prerequisites: Detecting and Interpreting Genetic Homology: Lecture Notes on Alignment An Introduction to NCBI BLAST Prerequisites: Detecting and Interpreting Genetic Homology: Lecture Notes on Alignment Resources: The BLAST web server is available at https://blast.ncbi.nlm.nih.gov/blast.cgi

More information

BIFS 617 Dr. Alkharouf. Topics. Parsing GenBank Files. More regular expression modifiers. /m /s

BIFS 617 Dr. Alkharouf. Topics. Parsing GenBank Files. More regular expression modifiers. /m /s Parsing GenBank Files BIFS 617 Dr. Alkharouf 1 Parsing GenBank Files Topics More regular expression modifiers /m /s 2 1 Parsing GenBank Libraries Parsing = systematically taking apart some unstructured

More information

XML in the bipharmaceutical

XML in the bipharmaceutical XML in the bipharmaceutical sector XML holds out the opportunity to integrate data across both the enterprise and the network of biopharmaceutical alliances - with little technological dislocation and

More information

This document contains information about the annotation workflow for the Full BioCreative interactive task.

This document contains information about the annotation workflow for the Full BioCreative interactive task. BioCreative IV-User Interactive Task RLIMS-P Annotation Task This document contains information about the annotation workflow for the Full BioCreative interactive task. Annotation Workflow using RLIMS-P

More information

Patterns / Regular expressions

Patterns / Regular expressions Sequence bioinformatics http://bio.lundberg.gu.se/courses/ht07/bio2/ Perl programming (GK) Hidden Markov Models (MO) Methods and applications - Algorithms of sequence alignment, BLAST, multiple alignments

More information

Important Example: Gene Sequence Matching. Corrigiendum. Central Dogma of Modern Biology. Genetics. How Nucleotides code for Amino Acids

Important Example: Gene Sequence Matching. Corrigiendum. Central Dogma of Modern Biology. Genetics. How Nucleotides code for Amino Acids Important Example: Gene Sequence Matching Century of Biology Two views of computer science s relationship to biology: Bioinformatics: computational methods to help discover new biology from lots of data

More information

SEBI: An Architecture for Biomedical Image Discovery, Interoperability and Reusability based on Semantic Enrichment

SEBI: An Architecture for Biomedical Image Discovery, Interoperability and Reusability based on Semantic Enrichment SEBI: An Architecture for Biomedical Image Discovery, Interoperability and Reusability based on Semantic Enrichment Ahmad C. Bukhari 1, Michael Krauthammer 2, Christopher J.O. Baker 1 1 Department of Computer

More information

Sequence Variation Database Project at the European Bioinformatics Institute

Sequence Variation Database Project at the European Bioinformatics Institute 52 LEHVÄ SLAIHO ET AL. HUMAN MUTATION 15:52 56 (2000) MDI SPECIAL ARTICLE Sequence Variation Database Project at the European Bioinformatics Institute Heikki Lehväslaiho,* Elia Stupka, and Michael Ashburner

More information

HIDDEN MARKOV MODELS AND SEQUENCE ALIGNMENT

HIDDEN MARKOV MODELS AND SEQUENCE ALIGNMENT HIDDEN MARKOV MODELS AND SEQUENCE ALIGNMENT - Swarbhanu Chatterjee. Hidden Markov models are a sophisticated and flexible statistical tool for the study of protein models. Using HMMs to analyze proteins

More information

Querying Multiple Bioinformatics Information Sources: Can Semantic Web Research Help?

Querying Multiple Bioinformatics Information Sources: Can Semantic Web Research Help? Querying Multiple Bioinformatics Information Sources: Can Semantic Web Research Help? David Buttler, Matthew Coleman 1, Terence Critchlow 1, Renato Fileto, Wei Han, Ling Liu, Calton Pu, Daniel Rocco, Li

More information

Measuring inter-annotator agreement in GO annotations

Measuring inter-annotator agreement in GO annotations Measuring inter-annotator agreement in GO annotations Camon EB, Barrell DG, Dimmer EC, Lee V, Magrane M, Maslen J, Binns ns D, Apweiler R. An evaluation of GO annotation retrieval for BioCreAtIvE and GOA.

More information

Protein Data Bank Japan

Protein Data Bank Japan Protein Data Bank Japan http://www.pdbj.org/ PDBj Today gene information for many species is just at the point of being revealed. To make use of this information, it is necessary to look at the proteins

More information

Turning Text into Insight: Text Mining in the Life Sciences WHITEPAPER

Turning Text into Insight: Text Mining in the Life Sciences WHITEPAPER Turning Text into Insight: Text Mining in the Life Sciences WHITEPAPER According to The STM Report (2015), 2.5 million peer-reviewed articles are published in scholarly journals each year. 1 PubMed contains

More information

Introduction to Genome Browsers

Introduction to Genome Browsers Introduction to Genome Browsers Rolando Garcia-Milian, MLS, AHIP (Rolando.milian@ufl.edu) Department of Biomedical and Health Information Services Health Sciences Center Libraries, University of Florida

More information

EMBL-EBI Patent Services

EMBL-EBI Patent Services EMBL-EBI Patent Services 5 th Annual Forum for SMEs October 6-7 th 2011 Jennifer McDowall EBI is an Outstation of the European Molecular Biology Laboratory. Patent resources at EBI 2 http://www.ebi.ac.uk/patentdata/

More information

VirusPKT: A Search Tool For Assimilating Assorted Acquaintance For Viruses

VirusPKT: A Search Tool For Assimilating Assorted Acquaintance For Viruses VirusPKT: A Search Tool For Assimilating Assorted Acquaintance For Viruses Jayanthi Manicassamy Department of Computer Science Pondicherry University Pondicherry, India. jmanic2@yahoo.com P. Dhavachelvan

More information

1. HPC & I/O 2. BioPerl

1. HPC & I/O 2. BioPerl 1. HPC & I/O 2. BioPerl A simplified picture of the system User machines Login server(s) jhpce01.jhsph.edu jhpce02.jhsph.edu 72 nodes ~3000 cores compute farm direct attached storage Research network

More information

Exercises. Biological Data Analysis Using InterMine workshop exercises with answers

Exercises. Biological Data Analysis Using InterMine workshop exercises with answers Exercises Biological Data Analysis Using InterMine workshop exercises with answers Exercise1: Faceted Search Use HumanMine for this exercise 1. Search for one or more of the following using the keyword

More information

What is Text Mining? Sophia Ananiadou National Centre for Text Mining University of Manchester

What is Text Mining? Sophia Ananiadou National Centre for Text Mining   University of Manchester National Centre for Text Mining www.nactem.ac.uk University of Manchester Outline Aims of text mining Text Mining steps Text Mining uses Applications 2 Aims Extract and discover knowledge hidden in text

More information

PubMed Assistant: A Biologist-Friendly Interface for Enhanced PubMed Search

PubMed Assistant: A Biologist-Friendly Interface for Enhanced PubMed Search Bioinformatics (2006), accepted. PubMed Assistant: A Biologist-Friendly Interface for Enhanced PubMed Search Jing Ding Department of Electrical and Computer Engineering, Iowa State University, Ames, IA

More information

Exploring the Generation and Integration of Publishable Scientific Facts Using the Concept of Nano-publications

Exploring the Generation and Integration of Publishable Scientific Facts Using the Concept of Nano-publications Exploring the Generation and Integration of Publishable Scientific Facts Using the Concept of Nano-publications Amanda Clare 1,3, Samuel Croset 2,3 (croset@ebi.ac.uk), Christoph Grabmueller 2,3, Senay

More information

Lane Medical Library Stanford University Medical Center

Lane Medical Library Stanford University Medical Center Lane Medical Library Stanford University Medical Center http://lane.stanford.edu LaneAskUs@Stanford.edu 650.723.6831 PubMed: A Quick Guide PubMed: (connect from Lane Library s webpage, http://lane.stanford.edu/

More information

SciVerse Scopus. 1. Scopus introduction and content coverage. 2. Scopus in comparison with Web of Science. 3. Basic functionalities of Scopus

SciVerse Scopus. 1. Scopus introduction and content coverage. 2. Scopus in comparison with Web of Science. 3. Basic functionalities of Scopus Prepared by: Jawad Sayadi Account Manager, United Kingdom Elsevier BV Radarweg 29 1043 NX Amsterdam The Netherlands J.Sayadi@elsevier.com SciVerse Scopus SciVerse Scopus 1. Scopus introduction and content

More information