The Ensembl API. What is the API? November 7, European Bioinformatics Institute (EBI) Hinxton, Cambridge, UK

Similar documents
Ensembl Core API. EMBL European Bioinformatics Institute Wellcome Trust Genome Campus Hinxton, Cambridge, CB10 1SD, UK

Two Examples of Datanomic. David Du Digital Technology Center Intelligent Storage Consortium University of Minnesota

Tutorial 1: Exploring the UCSC Genome Browser

Genome Browsers Guide

Browser Exercises - I. Alignments and Comparative genomics

Genome Browsers - The UCSC Genome Browser

Genome Environment Browser (GEB) user guide

Finding and Exporting Data. BioMart

How to use KAIKObase Version 3.1.0

Exercise 2: Browser-Based Annotation and RNA-Seq Data

ChIP-seq (NGS) Data Formats

BovineMine Documentation

Topics of the talk. Biodatabases. Data types. Some sequence terminology...

Identiyfing splice junctions from RNA-Seq data

Introduction to Genome Browsers

HymenopteraMine Documentation

Exon Probeset Annotations and Transcript Cluster Groupings

Wilson Leung 01/03/2018 An Introduction to NCBI BLAST. Prerequisites: Detecting and Interpreting Genetic Homology: Lecture Notes on Alignment

Generating and using Ensembl based annotation packages

Advanced UCSC Browser Functions

The UCSC Gene Sorter, Table Browser & Custom Tracks

pyensembl Documentation

Preliminary Syllabus. Genomics. Introduction & Genome Assembly Sequence Comparison Gene Modeling Gene Function Identification

Building and Using Ensembl Based Annotation Packages with ensembldb

Bioinformatics Hubs on the Web

Genomic Analysis with Genome Browsers.

When we search a nucleic acid databases, there is no need for you to carry out your own six frame translation. Mascot always performs a 6 frame

Public Repositories Tutorial: Bulk Downloads

Sequence Analysis Pipeline

Perl for Biologists. Object Oriented Programming and BioPERL. Session 10 May 14, Jaroslaw Pillardy

epigenomegateway.wustl.edu

BLAST Exercise 2: Using mrna and EST Evidence in Annotation Adapted by W. Leung and SCR Elgin from Annotation Using mrna and ESTs by Dr. J.

Annotation resources - ensembldb

The UCSC Genome Browser

Wilson Leung 05/27/2008 A Simple Introduction to NCBI BLAST

Introduc)on to annota)on with Artemis. Download presenta.on and data

Creating and Using Genome Assemblies Tutorial

m6aviewer Version Documentation

How to use earray to create custom content for the SureSelect Target Enrichment platform. Page 1

Ensembl RNASeq Practical. Overview

Sequence Alignment. GBIO0002 Archana Bhardwaj University of Liege

General Help & Instructions to use with Examples

Chen lab workshop. Christian Frech

Giri Narasimhan. CAP 5510: Introduction to Bioinformatics. ECS 254; Phone: x3748

Bioinformatics in next generation sequencing projects

VectorBase Web Apollo April Web Apollo 1

mpmorfsdb: A database of Molecular Recognition Features (MoRFs) in membrane proteins. Introduction

MetaPhyler Usage Manual

Using the UCSC genome browser

Finishing Circular Assemblies. J Fass UCD Genome Center Bioinformatics Core Thursday April 16, 2015

Advanced genome browsers: Integrated Genome Browser and others Heiko Muller Computational Research

ALEXA-Seq ( User Manual (v.1.6)

Genomic Locus Operation - DataBase version 1.0

The QoRTs Analysis Pipeline Example Walkthrough

The UCSC Genome Browser

ChIP-Seq Tutorial on Galaxy

Tutorial for Windows and Macintosh. Trimming Sequence Gene Codes Corporation

CisGenome User s Manual

Eval: A Gene Set Comparison System

MacVector for Mac OS X

Fusion Detection Using QIAseq RNAscan Panels

SAVANT v1.1.0 documentation CONTENTS

Documentation contents

Package biomart. April 11, 2018

Goal: General Process of Annotating Bee Genes Using Apollo

CLC Server. End User USER MANUAL

From genomic regions to biology

Getting Started. April Strand Life Sciences, Inc All rights reserved.

Comparative Sequencing

2. Take a few minutes to look around the site. The goal is to familiarize yourself with a few key components of the NCBI.

Package ensemblvep. April 5, 2014

Part 1: How to use IGV to visualize variants

Tutorial 1: Using Excel to find unique values in a list

Annotating sequences in batch

Geneious 5.6 Quickstart Manual. Biomatters Ltd

Rsubread package: high-performance read alignment, quantification and mutation discovery

Analysing High Throughput Sequencing Data with SeqMonk

Genomics 92 (2008) Contents lists available at ScienceDirect. Genomics. journal homepage:

Package biomart. January 20, 2019

featuredb storing and querying genomic annotation

Lecture 12. Short read aligners

Implementing a Relational Database

Rsubread package: high-performance read alignment, quantification and mutation discovery

Our data for today is a small subset of Saimaa ringed seal RNA sequencing data (RNA_seq_reads.fasta). Let s first see how many reads are there:

Useful software utilities for computational genomics. Shamith Samarajiwa CRUK Autumn School in Bioinformatics September 2017

RNA-Seq data analysis software. User Guide 023UG050V0200

Tutorial: Building up and querying databases

Sequencing Data. Paul Agapow 2011/02/03

NGS Data Analysis. Roberto Preste

2) NCBI BLAST tutorial This is a users guide written by the education department at NCBI.

Assessing Transcriptome Assembly

Chromatin immunoprecipitation sequencing (ChIP-Seq) on the SOLiD system Nature Methods 6, (2009)

RASER: Reads Aligner for SNPs and Editing sites of RNA (version 0.51) Manual

MIRING: Minimum Information for Reporting Immunogenomic NGS Genotyping. Data Standards Hackathon for NGS HACKATHON 1.0 Bethesda, MD September

Design and Annotation Files

Multiple Sequence Alignments

Object Oriented Programming and Perl

What is bioperl. What Bioperl can do

Services Performed. The following checklist confirms the steps of the RNA-Seq Service that were performed on your samples.

David Crossman, Ph.D. UAB Heflin Center for Genomic Science. GCC2012 Wednesday, July 25, 2012

Transcription:

The Ensembl API European Bioinformatics Institute (EBI) Hinxton, Cambridge, UK 8 Nov 2006 1/51 What is the API? The Ensembl API (application programming interface) is a framework for applications that need to access or store data in Ensembl's databases. 8 Nov 2006 2/51 1

System Context Other Scripts & Applications Apollo www Pipeline Java API (EnsJ) Perl API MartShell MartView Ensembl DBs Mart DB 8 Nov 2006 3/51 The Perl API Written in Object-Oriented Perl. Used to retrieve data from and to store data in Ensembl databases. Foundation for the Ensembl Pipeline and Ensembl Web interface. 8 Nov 2006 4/51 2

Why use an API? Uniform method of access to the data. Avoid writing the same thing twice: reusable in different systems. Reliable: lots of hard work, testing and optimisation already done. Insulates developers to underlying changes at a lower level (i.e. the database). 8 Nov 2006 5/51 Main object types Data objects Object adaptors Database adaptors (Registry) 8 Nov 2006 6/51 3

Data Objects Information is obtained from the API in the form of Data Objects. A Data Object represents a piece of data that is (or can be) stored in the database. Gene Transcript Marker Exon Exon 8 Nov 2006 7/51 Data Objects Code Example # print out the start, end and strand of a transcript print $transcript->start, '-', $transcript->end, '(',$transcript->strand, ")\n"; # print out the stable identifier for an exon print $exon->stable_id, "\n"; # print out the name of a marker and its primer sequences print $marker->display_marker_synonym->name, "\n"; print "left primer: ", $marker->left_primer, "\n"; print "right primer: ", $marker->right_primer, "\n"; # set the start and end of a simple feature $simple_feature->start(10); $simple_feature->end(100); 8 Nov 2006 8/51 4

Object Adaptors Object Adaptors are factories for Data Objects. Data Objects are retrieved from and stored in the database using Object Adaptors. Each Object Adaptor is responsible for creating objects of only one particular type. Data Adaptor fetch, store, and remove methods are used to retrieve, save, and delete information in the database. 8 Nov 2006 9/51 Object Adaptors Code Example # fetch a gene by its internal identifier $gene = $gene_adaptor->fetch_by_dbid(1234); # fetch a gene by its stable identifier $gene = $gene_adaptor->fetch_by_stable_id('ensg0000005038'); # store a transcript in the database $transcript_adaptor->store($transcript); # remove an exon from the database $exon_adaptor->remove($exon); # get all transcripts having a specific interpro domain @transcripts = @{$transcript_adaptor->fetch_all_by_domain('ipr000980')}; 8 Nov 2006 10/51 5

The Registry Is a central store of all the adaptors Automatically adds Adaptors on creation Adaptors can be retrieved by using the species, database type and Adaptor type. Enables you to specify configuration scripts Very useful for multi species scripts. 8 Nov 2006 11/51 load_registry_from_db Loads all the databases for the correct version of this API Ensures you use the correct API and Database Lazy loads so no database connections are made until requested Use -verbose option to print out the databases loaded 8 Nov 2006 12/51 6

Registry Usage use Bio::EnsEMBL::Registry; my $reg = "Bio::EnsEMBL::Registry"; $reg->load_registry_from_db( -host => 'ensembldb.ensembl.org', -user => 'anonymous'); $slice_adaptor = $reg->get_adaptor("human", "core", "Slice"); 8 Nov 2006 13/51 Getting More Information Inline POD Perldoc viewer for inline API documentation: http://www.ensembl.org/info/software/core/index.html Tutorial document: http://www.ensembl.org/info/software/core/core_tutorial.html ensembl-dev mailing list: <ensembl-dev@ebi.ac.uk> 8 Nov 2006 14/51 7

Question 1 a) Use the function load_registry_from_db to load all the databases and print the names of the databases found? hint: use Perldoc to find the methods to do this b) What is the database name fro the human core database? hint: use the Registry's get_adaptor method c) What other core databases are there? 8 Nov 2006 15/51 Perl Debugger $ perl d 1.pl # start the perl script in debug mode b 12 # breakpoint/stop at line 12 l # list script x $object # examine the data in $object x 2 $object # examine object down to 2 levels m $object # what methods are available r # run the script s # step to next line following # methods n # step to next line in this script 8 Nov 2006 16/51 8

Question 2 a) Fetch the sequence of the first 10MB of chromosome 20 and write the sequence of the chromosome Slice to file in FASTA format. hint: Slice objects inherit from Bio::Seq so can be written to file easily using Bio::SeqIO, ie: my $output = Bio::SeqIO->new( -file=>'>filename.fasta', -format=>'fasta'); $output->write_seq($slice); b) Use the identifier 'ENSG00000101266' to fetch a slice surrounding this gene with 2kb of flanking sequence. c) Print out information about your slices, include the name of the coordinate system, the name of the sequence, the start and end coordinates of the slice and strand. d) Determine the number of genes on the first 10MB of chromosome 20. 8 Nov 2006 17/51 Coordinate Systems Ensembl stores features and sequence in a number of coordinate systems. Example coordinate systems: chromosome, clone, scaffold, contig Chromosome CoordSystem 1, 2,..., X, Y Clone CoordSystem AC105024, AC107993, etc. 8 Nov 2006 18/51 9

Coordinate Systems Regions in one coordinate system may be constructed from a tiling path of regions from another coordinate system. Gap Chromosome 17 BAC clones 8 Nov 2006 19/51 Coordinate Systems Code Example # get a coordinate system adaptor $csa = Bio::EnsEMBL::Registry->get_adaptor('human', 'core', 'coordsystem'); # print out all of the coord systems in the db $coord_systems = $csa->fetch_all; foreach $cs (@$coord_systems) { print $cs->name, ' ',$cs->version, "\n"; } Sample output: chromosome NCBI36 supercontig clone contig 8 Nov 2006 20/51 10

Slices A Slice Data Object represents an arbitrary region of a genome. Slices are not directly stored in the database. A Slice is used to request sequence or features from a specific region in a specific coordinate system. chr20 Clone AC022035 8 Nov 2006 21/51 Slices Code Example # get the slice adaptor $slice_adaptor = $reg->get_adaptor('human', 'core', 'slice'); # fetch a slice on a region of chromosome 12 $slice = $slice_adaptor->fetch_by_region('chromosome', '12', 1e6, 2e6); # print out the sequence from this region print $slice->seq, "\n"; # get all clones in the database and print out their names @slices = @{ $slice_adaptor->fetch_all('clone') }; foreach $slice (@slices) { print $slice->seq_region_name, "\n"; } 8 Nov 2006 22/51 11

Features Features are Data Objects with associated genomic locations. All Features have start, end, strand and slice attributes. Features are retrieved from Object Adaptors using limiting criteria such as identifiers or regions (slices). chr20 Clone AC022035 8 Nov 2006 23/51 Features in the database Gene Transcript Exon PredictionTranscript PredictionExon DnaAlignFeature ProteinAlignFeature SimpleFeature MarkerFeature QtlFeature MiscFeature KaryotypeBand RepeatFeature AssemblyExceptionFeature DensityFeature 8 Nov 2006 24/51 12

A Complete Code Example use Bio::EnsEMBL::Registry; my $reg = "Bio::EnsEMBL::Registry"; $reg->load_registry_from_db( -host => 'ensembldb.ensembl.org', -user => 'anonymous'); $slice_ad = $db->get_adaptor('human', 'core', 'slice'); $slice = $slice_ad->fetch_by_region('chromosome', 'X', 1e6, 10e6); foreach $sf (@{ $slice->get_all_simplefeatures }) { $start = $sf->start; $end = $sf->end; $strand = $sf->strand; $score = $sf->score; print "$start-$end ($strand) $score\n"; } 8 Nov 2006 25/51 Question 3 a) Get all the repeat features from chromosome 20:1-500k. Print out the name and position of each and the total number. b) Get a protein alignment feature with the db identifier 34425. What is the analysis program that produced this alignment and what are the two sequences called. Hint: use a ProteinAlignFeatureAdaptor print the results of fetch_by_dbid 8 Nov 2006 26/51 13

Genes, Transcripts, Exons Genes, Transcripts and Exons are objects that can be used just like any other Feature object. A Gene is a set of alternatively spliced Transcripts. A Transcript is a set of Exons. 8 Nov 2006 27/51 Genes, Transcripts, Exons Code Example # fetch a gene by its stable identifier $gene = $gene_adaptor->fetch_by_stable_id('ensg00000123427'); # print out the gene, its transcripts, and its exons print "Gene: ", get_string($gene), "\n"; foreach $transcript (@{ $gene->get_all_transcripts }) { print " Transcript: ", get_string($transcript), "\n"; foreach $exon (@{ $transcript->get_all_exons }) { print " Exon: ", get_string($exon), "\n"; } } # helper function: returns location and stable_id string sub get_string { $feature = shift; $stable_id = $feature->stable_id; $seq_region = $feature->slice->seq_region_name; $start = $feature->start; $end = $feature->end; $strand = $feature->strand; return "$stable_id $seq_region:$start-$end($strand)"; 8 Nov 2006 28/51 } 14

Genes, Transcripts, Exons Sample Output Gene: ENSG00000123427 12:56452650-56462590(1) Transcript: ENST00000300209 12:56452650-56462590(1) Exon: ENSE00001180238 12:56452650-56452951(1) Exon: ENSE00001354204 12:56453067-56453178(1) Exon: ENSE00001354199 12:56460305-56462590(1) Transcript: ENST00000337139 12:56454741-56462451(1) Exon: ENSE00000838990 12:56454741-56454812(1) Exon: ENSE00001246483 12:56454943-56454949(1) Exon: ENSE00001141252 12:56460305-56462451(1) Transcript: ENST00000333012 12:56452728-56462590(1) Exon: ENSE00000838993 12:56452728-56452951(1) Exon: ENSE00001354204 12:56453067-56453178(1) Exon: ENSE00001246506 12:56454679-56454817(1) Exon: ENSE00001336722 12:56460305-56462590(1) 8 Nov 2006 29/51 Translations Translations are not Features. A Translation object defines the UTR and CDS of a Transcript. Peptides are not stored in the database, they are computed on the fly using Translation objects. Translation 8 Nov 2006 30/51 15

Translations Code Example # fetch a transcript from the database $transcript = $transcript_adaptor->fetch_by_stable_id('enst00000333012'); # obtain the translation of the transcript $translation = $transcript->translation; # print out the translation info print "Translation: ", $translation->stable_id, "\n"; print "Start Exon: ",$translation->start_exon->stable_id,"\n"; print "End Exon: ", $translation->end_exon->stable_id, "\n"; # start and end are coordinates within the start/end exons print "Start : ", $translation->start, "\n"; print "End : ", $translation->end, "\n"; # print the peptide which is the product of the translation print "Peptide : ", $transcript->translate->seq, "\n"; 8 Nov 2006 31/51 Translations Sample Output Translation: ENSP00000327425 Start Exon: ENSE00000838993 End Exon: ENSE00001336722 Start : 3 End : 22 Peptide : SERRPRCPGTPLAGRMADPGPDPESESESVFPREVGLFADSYSEKSQFCFCGHVLTITQN FGSRLGVAARVWDAALSLCNYFESQNVDFRGKKVIELGAGTGIVGILAALQGAYGLVRET EDDVIEQELWRGMRGACGHALSMSTMTPWESIKGSSVRGGCYHH 8 Nov 2006 32/51 16

Question 4 a) Fetch a gene with id ENSG00000101266, print information about the number of transcripts and exons. b) For the above gene, get all the transcripts and list the number of exons in each and the translations. c) Why do the exon numbers not match? 8 Nov 2006 33/51 Coordinate Transformations The API provides the means to convert between any related coordinate systems in the database. Feature methods transfer, transform, project can be used to move features between coordinate systems. Slice method project can be used to move features between coordinate systems. 8 Nov 2006 34/51 17

Feature::transfer The Feature method transfer moves a feature from one Slice to another. The Slice may be in the same coordinate system or a different coordinate system. Chr20 Chr17 Chr17 AC099811 ChrX 8 Nov 2006 35/51 Feature::transfer Code Example # fetch an exon from the database $exon = $exon_adaptor->fetch_by_stable_id('ense00001180238'); print "Exon is on slice: ", $exon->slice->name, "\n"; print "Exon coords: ", $exon->start, '-', $exon->end, "\n"; # transfer the exon to a small slice just covering it $exon_slice = $slice_adaptor->fetch_by_feature($exon); $exon = $exon->transfer($exon_slice); print "Exon is on slice: ", $exon->slice->name, "\n"; print "Exon coords: ", $exon->start, '-', $exon->end, "\n"; Sample output: Exon is on slice: chromosome:ncbi35:12:1:132449811:1 Exon coords: 56452650-56452951 Exon is on slice: chromosome:ncbi35:12:56452650:56452951:1 Exon coords: 1-302 8 Nov 2006 36/51 18

Feature::transform Transform is like the transfer method but it does not require a Slice argument, only the name of a Coordinate System. Slice will span entire CoordSystem. Chr20 Chr17 Chr17 AC099811 AC099811 8 Nov 2006 37/51 Feature::transform Code Example # fetch a prediction transcript from the database $pta = $reg->get_adaptor('human', 'core', 'PredictionTranscript'); $pt = $pta->fetch_by_dbid(1834); print "Prediction Transcript: ", $pt->stable_id, "\n"; print "On slice: ", $pt->slice->name, "\n"; print "Coords: ", $pt->start, '-', $pt->end, "\n"; # transform the prediction transcript to the supercontig system $pt = $pt->transform('supercontig'); print "On slice: ", $pt->slice->name, "\n"; print "Coords: ", $pt->start, '-', $pt->end, "\n"; Sample output: Prediction Transcript: GENSCAN00000001834 On slice: contig::al390876.19.1.185571:1:185571:1 Coords: 131732-174068 On slice: supercontig::nt_008470:1:40394264:1 Coords: 10814989-10857325 8 Nov 2006 38/51 19

Feature::project Project does not move a feature, but it provides a definition of where a feature lies in another coordinate system. It is useful to determine where a feature that spans multiple sequence regions lies. Chr17 Chr17 AC099811 AC099811 8 Nov 2006 39/51 Feature::project Code Example # fetch a transcript from the database $tr = $transcript_adaptor->fetch_by_stable_id('enst00000351033'); $cs = $tr->slice->coord_system; print "Coords: ", $cs->name, ' ', $tr->slice->seq_region_name,' ', $tr->start, '-', $tr->end, '(', $tr->strand, ")\n"; # project to the clone coordinate system foreach $segment (@{ $tr->project('clone') }) { ($from_start, $from_end, $pslice) = @$segment; } print "$from_start-$from_end projects to clone region: "; print $pslice->seq_region_name, ' ', $pslice->start, '-', $pslice->end, "(", $pslice->strand, ")\n"; 8 Nov 2006 40/51 20

Feature::project Sample Output Coords: chromosome 1 147647-147749(-1) 1-103 projects to clone region: AL627309.15 147271-147373(-1) 8 Nov 2006 41/51 Slice::project Slice::project acts exactly like Feature::project, except that it works with a Slice instead of a Feature. It is used to determine how one coordinate system is made of another. Chr17 Chr17 AC099811 AC099811 8 Nov 2006 42/51 21

Slice::project Code Example # fetch a slice on a region of chromosome 5 $slice = $slice_adaptor->fetch_by_region('chromosome', '5', 10e6, 12e6); # project to the clone coordinate system foreach $segment (@{ $slice->project('clone') }) { ($from_start, $from_end, $pslice) = @$segment; } print "$from_start-$from_end projects to clone region: "; print $pslice->seq_region_name, ' ', $pslice->start, '-', $pslice->end, "(", $pslice->strand, ")\n"; 8 Nov 2006 43/51 Slice::project Sample Output 1-55893 projects to clone region: AC010629.8 24890-80782(1) 55894-151517 projects to clone region: AC091905.3 423-96046(1) 151518-296668 projects to clone region: AC034229.4 768-145918(1) 296669-432616 projects to clone region: AC012640.12 1-135948(-1) 432617-562215 projects to clone region: AC092336.2 1-129599(-1) 562216-646577 projects to clone region: AC112200.3 1-84362(-1) 646578-722633 projects to clone region: AC106760.2 2135-78190(1) 722634-889200 projects to clone region: AC012629.9 1-166567(-1) 889201-1030964 projects to clone region: AC010626.6 18549-160312(1) 1030965-1148679 projects to clone region: AC005610.1 105694-223408(1) 1148680-1286532 projects to clone region: AC004648.1 1-137853(-1) 1286533-1367657 projects to clone region: AC005367.1 463-81587(1) 1367658-1488052 projects to clone region: AC003954.1 1-120395(-1) 1488053-1522377 projects to clone region: AC004633.1 1901-36225(1) 1522378-1776593 projects to clone region: AC010433.9 4714-258929(1) 1776594-1950687 projects to clone region: AC113390.3 22452-196545(1) 1950688-1994121 projects to clone region: AC127460.2 1-43434(-1) 1994122-2000001 projects to clone region: AC114289.2 164862-170741(- 1) 8 Nov 2006 44/51 22

Question 5 a) Fetch any gene from clone 'AL049761.11', print details of the gene including the coordinate system and start/end positions. Hint: to get a gene, first fetch a slice containing the clone with fetch_by_region, then get a gene. b) Transform the gene to 'toplevel' and print the new details. 8 Nov 2006 45/51 External References Ensembl cross references its Gene predictions with identifiers from other databases. SwissProt P51587 Ensembl ENSP00000267071 ENSP00000308413 Q9UBS0 HUGO BRCA2 RPS6KB2 8 Nov 2006 46/51 23

External References Code Example # fetch a transcript using an external identifier my ($tr) = @{$transcript_adaptor->fetch_all_by_external_name('ttn')}; print "TTN is Ensembl transcript ", $transc->stable_id, "\n"; # get all external identifiers for a translation $transl = $tr->translation; print "External identifiers for ", $transl->stable_id, ":\n"; foreach my $db_entry (@{ $transl->get_all_dbentries }) { print $db_entry->dbname, ":", $db_entry->display_id, "\n"; } 8 Nov 2006 47/51 External References Sample Output TTN is Ensembl transcript ENST00000286104 External identifiers for ENSP00000286104: GO:GO:0005524 GO:GO:0016020 GO:GO:0005975 GO:GO:0006979 GO:GO:0004674 GO:GO:0006468 GO:GO:0004713 GO:GO:0004896 GO:GO:0007517 GO:GO:0008307 Uniprot/SPTREMBL:Q10465 EMBL:X90569 protein_id:caa62189.1 GO:GO:0006941 GO:GO:0030017 GO:GO:0004601 Uniprot/SPTREMBL:Q8WZ42 EMBL:AJ277892 protein_id:cad12456.1 8 Nov 2006 48/51 24

Question 6 a) Fetch a database entry for the Refseq_dna gene NP_808227. b) Fetch the equivalent ENSEMBL gene Hint: use a gene adaptor and fetch_all_by_external_name() c) Print out all the other linked database entries for the gene. 8 Nov 2006 49/51 Getting More Information Inline POD Perldoc viewer for inline API documentation: http://www.ensembl.org/info/software/core/index.html Tutorial document: http://www.ensembl.org/info/software/core/core_tutorial.html ensembl-dev mailing list: <ensembl-dev@ebi.ac.uk> 8 Nov 2006 50/51 25

Acknowledgements The Ensembl Core Team Glenn Proctor Ian Longden Patrick Meidl The rest of the Ensembl team 8 Nov 2006 51/51 26