New generation of patent sequence databases
Information Sources in Biotechnology Japan
EBI is an Outstation of the European Molecular Biology Laboratory.
Patent-related resources: Patents, Patent Resources. http://www.ebi.ac.uk/
Patent resources at EBI: http://www.ebi.ac.uk/patentdata/
Patent resources at EBI
Patent proteins: EPO, USPTO, JPO, KIPO
Patent nucleotides: ENA (EPO, USPTO, JPO, KIPO)
Non-redundant sequence data: same sequences (EPO, USPTO, JPO, KIPO), patent family classification, enriched with patent information
Sequence data from patent literature
[Diagram: patent offices (EPO, USPTO, JPO, KIPO, other patent offices) feed the INSDC partners EMBL-Bank (EBI), GenBank (NCBI) and DDBJ (NIG); the NR patent sequence databases are built at EBI.]
INSDC agreement: free unrestricted access; all data exchanged daily.
Non-redundant patent databases (http://www.ebi.ac.uk/patentdata/)
Level-1: NRNL1 (non-redundant nucleotide level-1) for patent nucleotides, NRPL1 (non-redundant protein level-1) for patent proteins. Groups together 100% identical patent sequences.
Level-2: NRNL2 (non-redundant nucleotide level-2) for patent nucleotides, NRPL2 (non-redundant protein level-2) for patent proteins. Groups together identical sequences by patent family.
Patent sequence record in NRNL1 (screenshot labels): patents containing 100% identical sequence; sequence.
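The Level-1 idea (one record per distinct sequence, listing every patent that contains it) can be illustrated with a few lines of Python. This is only a minimal sketch: the patent numbers and sequences are invented, `group_identical` is a hypothetical helper name, and the slides do not describe how EBI actually performs the comparison.

```python
from collections import defaultdict

# Hypothetical input: (patent number, sequence) pairs.
# These values are made up for illustration; they are not real entries.
records = [
    ("EP0000001", "ATGGCCATTG"),
    ("US0000002", "ATGGCCATTG"),
    ("JP0000003", "atggccattg"),
    ("WO0000004", "ATGGCGTTAA"),
]


def group_identical(records):
    """Return {sequence: [patent numbers]} with one key per distinct sequence."""
    groups = defaultdict(list)
    for patent, sequence in records:
        # Assumption: case differences do not make two sequences distinct.
        groups[sequence.upper()].append(patent)
    return dict(groups)


level1 = group_identical(records)
for sequence, patents in level1.items():
    print(sequence, patents)
```

Four input records collapse to two Level-1 style entries here, and the first entry carries the three patents that share the same sequence.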
Patent sequence record in NRNL2 (screenshot labels): patent equivalents; sequence record in ENA; priority number and date; patent literature; translation; sequence.
Non-redundant patent databases: start from EMBL patents (redundant); remove sequence redundancy to obtain Level-1 NR; group by patent families and add annotation, including priority dates for patent families, to obtain Level-2 NR. www.ebi.ac.uk
Patent sequence records at EBI
Nucleotide: ENA ~23.9 M PAT sequences (>230 M total); NRNL1 ~12.2 M sequences; NRNL2 ~15.5 M sequences
Protein: Patent Proteins ~6.5 M PRT sequences (>32 M total); NRPL1 ~2.5 M sequences; NRPL2 ~3.8 M sequences
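As a rough check on how much the non-redundant databases shrink the data, the approximate counts on this slide can be turned into percentage reductions. This is a sketch under one assumption: that the ~23.9 M PAT and ~6.5 M PRT figures are the redundant starting sets for the nucleotide and protein databases respectively. `reduction_percent` is an illustrative helper name.

```python
# Approximate record counts in millions, copied from the slide.
counts = {
    "nucleotide": {"redundant": 23.9, "level1": 12.2, "level2": 15.5},
    "protein": {"redundant": 6.5, "level1": 2.5, "level2": 3.8},
}


def reduction_percent(redundant, reduced):
    """Percentage of records removed relative to the redundant set."""
    return 100.0 * (redundant - reduced) / redundant


for kind, c in counts.items():
    for level in ("level1", "level2"):
        pct = reduction_percent(c["redundant"], c[level])
        print(f"{kind} {level}: about {pct:.0f}% fewer records")
```

Level-2 holds more records than Level-1 in both cases, which fits the grouping rule: a sequence that occurs in several patent families is kept once per family rather than once overall.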
Sequence search
Sequence searching: Tools, Sequence Similarity & Analysis. http://www.ebi.ac.uk/
Sequence searching: wide variety of search tools. www.ebi.ac.uk/tools/sss/
Choosing the right search engine
BLAST: general search engine
FASTA: better general search engine
SSEARCH: sensitive but slow; good for short sequences
GGSEARCH: force full-length matches (query and subject aligned end to end)
GLSEARCH: match domains/patterns to protein; oligo-to-gene (entire query aligned within part of the subject)
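The guidance on this slide can be kept as a small lookup table for quick reference. The sketch below only restates the slide's wording; `SEARCH_TOOLS` and `tools_matching` are hypothetical names and are not part of any EBI service.

```python
# Tool notes as worded on the slide (hypothetical in-memory table).
SEARCH_TOOLS = {
    "BLAST": "general search engine",
    "FASTA": "better general search engine",
    "SSEARCH": "sensitive but slow; good for short sequences",
    "GGSEARCH": "force full-length matches",
    "GLSEARCH": "match domains/patterns to protein; oligo-to-gene",
}


def tools_matching(keyword):
    """Return the tools whose slide note mentions the keyword."""
    keyword = keyword.lower()
    return [name for name, note in SEARCH_TOOLS.items() if keyword in note]


print(tools_matching("general"))
print(tools_matching("short"))
print(tools_matching("full-length"))
```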
Search a variety of databases (protein): patent databases. *Selecting all 6 returns results in triplicate!
Search a variety of databases (nucleotide): patent data. *Selecting all 3 returns results in triplicate!
Let's look at an example
Searching a redundant database (protein). Example: search a patent protein sequence against Patent proteins. http://www.ebi.ac.uk/tools/sss/
Results from a redundant database: >260 identical results, too many to analyze.
LEVEL-1 NR patent sequence database removes redundancy: fewer results to analyze, less chance of missing important results.
Searching NR level-1 patent database. Example: search a patent protein sequence against NR patent level-1. http://www.ebi.ac.uk/tools/sss/
Results from NR level-1 database: each hit is unique.
Results from NR level-1 database: list of all patents containing the sequence; earliest publication date; link to sequence entry; link to patent documentation.
Patent families: a simple patent family is a group of patents that relate to the same invention and are based on the same originating application. Families arise when an invention is patented in multiple countries. Grouping patents into families reduces multi-national results down to a representative member.
Patent families
[Diagram: two patent families, Invention A and Invention B, with members filed as EP, WO, US, US and JP; accessions GM671154, ADA42650, CS017585, ACQ13114, DI603183, HB492658, AAR79155 and DD649656 are 100% identical sequences.]
The same sequence can appear multiple times in a database because:
the same invention is filed multiple times in different offices (same patent family);
different inventors use the same sequence in different contexts (different patent families).
LEVEL-2 NR patent sequence database: groups identical sequences by patent family; provides earliest priority date for family.
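The Level-2 grouping can be sketched in the same spirit: identical sequences are collected per patent family, and each group keeps the earliest priority date seen among its members. All records, family identifiers and dates below are invented, and `group_by_family` is a hypothetical helper; how EBI assigns families is not covered in these slides.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records: (patent number, family id, priority date, sequence).
records = [
    ("EP0000001", "FAM-A", date(2001, 3, 1), "MKTAYIAK"),
    ("US0000002", "FAM-A", date(2000, 9, 15), "MKTAYIAK"),
    ("JP0000003", "FAM-B", date(2004, 6, 30), "MKTAYIAK"),
    ("WO0000004", "FAM-B", date(2005, 1, 10), "MKTAYIAK"),
]


def group_by_family(records):
    """Return {(sequence, family): {'patents': [...], 'earliest_priority': date}}."""
    groups = defaultdict(lambda: {"patents": [], "earliest_priority": None})
    for patent, family, priority, sequence in records:
        entry = groups[(sequence, family)]
        entry["patents"].append(patent)
        current = entry["earliest_priority"]
        # Keep the earliest priority date among the family members.
        if current is None or priority < current:
            entry["earliest_priority"] = priority
    return dict(groups)


level2 = group_by_family(records)
for (sequence, family), entry in level2.items():
    print(sequence, family, entry["patents"], entry["earliest_priority"])
```

In this sketch a single sequence shared by two families yields two entries, one per family, each with its own earliest priority date.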
Searching NR level-2 patent database. Example: search a patent protein sequence against NR patent level-2. http://www.ebi.ac.uk/tools/sss/
Results from NR level-2 database: each hit = one family.
Results from NR level-2 database: patent equivalents; earliest publication date in family; earliest active priority date in family.
Results from NR level-2 database: patents in same family; link to sequence entry; link to patent documentation.
Text search
SRS: advanced text search. 1st: select resources to search. 2nd: create query. http://www.ebi.ac.uk/srs/
SRS: advanced text search. Select library tab. Sequence Searching Tools
SRS: advanced text search. Search >100 databases. Select library tab: NR patent DNA (NRNL1 & NRNL2), NR patent proteins (NRPL1 & NRPL2).
SRS: advanced text search. Search >100 databases. Select library tab. Example: NR level-1 patent DNA database selected for searching.
SRS: advanced text search. Select library tab. Select resources to search.
SRS: advanced text search. Select library tab. Select resources to search: 1) select field, 2) type in text.
SRS: advanced text search. Select library tab. Select resources to search. Here, the selected field is patent number.
SRS: advanced text search. Select library tab. Select resources to search. Create query.
SRS: advanced text search. Select library tab. Select resources to search. Create query: lists non-redundant nucleotide sequences from WO0146262.
SRS: advanced text search. Select library tab. Select resources to search. Create query. WO0146262 sequences.
SRS: advanced text search. Select library tab. Select resources to search. Create query. WO0146262 sequences. WO0146262 nucleotide sequence record in NRNL1: details which other patents also claim this sequence (with NRNL2, the family grouping would be shown).
SRS: advanced text search. Select library tab. Select resources to search. Create query. WO0146262 sequences. NRNL1 sequence record.
SRS: advanced text search. Select library tab. Select resources to search. Create query. WO0146262 sequences. NRNL1 sequence record. WO0146262 literature. http://www.ebi.ac.uk/srs/
SRS: advanced text search
EMBL-Bank: find all sequences associated with a patent.
NRNL1: find all sequences associated with a patent + identify all patents associated with each sequence.
NRNL2: find all sequences associated with a patent + identify all patents in the same family associated with each sequence.
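The NRNL1 style lookup (all sequences for a patent, plus the other patents that contain each of those sequences) can be mimicked on a small in-memory table. This is a sketch and not an SRS query: the sequence identifiers and record contents are invented for illustration, only the patent number WO0146262 is taken from the slides, and `sequences_for_patent` is a hypothetical helper.

```python
# Hypothetical Level-1 style records: sequence id -> patents containing it.
# The contents are invented; they are NOT real database content.
level1_records = {
    "SEQ_A": ["WO0146262", "US0000002"],
    "SEQ_B": ["WO0146262"],
    "SEQ_C": ["EP0000001"],
}


def sequences_for_patent(records, patent_number):
    """Return {sequence id: other patents that contain the same sequence}."""
    hits = {}
    for seq_id, patents in records.items():
        if patent_number in patents:
            hits[seq_id] = [p for p in patents if p != patent_number]
    return hits


result = sequences_for_patent(level1_records, "WO0146262")
for seq_id, others in result.items():
    print(seq_id, "also found in:", others)
```

A Level-2 style table would additionally restrict the "other patents" list to members of the same family.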
For more information (Non-redundant): http://www.ebi.ac.uk/patentdata/
For more information: User Manual; Publication.
Help. Contacts: http://www.ebi.ac.uk/support/