Multifile Patent Sequence Searching on STN. Robert Austin FIZ Karlsruhe


2 Agenda
- Sequence searchable databases on STN
- Step-by-step through a multifile BLAST search
- Multifile post-processing using STN Express
- Overview of the search results
- Summary and resources
See also: Sequence Basics e-seminar (June 2010)

3 STN sequence searchable databases
- DGENE: Thomson Reuters GENESEQ™. Value-added patent sequence data from around the globe.
- USGENE: The USPTO Genetic Sequence Database. All available sequence data from the USPTO.
- PCTGEN: WIPO/PCT Patent Application Biosequences. All available e-published sequence data from WIPO.
- CAS REGISTRY: Chemical Abstracts Service (CAS) REGISTRY. Worldwide value-added patent and non-patent sequences.

4 DGENE, USGENE and PCTGEN offer three sequence search modes
- Sequence Code Match (motif) searching, using the RUN GETSEQ command
- BLAST similarity, using the RUN BLAST command
- FASTA similarity, using the RUN GETSIM command
Note: this e-seminar covers BLAST.

5 CAS REGISTRY/CAplus offers two sequence search modes
- Sequence Code Match (motif) searching, using the Search (=> S) command
- BLAST similarity, using a separate graphical user interface
Note: this e-seminar covers BLAST.

6 Multifile patent sequence searching
Search question: Find all patents that disclose Homo sapiens D-amino-acid oxidase (NCBI NP_001908), or similar sequences (≥80%):
    MRVVVIGAGVIGLSTALCIHERYHSVLQPLDIKVYADRFTPLTTTDVAAGLWQPYLS
    DPNNPQEADWSQQTFDYLLSHVHSPNAENLGLFLISGYNLFHEAIPDPSWKDTVLGF
    RKLTPRELDMFPDYGYGWFHTSLILEGKNYLQWLTERLTERGVKFFQRKVESFEEVA
    REGADVIVNCTGVWAGALQRDPLLQPGRGQIMKVDAPWMKHFILTHDPERGIYNSPY
    IIPGTQTVTLGGIFQLGNWSELNNIQDHNTIWEGCCRLEPTLKNARIIGERTGFRPV
    RPQIRLEREQLRTGPSNTEVIHNYGHGGYGLTIHWGCALEAAKLFGRILEEKKLSRM
    PPSHL
(Search conducted on 7 July 2010)

7 Multifile search strategy
1) RUN BLAST in DGENE, USGENE and PCTGEN using offline BATCH mode
2) Merge, organize by patent family, and display DGENE, USGENE and PCTGEN results
3) Repeat the search using CAS REGISTRY BLAST
4) Retrieve, identify, and display unique CAS REGISTRY BLAST CAplus records
5) Post-process DGENE, USGENE and PCTGEN results using the STN Express Table Tool
6) Post-process unique REGISTRY BLAST results using the BLAST Report Tool

8 SAVE, UPLOAD and VERIFY the query
Prepare and save the query as a plain text file in a suitable text editor, e.g. Windows Notepad.
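Before uploading, it can help to confirm that the query file holds nothing but a clean residue string. The Python sketch below is an illustration only and is not part of STN Express: the function name `clean_protein_query` and the accepted alphabet are assumptions made for this example. It strips whitespace and digits from a pasted sequence and rejects any other unexpected character.

```python
# Assumption: the 20 standard amino-acid codes plus X (unknown residue).
# The alphabet STN itself accepts is not stated in this seminar.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWYX")


def clean_protein_query(raw: str) -> str:
    """Return an upper-case residue string with whitespace and digits removed."""
    cleaned = "".join(
        ch for ch in raw.upper() if not ch.isspace() and not ch.isdigit()
    )
    unexpected = sorted(set(cleaned) - VALID_RESIDUES)
    if unexpected:
        raise ValueError(f"unexpected characters in query: {unexpected}")
    return cleaned


# Query from the search question (NCBI NP_001908), pasted in wrapped form.
RAW_QUERY = """
MRVVVIGAGVIGLSTALCIHERYHSVLQPLDIKVYADRFTPLTTTDVAAGLWQPYLS
DPNNPQEADWSQQTFDYLLSHVHSPNAENLGLFLISGYNLFHEAIPDPSWKDTVLGF
RKLTPRELDMFPDYGYGWFHTSLILEGKNYLQWLTERLTERGVKFFQRKVESFEEVA
REGADVIVNCTGVWAGALQRDPLLQPGRGQIMKVDAPWMKHFILTHDPERGIYNSPY
IIPGTQTVTLGGIFQLGNWSELNNIQDHNTIWEGCCRLEPTLKNARIIGERTGFRPV
RPQIRLEREQLRTGPSNTEVIHNYGHGGYGLTIHWGCALEAAKLFGRILEEKKLSRM
PPSHL
"""

query = clean_protein_query(RAW_QUERY)
print(len(query))  # compare with the "Query = ... letters" line in the BLAST output
```

The cleaned string can then be written to a plain text file for upload. Because digits are dropped, sequences copied with line numbers are handled as well.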

9 SAVE, UPLOAD and VERIFY the query (cont.)
From the Discover! button menu:
(a) Click Upload Sequence
(b) Choose the query file
(c) Select the STN database
The sequence becomes a query L-number in the database of choice for use with RUN BLAST.

10 SAVE, UPLOAD and VERIFY the query (cont.)
    => FILE USGENE
    => UPL R BLAST
    Uploading C:\....\NP_ Homo sapiens DAO.txt
    UPLOAD SUCCESSFULLY COMPLETED
    L1 GENERATED
    => D L1 LQUE
    L1 ANSWER 1 USGENE COPYRIGHT 2010 SEQUENCEBASE CORP on STN
    LQUE MRVVVIGAGVIGLSTALCIHERYHSVLQPLDIKVYADRFTPLTTTDVAAGLWQPYLSD
         PNNPQEADWSQQTFDYLLSHVHSPNAENLGLFLISGYNLFHEAIPDPSWKDTVLGFRK
         LTPRELDMFPDYGYGWFHTSLILEGKNYLQWLTERLTERGVKFFQRKVESFEEVAREG
         ADVIVNCTGVWAGALQRDPLLQPGRGQIMKVDAPWMKHFILTHDPERGIYNSPYIIPG
         TQTVTLGGIFQLGNWSELNNIQDHNTIWEGCCRLEPTLKNARIIGERTGFRPVRPQIR
         LEREQLRTGPSNTEVIHNYGHGGYGLTIHWGCALEAAKLFGRILEEKKLSRMPPSHL
Commands shown in red on the original slide are run automatically by the STN Express Sequence Query Upload wizard.
Verify that the sequence was uploaded successfully with D LQUE.
The sequence query is now ready for searching directly in DGENE, USGENE, or PCTGEN using the L-number (L1).

11 RUN the DGENE, USGENE and PCTGEN BLAST searches in BATCH mode
    => FILE DGENE
    FILE 'DGENE' ENTERED AT 17:05:31 ON 07 JUL 2010
    COPYRIGHT (C) 2010 THOMSON REUTERS
    => RUN BLAST L1 /SQP -F F BATCH
    PLEASE ENTER BATCH IDENTIFIER (MAX. 8 CHARS):DAOP
    TO BE NOTIFIED WHEN THIS BATCH SEARCH IS COMPLETE, PLEASE ENTER YOUR ADDRESS (MAX. 50 CHARS) OR "NONE"
    INPUT: OR (END):ROBERT.AUSTIN@FIZ-KARLSRUHE.DE
    BLAST Version 2.2
    The BLAST software is used herein with permission of the National Center for Biotechnology Information (NCBI) of the National Library of Medicine (NLM)....
    BATCH PROCESSING STARTED FOR DAOP
Add BATCH to the end of a RUN BLAST command to search in offline batch mode.
New! Enter a valid e-mail address to be notified when the BATCH search is completed.

12 RUN the DGENE, USGENE and PCTGEN BLAST searches in BATCH mode (cont.)
    => FILE USGENE
    => RUN BLAST L1 /SQP -F F BATCH
    ....
    PLEASE ENTER BATCH IDENTIFIER (MAX. 8 CHARS):DAOP
    ....
    => FILE PCTGEN
    => RUN BLAST L1 /SQP -F F BATCH
    ....
    PLEASE ENTER BATCH IDENTIFIER (MAX. 8 CHARS):DAOP
    ....
    => LOG H
    SESSION WILL BE HELD FOR 120 MINUTES
    STN INTERNATIONAL SESSION SUSPENDED AT 17:07:14 ON 07 JUL 2010
Note: DGENE, USGENE and PCTGEN BLAST searches can be run in parallel using BATCH mode.
Turn the Low Complexity Filter off with the syntax: /SQP -F F
Tip: use LOGOFF HOLD (LOG H) to be able to return to the same STN session within two hours.
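The three batch submissions follow one pattern, which the Python sketch below writes out as a checklist of commands to type. It is a planning aid under stated assumptions, not an STN interface: the function name `batch_commands` is hypothetical, and the only STN syntax it emits is the `FILE` and `RUN BLAST ... /SQP -F F BATCH` commands shown in the transcripts above. The 8-character limit is taken from the batch identifier prompt.

```python
from typing import List

MAX_BATCH_ID_CHARS = 8  # from the prompt "BATCH IDENTIFIER (MAX. 8 CHARS)"
DATABASES = ("DGENE", "USGENE", "PCTGEN")


def batch_commands(query_l_number: str, batch_id: str) -> List[str]:
    """List the commands to type for a protein BLAST batch search in each file."""
    if not 1 <= len(batch_id) <= MAX_BATCH_ID_CHARS:
        raise ValueError("batch identifier must be 1 to 8 characters long")
    commands = []
    for database in DATABASES:
        commands.append(f"FILE {database}")
        # /SQP -F F: protein search with the Low Complexity Filter off
        commands.append(f"RUN BLAST {query_l_number} /SQP -F F BATCH")
    return commands


for line in batch_commands("L1", "DAOP"):
    print("=>", line)
```

The same batch identifier (DAOP) is reused in all three files in the transcript, which is why the sketch takes a single identifier.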

13 Retrieve the BATCH search results
    => FILE DGENE
    FILE 'DGENE' ENTERED AT 17:11:25 ON 07 JUL 2010
    COPYRIGHT (C) 2010 THOMSON REUTERS
    => RUN GETBATCH DAOP
    Please enter your batch identifier
    or enter # for batch id list
    or enter * for batch id at top of list
    or enter - before batch id to delete
    or enter . for (end)
    Database DGENE AA Posted date: Jun 25, :33 PM
    ....
    ENTER EITHER THE NUMBER OF ANSWERS YOU WISH TO KEEP
    OR ENTER MINIMUM PERCENT OF SELF SCORE FOLLOWED BY %
    (BEST ANSWER PERCENTAGE OF SELF SCORE IS 100%)
    ENTER (ALL) OR ? :80%
    L2 RUN STATEMENT CREATED
    L2 19 MRVVVIGAGVIGLSTALCIHERYHSVLQPLDIKVYAD... MPPSHL/SQP.-F F
    Answer set arranged by accession number; to sort by descending similarity score, enter at an arrow prompt (=>) "sor score d".
Use RUN GETBATCH to retrieve completed BATCH search results.
In this example, 80% of the Query Self Score is used to select only the most relevant results (L2).

14 Retrieve the BATCH search results (cont.)
    => FILE USGENE
    => RUN GETBATCH DAOP
    ....
    ENTER EITHER THE NUMBER OF ANSWERS YOU WISH TO KEEP
    OR ENTER MINIMUM PERCENT OF SELF SCORE FOLLOWED BY %
    (BEST ANSWER PERCENTAGE OF SELF SCORE IS 100%)
    ENTER (ALL) OR ? :80%
    L3 RUN STATEMENT CREATED
    L3 14 MRVVVIGAGVIGLSTALCIHERYHSVLQPLDIKVYAD... MPPSHL/SQP.-F F
    => FILE PCTGEN
    => RUN GETBATCH DAOP
    ....
    ENTER EITHER THE NUMBER OF ANSWERS YOU WISH TO KEEP
    OR ENTER MINIMUM PERCENT OF SELF SCORE FOLLOWED BY %
    (BEST ANSWER PERCENTAGE OF SELF SCORE IS 100%)
    ENTER (ALL) OR ? :80%
    L4 RUN STATEMENT CREATED
    L4 3 MRVVVIGAGVIGLSTALCIHERYHSVLQPLDIKVYAD... MPPSHL/SQP.-F F
Use RUN GETBATCH to retrieve completed BATCH search results.
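The 80% cut-off is expressed relative to the query self score, i.e. the score the query obtains when aligned against itself (731 bits in this search). As a rough illustration of the arithmetic, and not of how STN implements it, the Python sketch below computes the percentage for a few hits and keeps those at or above the threshold. The function names and accession labels are hypothetical; the scores 731, 728 and 726 appear in this seminar, while the score 500 is invented to show a rejected hit.

```python
from typing import Dict, List


def percent_of_self_score(hit_score: float, self_score: float) -> float:
    """Express a hit's BLAST score as a percentage of the query self score."""
    return 100.0 * hit_score / self_score


def keep_hits(hits: List[Dict], self_score: float, min_percent: float) -> List[Dict]:
    """Keep hits whose score reaches min_percent of the query self score."""
    return [
        hit for hit in hits
        if percent_of_self_score(hit["score"], self_score) >= min_percent
    ]


SELF_SCORE = 731  # bits, the query aligned against itself

# Hypothetical labels; only the first three scores come from the seminar.
hits = [
    {"id": "hit-A", "score": 731},
    {"id": "hit-B", "score": 728},
    {"id": "hit-C", "score": 726},
    {"id": "hit-D", "score": 500},  # invented: about 68% of the self score
]

for hit in keep_hits(hits, SELF_SCORE, 80.0):
    print(hit["id"], round(percent_of_self_score(hit["score"], SELF_SCORE), 1))
```

With a self score of 731, the 80% threshold corresponds to a score of 584.8 bits, so the invented 500-bit hit is dropped and the other three are kept.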

16 Merge the results into a single L-number
    => SET DUPORDER FILE
    SET COMMAND COMPLETED
    => DUP IDE L2 L3 L4
    FILE 'DGENE' ENTERED AT 17:16:56 ON 07 JUL 2010
    COPYRIGHT (C) 2010 THOMSON REUTERS
    FILE 'USGENE' ENTERED AT 17:16:56 ON 07 JUL 2010
    COPYRIGHT (C) 2010 SEQUENCEBASE CORP
    FILE 'PCTGEN' ENTERED AT 17:16:56 ON 07 JUL 2010
    COPYRIGHT (C) 2010 WIPO
    PROCESSING COMPLETED FOR L2
    PROCESSING COMPLETED FOR L3
    PROCESSING COMPLETED FOR L4
    L5 36 DUP IDE L2 L3 L4 (INCLUDES 0 SETS OF DUPLICATES)
    ANSWERS '1-19' FROM FILE DGENE
    ANSWERS '20-33' FROM FILE USGENE
    ANSWERS '34-36' FROM FILE PCTGEN
    => SOR IDENT D
    PROCESSING COMPLETED FOR L5
    L6 36 SOR L5 IDENT D
SET DUPORDER FILE ensures that multifile records merged using DUP IDE are organized by database (file).
New! DUPLICATE IDENTIFY (DUP IDE) is used here to create a single multifile L-number (L5).
The multifile L-number (L5) can be sorted by BLAST SCORE, or by Percent Identity (IDENT).
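Conceptually, the merge concatenates the three answer sets in file order and the sort then reorders the combined set by descending percent identity. The Python sketch below mimics that on toy data. It is an analogy only: the record type `Hit`, the function names, and the accession labels are hypothetical, and it does not model the duplicate detection that DUP IDE also performs.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    source: str      # database (file) the hit came from
    accession: str   # hypothetical label, not a real accession number
    ident: int       # BLAST percent identity


def merge_answer_sets(*answer_sets: List[Hit]) -> List[Hit]:
    """Concatenate answer sets, keeping them grouped in the order given."""
    return [hit for answer_set in answer_sets for hit in answer_set]


def sort_by_identity(hits: List[Hit]) -> List[Hit]:
    """Sort by descending percent identity; ties keep their merged order."""
    return sorted(hits, key=lambda hit: hit.ident, reverse=True)


dgene = [Hit("DGENE", "D-1", 100), Hit("DGENE", "D-2", 85)]
usgene = [Hit("USGENE", "U-1", 99), Hit("USGENE", "U-2", 100)]
pctgen = [Hit("PCTGEN", "P-1", 99)]

merged = merge_answer_sets(dgene, usgene, pctgen)
for hit in sort_by_identity(merged):
    print(hit.ident, hit.source, hit.accession)
```

In the seminar's search the merged set holds 19 + 14 + 3 = 36 answers. Python's `sorted` is stable, so hits with equal identity stay in file order, which is one plausible reading of how a file-ordered set behaves after sorting.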

17 Review multifile answers with a free-of-charge format including alignment
    => D L6 TRIAL SCORE ALIGN 1-36; FILE STNGUIDE
    L6 ANSWER 1 OF 36 DGENE COPYRIGHT 2010 THOMSON REUTERS on STN
    AN   AAO23074 Protein DGENE
    TI   Determining a genotype of an individual for preparing a composition for treating schizophrenia by determining the identity of a nucleotide at a biallelic marker of the D-amino acid oxidase gene of the polynucleotide in a sample -
    DESC Human D-amino acid oxidase wild-type protein.
    KW   Biallelic marker; D-amino acid oxidase; DAO; neuroleptic; CNS disorder; movement; Parkinson's disease; Huntington's; motor neurone; Alzheimer's; mood; unipolar depression; bipolar;....
    SQL  347
    SCORE % of query self score 731
    BLASTALIGN
    Query = 347 letters
    Length = 347
    Score = 731 bits (1886), Expect = 0.0
    Identities = 347/347 (100%), Positives = 347/347 (100%)
    Query: 1 MRVVVIGAGVIGLSTALCIHERYHSVLQPLDIKVYADRFTPLTTTDVAAGLWQP...
             MRVVVIGAGVIGLSTALCIHERYHSVLQPLDIKVYADRFTPLTTTDVAAGLWQP
    Sbjct: 1 MRVVVIGAGVIGLSTALCIHERYHSVLQPLDIKVYADRFTPLTTTDVAAGLWQP...
The SCORE field shows the Query Self Score and the percentage.

18 Review answers with a free-of-charge format including alignment (cont.)
    L6 ANSWER 4 OF 36 USGENE COPYRIGHT 2010 SEQUENCEBASE CORP on STN
    TI   Collections of matched biological reagents and methods for identifying matched reagents (PublishedApplication)
    MTY  Protein
    SQL  347
    SCORE % of query self score 731
    BLASTALIGN
    Query = 347 letters
    Length = 347
    Score = 731 bits (1886), Expect = 0.0
    Identities = 347/347 (100%), Positives = 347/347 (100%)
    Query: 1   MRVVVIGAGVIGLSTALCIHERYHSVLQPLDIKVYADRFTPLTTTDVAAGLWQPYLSDPN
               MRVVVIGAGVIGLSTALCIHERYHSVLQPLDIKVYADRFTPLTTTDVAAGLWQPYLSDPN
    Sbjct: 1   MRVVVIGAGVIGLSTALCIHERYHSVLQPLDIKVYADRFTPLTTTDVAAGLWQPYLSDPN
    Query: 61  NPQEADWSQQTFDYLLSHVHSPNAENLGLFLISGYNLFHEAIPDPSWKDTVLGFRKLTPR
               NPQEADWSQQTFDYLLSHVHSPNAENLGLFLISGYNLFHEAIPDPSWKDTVLGFRKLTPR
    Sbjct: 61  NPQEADWSQQTFDYLLSHVHSPNAENLGLFLISGYNLFHEAIPDPSWKDTVLGFRKLTPR
    Query: 121 ELDMFPDYGYGWFHTSLILEGKNYLQWLTERLTERGVKFFQRKVESFEEVAREGADVIVN
               ELDMFPDYGYGWFHTSLILEGKNYLQWLTERLTERGVKFFQRKVESFEEVAREGADVIVN
    Sbjct: 121 ELDMFPDYGYGWFHTSLILEGKNYLQWLTERLTERGVKFFQRKVESFEEVAREGADVIVN
    Query: 181 CTGVWAGALQRDPLLQPGRGQIMKVDAPWMKHFILTHDPERGIYNSPYIIPGTQ...
               CTGVWAGALQRDPLLQPGRGQIMKVDAPWMKHFILTHDPERGIYNSPYIIPGTQ
    Sbjct: 181 CTGVWAGALQRDPLLQPGRGQIMKVDAPWMKHFILTHDPERGIYNSPYIIPGTQ...
The percentage on the Identities line is the BLAST Percent Identity (IDENT).

19 Review answers with a free-of-charge format including alignment (cont.)
    L6 ANSWER 28 OF 36 PCTGEN COPYRIGHT 2010 WIPO on STN
    TI   ORGAN-SPECIFIC PROTEINS AND METHODS OF THEIR USE
    MTY  PRT
    SQL  347
    SCORE % of query self score 731
    BLASTALIGN
    Query = 347 letters
    Length = 347
    Score = 728 bits (1879), Expect = 0.0
    Identities = 346/347 (99%), Positives = 346/347 (99%)
    Query: 1   MRVVVIGAGVIGLSTALCIHERYHSVLQPLDIKVYADRFTPLTTTDVAAGLWQPYLSDPN
               MRVVVIGAGVIGLSTALCIHERYHSVLQPL IKVYADRFTPLTTTDVAAGLWQPYLSDPN
    Sbjct: 1   MRVVVIGAGVIGLSTALCIHERYHSVLQPLHIKVYADRFTPLTTTDVAAGLWQPYLSDPN
    Query: 61  NPQEADWSQQTFDYLLSHVHSPNAENLGLFLISGYNLFHEAIPDPSWKDTVLGFRKLTPR
               NPQEADWSQQTFDYLLSHVHSPNAENLGLFLISGYNLFHEAIPDPSWKDTVLGFRKLTPR
    Sbjct: 61  NPQEADWSQQTFDYLLSHVHSPNAENLGLFLISGYNLFHEAIPDPSWKDTVLGFRKLTPR
    Query: 121 ELDMFPDYGYGWFHTSLILEGKNYLQWLTERLTERGVKFFQRKVESFEEVAREGADVIVN
               ELDMFPDYGYGWFHTSLILEGKNYLQWLTERLTERGVKFFQRKVESFEEVAREGADVIVN
    Sbjct: 121 ELDMFPDYGYGWFHTSLILEGKNYLQWLTERLTERGVKFFQRKVESFEEVAREGADVIVN
    Query: 181 CTGVWAGALQRDPLLQPGRGQIMKVDAPWMKHFILTHDPERGIYNSPYIIPGTQ...
               CTGVWAGALQRDPLLQPGRGQIMKVDAPWMKHFILTHDPERGIYNSPYIIPGTQ
    Sbjct: 181 CTGVWAGALQRDPLLQPGRGQIMKVDAPWMKHFILTHDPERGIYNSPYIIPGTQ...
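The Identities line counts the aligned positions at which query and subject carry the same residue. The Python sketch below reproduces that count for an ungapped alignment, using the first 60-residue row of the PCTGEN alignment above, where a single D to H substitution occurs. The function names are hypothetical. Rounding the percentage down is an assumption inferred from the figures in this seminar (346/347 and 345/347 are both reported as 99%); gapped alignments are not handled.

```python
from typing import Tuple


def identity_counts(query: str, subject: str) -> Tuple[int, int]:
    """Return (identical positions, alignment length) for an ungapped alignment."""
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(1 for q, s in zip(query, subject) if q == s)
    return matches, len(query)


def percent_identity(matches: int, length: int) -> int:
    """Whole-number percent identity, rounded down (assumed from the slides)."""
    return (100 * matches) // length


# First alignment row of the PCTGEN hit: residues 1-60 of query and subject.
QUERY_ROW = "MRVVVIGAGVIGLSTALCIHERYHSVLQPLDIKVYADRFTPLTTTDVAAGLWQPYLSDPN"
SBJCT_ROW = "MRVVVIGAGVIGLSTALCIHERYHSVLQPLHIKVYADRFTPLTTTDVAAGLWQPYLSDPN"

matches, length = identity_counts(QUERY_ROW, SBJCT_ROW)
print(f"Identities = {matches}/{length} ({percent_identity(matches, length)}%)")
print(percent_identity(346, 347))  # full-length PCTGEN alignment from the slide
```

Over the full 347-residue alignment the same single mismatch gives 346/347, which is why the hit still ranks just below the 100% records when sorted by IDENT.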

20 Ensure Capture Session is on to record a transcript for use in post-processing
Note: Check the Capture Retrospectively box to capture the session so far, as well as the session from this point forwards.

21 Use the STN Express 8.4 Patent Family Manager wizard to display the results
Access the Patent Family Manager wizard from the Discover! menu.
Choose a bibliographic display format with alignment for the first (best) hit, and a free-of-charge format with alignment for the rest of the sequences in each patent family group.

22 The Patent Family Manager begins by organising the results using FSORT...
    => FSORT L6
    ....
    L7 36 FSO L6
    11 Multi-record Families    Answers 1-33
       Family 1                 Answers 1-5
       Family 2                 Answers 6-8
       Family 3                 Answers 9-10
       Family 4 to Family 11    (answer ranges not preserved in this transcription)
    Individual Records          Answers 34-36
    Non-patent Records
Commands shown in red on the original slide are issued automatically by the STN Express Patent Family Manager.
FSORT organizes the patent sequence records by Publication, Application, Related, and Priority numbers.
In this example, 14 patent family groups (i.e. 11 multi-record families plus 3 individual records) are retrieved.
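Grouping records that share any publication, application, related, or priority number amounts to finding connected components. The Python sketch below shows one way to do this with a small union-find structure. It illustrates the idea only and makes no claim about how FSORT is implemented; the function name, record labels, and patent numbers are all hypothetical.

```python
from typing import Dict, List


def group_into_families(records: Dict[str, List[str]]) -> List[List[str]]:
    """Group record ids that share at least one patent number, directly or transitively."""
    parent = {record_id: record_id for record_id in records}

    def find(record_id: str) -> str:
        while parent[record_id] != record_id:
            parent[record_id] = parent[parent[record_id]]  # path halving
            record_id = parent[record_id]
        return record_id

    first_holder: Dict[str, str] = {}  # patent number -> first record seen with it
    for record_id, numbers in records.items():
        for number in numbers:
            if number in first_holder:
                root_a, root_b = find(first_holder[number]), find(record_id)
                if root_a != root_b:
                    parent[root_b] = root_a
            else:
                first_holder[number] = record_id

    families: Dict[str, List[str]] = {}
    for record_id in records:
        families.setdefault(find(record_id), []).append(record_id)
    # Multi-record families first, single records last.
    return sorted(families.values(), key=len, reverse=True)


# Hypothetical records and numbers, for illustration only.
records = {
    "DGENE-A": ["WO-0001", "AU-P1"],
    "USGENE-B": ["US-0002", "AU-P1"],   # shares a priority number with DGENE-A
    "PCTGEN-C": ["WO-0001"],            # shares a publication number with DGENE-A
    "USGENE-D": ["US-0009"],            # no shared number: an individual record
}

for family in group_into_families(records):
    print(family)
```

Here the first three records form one multi-record family and the fourth remains an individual record, analogous to the split between multi-record families and individual records in the FSORT output.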

23 ...and then continues by displaying the family groups in the specified formats
    => DIS L7 PFAM=7 1 BIB,SQL,SCORE,IDENT,ALIGN
    L7 ANSWER 17 OF 36 DGENE COPYRIGHT 2010 THOMSON REUTERS on STN FAMILY7
    AN   AEL25470 protein DGENE
    TI   Identifying compound that reduce/inhibit internal ribosome....
    IN   Fear M
    PA   (TELE-N) TELETHON INST CHILD HEALTH RES.
    PI   WO A
    AI   WO 2006-AU
    PRAI AU
    PSL  Disclosure; SEQ ID NO 18
    LA   English
    OS   [76]
    CR   N-PSDB: AEL25469
         PC-NCBI: gi30446
         PC-SWISSPROT: P14920
    DESC Reporter protein SEQ ID NO:18.
    SQL  347
    SCORE % of query self score 731
    IDENT 99%
    BLASTALIGN
    Query = 347 letters
    Length = 347
    Score = 726 bits (1873), Expect = 0.0
    Identities = 345/347 (99%), Positives = 345/347 (99%)
Commands shown in red on the original slide are issued automatically by the STN Express Patent Family Manager.

24 ...and then continues by displaying the family groups in the specified formats (cont.)
    => DIS L7 PFAM=7 2-TOT TRIAL,SCORE,IDENT,ALIGN
    L7 ANSWER 18 OF 36 USGENE COPYRIGHT 2010 SEQUENCEBASE CORP on STN FAMILY7
    TI   Isolation of Inhibitors of IRES-Mediated Translation (PublishedApplication)
    DESC Homo Sapiens Protein; sequence 18 of 148
    MTY  Protein
    SQL  347
    SCORE % of query self score 731
    IDENT 99%
    BLASTALIGN
    Query = 347 letters
    Length = 347
    Score = 726 bits (1873), Expect = 0.0
    Identities = 345/347 (99%), Positives = 345/347 (99%)
    Query: 1   MRVVVIGAGVIGLSTALCIHERYHSVLQPLDIKVYADRFTPLTTTDVAAGLWQPYLSDPN
               MRVVVIGAGVIGLSTALCIHERYHSVLQPL IKVYADRFTPLTTTDVAAGLWQPYLSDPN
    Sbjct: 1   MRVVVIGAGVIGLSTALCIHERYHSVLQPLHIKVYADRFTPLTTTDVAAGLWQPYLSDPN
    Query: 61  NPQEADWSQQTFDYLLSHVHSPNAENLGLFLISGYNLFHEAIPDPSWKDTVLGFRKLTPR
               NPQEADWSQQTFDYLLSHVHSPNAENLGLFLISGYNLFHEAIPDPSWKDTVLGFRKLTPR
    Sbjct: 61  NPQEADWSQQTFDYLLSHVHSPNAENLGLFLISGYNLFHEAIPDPSWKDTVLGFRKLTPR
    Query: 121 ELDMFPDYGYGWFHTSLILEGKNYLQWLTERLTERGVKFFQRKVESFEEVAREGADVIVN
               ELDMFPDYGYGWFHTSLILEGKNYLQWLTERLTERGVKFFQRKVESFEEVAREGADVIVN
    Sbjct: 121 ELDMFPDYGYGWFHTSLILEGKNYLQWLTERLTERGVKFFQRKVESFEEVAREGADVIVN
This USGENE hit is in the same family as the DGENE record on the previous slide (FAMILY 7).

25 ...and then continues by displaying the family groups in the specified formats (cont.)
    => DIS L BIB,SQL,SCORE,IDENT,ALIGN
    L7 ANSWER 34 OF 36 USGENE COPYRIGHT 2010 SEQUENCEBASE CORP on STN
    AN   Protein USGENE
    TI   Collections of matched biological reagents and methods for identifying matched reagents (PublishedApplication)
    IN   Carrino John (San Diego, CA); Liang Feng (San Diego, CA)
    PA   Invitrogen Corporation (Carlsbad CA)
    PI   US A
    AI   US
    DT   Patent
    SQL  347
    SCORE % of query self score 731
    IDENT 100%
    BLASTALIGN
    Query = 347 letters
    Length = 347
    Score = 731 bits (1886), Expect = 0.0
    Identities = 347/347 (100%), Positives = 347/347 (100%)
    Query: 1   MRVVVIGAGVIGLSTALCIHERYHSVLQPLDIKVYADRFTPLTTTDVAAGLWQPYLSDPN
               MRVVVIGAGVIGLSTALCIHERYHSVLQPLDIKVYADRFTPLTTTDVAAGLWQPYLSDPN
    Sbjct: 1   MRVVVIGAGVIGLSTALCIHERYHSVLQPLDIKVYADRFTPLTTTDVAAGLWQPYLSDPN
    Query: 61  NPQEADWSQQTFDYLLSHVHSPNAENLGLFLISGYNLFHEAIPDPSWKDTVLGFRKLTPR
               NPQEADWSQQTFDYLLSHVHSPNAENLGLFLISGYNLFHEAIPDPSWKDTVLGFRKLTPR
    Sbjct: 61  NPQEADWSQQTFDYLLSHVHSPNAENLGLFLISGYNLFHEAIPDPSWKDTVLGFRKLTPR
This USGENE record is the first of the 3 individual records in the FSORT answer set (L7).

27 Typical steps of CAS REGISTRY BLAST
1. Launch BLAST
2. Search the sequence
3. Examine and evaluate alignment/relevance of sequence answers
4. Display STN data on sequences: REGISTRY
5. Display STN data on sequences: CAplus℠
   - Limit CAplus results, if necessary
   - Display CAplus data (references and HITRN)
6. Post-process BLAST alignment data

28 Launch CAS REGISTRY BLAST
The Result Set Manager is the starting point:
- To begin a new sequence search
- To review results of previous sequence searches

29 Input the search query
Sequences can be input by:
- Copy/paste
- Reading from a file
- Recalling a previously searched sequence within the same session
Sequence line numbers do not interfere with the search.

30 Select the BLAST program
The following programs are most typically run:
- BLASTn for nucleotides
- BLASTp for proteins/peptides

31 Verify BLAST settings
Default values have been set to optimize sequence searches for researchers.
Recommended settings for patent searches:
- Low Complexity Filtering unchecked
- Max No. of Answers

32 View results
Highlight the result set to be viewed, and click View Results.

33 Evaluate the alignment report
A minus sign indicates that the alignment details are shown.
Details such as the sequence length, score, and percent identity are available.

34 Select sequences of interest
Sequences can be selected:
- In groups, using the color bar in the Alignment Scores
- Individually, by selecting the check box
To transfer the sequence data to STN, click the Get STN Data button.

35 Get STN Data and Save alignments (.xss)
The alignment data is saved in STN Express Saved Sequences (.xss) format.
Alignment data needs to be transferred for post-processing.

36 Transfer sequences to STN
Log on to STN; a REGISTRY search of the sequences then runs automatically.
Results can be displayed using either the Discover! wizards or command line input. Display sequences if desired.
Note: Type END or click Cancel to get out of the Display Wizard. You can turn off the Display Wizard in Preferences.

38 Display additional CAplus answers including the HITRN for alignment post-processing
    => FILE HCAPLUS
    FILE 'HCAPLUS' ENTERED AT 17:25:10 ON 07 JUL 2010
    COPYRIGHT (C) 2010 AMERICAN CHEMICAL SOCIETY (ACS)
    => S L12 AND PATENT/DT
    L13 12 L12 AND PATENT/DT
    => TRANSFER L6 PN 1-
    L14 TRANSFER L6 1- PN : 20 TERMS
    L15 29 L14
    ALL TERMS IN L14 RETRIEVED.
    => S L13 NOT L15
    L16 2 L13 NOT L15
    => D BIB HITRN 1-2
The 44 REGISTRY records (L12) correspond to 12 HCAplus patent records (L13).
Transfer Publication Numbers (PN) from DGENE/USGENE/PCTGEN (L6) to find the corresponding HCAplus records (L15).
In this example, 2 additional, highly relevant references have been found by including the REGISTRY/HCAplus search (L16).
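The `S L13 NOT L15` step is a set difference: it keeps the HCAplus patent records reached through REGISTRY that were not already reached through the publication numbers transferred from DGENE, USGENE and PCTGEN. The Python sketch below mirrors the counts of this example (12, 29 and 2) with made-up record labels; the function name and all identifiers are hypothetical.

```python
from typing import Iterable, List


def unique_to_registry(registry_route: Iterable[str], transfer_route: Iterable[str]) -> List[str]:
    """Records found via REGISTRY BLAST but not via the transferred publication numbers."""
    return sorted(set(registry_route) - set(transfer_route))


# L13: 12 HCAplus patent records found via the REGISTRY BLAST hits (made-up labels).
l13 = [f"CA{n:02d}" for n in range(1, 13)]

# L15: 29 HCAplus records matching the transferred publication numbers.
# Ten of them overlap with L13; the other nineteen are unrelated to it.
l15 = [f"CA{n:02d}" for n in range(3, 13)] + [f"OTHER{n:02d}" for n in range(1, 20)]

l16 = unique_to_registry(l13, l15)
print(len(l13), len(l15), len(l16))
print(l16)
```

Only the records left in the difference need to be displayed with BIB and HITRN, since the others are already covered by the DGENE, USGENE and PCTGEN results.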

39 Example: Unique REGISTRY/CAplus result
    L16 ANSWER 1 OF 2 HCAPLUS COPYRIGHT 2010 ACS on STN
    AN   2002: HCAPLUS
    DN   137:1836
    TI   Measurement of DNA methylation for analysis of the toxicology....
    IN   Olek, Alexander; Piepenbrock, Christian; Berlin, Kurt
    PA   Epigenomics Ag, Germany
    SO   PCT Int. Appl., 113 pp. CODEN: PIXXD2
    LA   German
    FAN.CNT 1
         PATENT NO. KIND DATE APPLICATION NO. DATE
    PI   WO A WO 2001-EP
    PRAI DE A WO 2001-EP12951 W
    IT   , Protein (human 347-amino acid)
         RL: BSU (Biological study, unclassified); PRP (Properties); BIOL (Biological study)
         (amino acid sequence; measurement of DNA methylation for anal. of the toxicol. of substances)
Note: HITRN must be included so that the CAS REGISTRY BLAST alignments can be merged into the BLAST Report.

41 Access the Table Tool and select the multifile search Transcript file
The most recent STN session Transcript is usually listed here.

42 Choose a template and select content
Option: choose a predefined custom template from a previous project.
L7 is the DGENE, USGENE and PCTGEN FSORTed answer set.

43 Select fields, column order, headings, fonts and spacing for the table
The predefined custom template included a list of fields. These can be further customized and the template re-saved.

44 Review, adjust, and export the table

45 Explore the results further in Microsoft Excel
Some tips for Microsoft Excel:
- Resize columns and rows as desired, especially the BLAST alignment column, to approx. 77
- View, Freeze Panes holds the top row fixed when scrolling down
- Add Filters provides a convenient way to navigate the results, for example by BLAST percent identity

47 Post-process REGISTRY BLAST alignments
The post-processing template (.PRF) files used in this seminar are available for download.

48 Select BLAST alignment report
The first step is to select the XSS file to include in the BLAST report.
Important: If your BLAST query is fairly long, or is a nucleic acid, or the answers may exceed 1000 characters, make sure you change the value in the "Do not include alignments longer than" box.
Post-processing then continues via the standard STN Express Custom Report Tool steps.

49 Select the session Transcript and template
The most recent STN session Transcript is usually listed here.
Option: choose a predefined custom template from a previous project.

50 Select the records to be processed
L16 is the set of additional unique REGISTRY/CAplus answers.

51 Select fields, fonts and spacing for the report
The predefined custom template included a list of fields. These can be further customized and the template re-saved.

52 Review, adjust, and export the report

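53 Overview of search results for Homo sapiens D-amino-acid oxidase
[Table; cell values not preserved in this transcription. Columns: SEQs ≥80%, PNs, Patent Families*. Rows: DGENE (1), USGENE (2), PCTGEN (1), REGISTRY (2), NCBI (0), Total Unique. The figures in parentheses, shown in red on the slide, are the unique results per source.]
(* Patent families = INPADOC Patent Families. Specifically, family records in INPAFAMDB.)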

54 Summary
- RUN BLAST is available for searching DGENE, USGENE and PCTGEN directly on STN
- CAS REGISTRY BLAST provides BLAST searching options for the REGISTRY database
- DGENE, USGENE and PCTGEN multifile search results can be post-processed into tables, and exported to Microsoft Excel, using STN Express
- CAS REGISTRY BLAST alignment data can be merged with CAplus records, and exported to RTF format, to form a single unified report
- All four STN sequence databases are required for a comprehensive patent sequence search

55 Resources for sequence searching on STN
- Sequence Searching on STN modular workshop
- CAS REGISTRY sequence searching resources
- DGENE Workshop Manual
- USGENE Workshop Manual
- USGENE Workshop Manual Multifile Supplement

56 For more information
- CAS Support and Training
- FIZ Karlsruhe Support and Training


More information

Integrating Word, Excel, Access, and PowerPoint

Integrating Word, Excel, Access, and PowerPoint Integrating Word, Excel, Access, and PowerPoint Microsoft Office 2013 Session 1: Integrating Word and Excel Objectives: Embed an Excel chart in a Word document Edit an Excel chart in a Word document Link

More information

INTRODUCTION TO BIOINFORMATICS

INTRODUCTION TO BIOINFORMATICS Molecular Biology-2017 1 INTRODUCTION TO BIOINFORMATICS In this section, we want to provide a simple introduction to using the web site of the National Center for Biotechnology Information NCBI) to obtain

More information

ReaxysTutorial. Dr. QF Carlos F. Lagos

ReaxysTutorial. Dr. QF Carlos F. Lagos ReaxysTutorial Dr. QF Carlos F. Lagos Agenda 1) Reaxys Basics Main Settings Query Menu: Reaction, Substances and Properties, Authors and citations Generate a structure t from a name Commercial Availability

More information

Basic Local Alignment Search Tool (BLAST)

Basic Local Alignment Search Tool (BLAST) BLAST 26.04.2018 Basic Local Alignment Search Tool (BLAST) BLAST (Altshul-1990) is an heuristic Pairwise Alignment composed by six-steps that search for local similarities. The most used access point to

More information

DWPIM (Derwent Markush Resource)

DWPIM (Derwent Markush Resource) (Derwent Markush Resource) Subject Coverage File Type Access Organic and organometallic compounds Inorganic compounds, polymers, peptides and partially defined structures Markush Structures The file is

More information

SciFinder Training Materials

SciFinder Training Materials SciFinder Training Materials # Contents Page How to Create a Substance Answer Set - Search by chemical structure, molecular formula, and substance identifier How to Work with a Substance Answer Set - Analyze

More information

INTRODUCTION TO BIOINFORMATICS

INTRODUCTION TO BIOINFORMATICS Molecular Biology-2019 1 INTRODUCTION TO BIOINFORMATICS In this section, we want to provide a simple introduction to using the web site of the National Center for Biotechnology Information NCBI) to obtain

More information

Derwent Innovations Index

Derwent Innovations Index Derwent Innovations Index DERWENT INNOVATIONS INDEX Quick reference card ISI Web of Knowledge SM Derwent Innovations Index is a powerful patent research tool, combining Derwent World Patents Index, Patents

More information

Genome Browsers - The UCSC Genome Browser

Genome Browsers - The UCSC Genome Browser Genome Browsers - The UCSC Genome Browser Background The UCSC Genome Browser is a well-curated site that provides users with a view of gene or sequence information in genomic context for a specific species,

More information

STN Express Frequently Asked Technical Questions

STN Express Frequently Asked Technical Questions STN Express Frequently Asked Technical Questions This document answers the most frequently asked technical questions related to STN Express. Note: Support for Macintosh versions of STN Express was discontinued

More information

Derwent Innovations Index

Derwent Innovations Index ISI WEB OF KNOWLEDGE SM Derwent Innovations Index Quick Reference Card Derwent Innovations Index is a powerful patent research tool, combining Derwent World Patents Index, Patents Citation Index TM, and

More information

STN Express 8.6 User Guide

STN Express 8.6 User Guide STN Express 8.6 User Guide January 2016 Copyright 2016 American Chemical Society. All Rights Reserved. Table of Contents Welcome to STN Express, Version 8.6...13 STN Express System Requirements...14 Logon

More information

Wilson Leung 01/03/2018 An Introduction to NCBI BLAST. Prerequisites: Detecting and Interpreting Genetic Homology: Lecture Notes on Alignment

Wilson Leung 01/03/2018 An Introduction to NCBI BLAST. Prerequisites: Detecting and Interpreting Genetic Homology: Lecture Notes on Alignment An Introduction to NCBI BLAST Prerequisites: Detecting and Interpreting Genetic Homology: Lecture Notes on Alignment Resources: The BLAST web server is available at https://blast.ncbi.nlm.nih.gov/blast.cgi

More information

Getting Started with SciFinder Scholar TM (2004 Edition)

Getting Started with SciFinder Scholar TM (2004 Edition) Getting Started with SciFinder Scholar TM (2004 Edition) for Windows and Macintosh August 2003 Copyright 2003 American Chemical Society All Rights Reserved Getting Started 3 Getting Started with SciFinder

More information

Current awareness searching in STN patent databases. eseminar, 10 th October 2012

Current awareness searching in STN patent databases. eseminar, 10 th October 2012 Current awareness searching in STN patent databases eseminar, 10 th October 2012 Agenda Basics of Alerts (SDIs) Automatic SDIs Multifile SDI and SDI package Specific features for major STN patent databases

More information

FIZ AutoDoc The STN Document Delivery System

FIZ AutoDoc The STN Document Delivery System FIZ AutoDoc The STN Document Delivery System Agenda Who we are and what we do STN Full Text Solutions FIZ AutoDoc Document Delivery Access Interfaces Intelligent Ordering Process Order Tracking and History

More information

Viewing Molecular Structures

Viewing Molecular Structures Viewing Molecular Structures Proteins fulfill a wide range of biological functions which depend upon their three dimensional structures. Therefore, deciphering the structure of proteins has been the quest

More information

Bioinformatics Hubs on the Web

Bioinformatics Hubs on the Web Bioinformatics Hubs on the Web Take a class The Galter Library teaches a related class called Bioinformatics Hubs on the Web. See our Classes schedule for the next available offering. If this class is

More information

STN AnaVist TM See How It Works

STN AnaVist TM See How It Works STN AnaVist TM See How It Works 2015 Table of Contents Save for STN AnaVist... 3 Searching for STN Databases... 3 Use the Save for STN AnaVist Wizard... 4 Additional Information... 5 Importing an Answer

More information

Homology Modeling FABP

Homology Modeling FABP Homology Modeling FABP Homology modeling is a technique used to approximate the 3D structure of a protein when no experimentally determined structure exists. It operates under the principle that protein

More information

New generation of patent sequence databases Information Sources in Biotechnology Japan

New generation of patent sequence databases Information Sources in Biotechnology Japan New generation of patent sequence databases Information Sources in Biotechnology Japan EBI is an Outstation of the European Molecular Biology Laboratory. Patent-related resources Patents Patent Resources

More information

Sequence alignment theory and applications Session 3: BLAST algorithm

Sequence alignment theory and applications Session 3: BLAST algorithm Sequence alignment theory and applications Session 3: BLAST algorithm Introduction to Bioinformatics online course : IBT Sonal Henson Learning Objectives Understand the principles of the BLAST algorithm

More information

NCBI News, November 2009

NCBI News, November 2009 Peter Cooper, Ph.D. NCBI cooper@ncbi.nlm.nh.gov Dawn Lipshultz, M.S. NCBI lipshult@ncbi.nlm.nih.gov Featured Resource: New Discovery-oriented PubMed and NCBI Homepage The NCBI Site Guide A new and improved

More information

BIBLIODATA. Subject Coverage. File Type. Features Thesaurus None. Record Content. File Size. Coverage Updates. Language.

BIBLIODATA. Subject Coverage. File Type. Features Thesaurus None. Record Content. File Size. Coverage Updates. Language. Subject Coverage File Type The database is multidisciplinary. Parts of the German National Bibliography, which are included in : A: Monographs and periodicals from the publishers' book trade B: Monographs

More information

automate parts of Save time Stay up to date Share with colleagues

automate parts of Save time Stay up to date Share with colleagues Using Scripts in Streamline or automate parts of your search strategy Save time Stay up to date Share with colleagues What are Scripts? STN scripting is a mini programming language used to create repeatable

More information

SciFinder Training Materials July 2017

SciFinder Training Materials July 2017 SciFinder Training Materials July 07 Table of contents: Contents Slide No. SciFinder Overview How to Create a Reference Answer Set Search by Research Topic 6 How to Work with a Reference Answer Set 8 Search

More information

Bioinformatics explained: BLAST. March 8, 2007

Bioinformatics explained: BLAST. March 8, 2007 Bioinformatics Explained Bioinformatics explained: BLAST March 8, 2007 CLC bio Gustav Wieds Vej 10 8000 Aarhus C Denmark Telephone: +45 70 22 55 09 Fax: +45 70 22 55 19 www.clcbio.com info@clcbio.com Bioinformatics

More information

INFODATA. Subject Coverage. File Type. Features Thesaurus None. Record Content. File Size citation (07/2017) Coverage. Updates.

INFODATA. Subject Coverage. File Type. Features Thesaurus None. Record Content. File Size citation (07/2017) Coverage. Updates. Subject Coverage File Type Artificial intelligence, expert systems, computational linguistics Databases and information systems Documentation of literature, facts and data Filing systems and library automation

More information

Dynamic Programming User Manual v1.0 Anton E. Weisstein, Truman State University Aug. 19, 2014

Dynamic Programming User Manual v1.0 Anton E. Weisstein, Truman State University Aug. 19, 2014 Dynamic Programming User Manual v1.0 Anton E. Weisstein, Truman State University Aug. 19, 2014 Dynamic programming is a group of mathematical methods used to sequentially split a complicated problem into

More information

Database Searching Using BLAST

Database Searching Using BLAST Mahidol University Objectives SCMI512 Molecular Sequence Analysis Database Searching Using BLAST Lecture 2B After class, students should be able to: explain the FASTA algorithm for database searching explain

More information

SAS Web Report Studio 3.1

SAS Web Report Studio 3.1 SAS Web Report Studio 3.1 User s Guide SAS Documentation The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2006. SAS Web Report Studio 3.1: User s Guide. Cary, NC: SAS

More information

Access to value-added Patent Information via STN International

Access to value-added Patent Information via STN International Access to value-added Patent Information via STN International Gerd Tittlbach FIZ Karlsruhe SPTO Seminar, Madrid, 6-7 May 2002 1 STN International a co-operatively operated Online Service in Science &

More information

MetaPhyler Usage Manual

MetaPhyler Usage Manual MetaPhyler Usage Manual Bo Liu boliu@umiacs.umd.edu March 13, 2012 Contents 1 What is MetaPhyler 1 2 Installation 1 3 Quick Start 2 3.1 Taxonomic profiling for metagenomic sequences.............. 2 3.2

More information

The Merck Index on MedicinesComplete. User Guide

The Merck Index on MedicinesComplete. User Guide The Merck Index on MedicinesComplete User Guide The Merck Index on MedicinesComplete User Guide 1 About The Merck Index... 3 2 The interface... 3 2.1 The top bar... 3 2.2 The document area... 3 3 Finding

More information

Blast2GO Teaching Exercises SOLUTIONS

Blast2GO Teaching Exercises SOLUTIONS Blast2GO Teaching Exerces SOLUTIONS Ana Conesa and Stefan Götz 2012 BioBam Bioinformatics S.L. Valencia, Spain Contents 1 Annotate 10 sequences with Blast2GO 2 2 Perform a complete annotation with Blast2GO

More information

Data Walkthrough: Background

Data Walkthrough: Background Data Walkthrough: Background File Types FASTA Files FASTA files are text-based representations of genetic information. They can contain nucleotide or amino acid sequences. For this activity, students will

More information

Multiple Sequence Alignment

Multiple Sequence Alignment Introduction to Bioinformatics online course: IBT Multiple Sequence Alignment Lec3: Navigation in Cursor mode By Ahmed Mansour Alzohairy Professor (Full) at Department of Genetics, Zagazig University,

More information

PowerSchool Handbook Federal Survey Form Report

PowerSchool Handbook Federal Survey Form Report Handbook Federal Survey Form Report Version 2.1 August 22, 2018 Copyright 2018, San Diego Unified School District. All rights reserved. This document may be reproduced internally by San Diego Unified School

More information

What's new on STN. Erfahrungsaustausch Patente Basim Rahman

What's new on STN. Erfahrungsaustausch Patente Basim Rahman What's new on STN Erfahrungsaustausch Patente 2009 Basim Rahman Session Agenda STN-K database enhancements Software enhancements - STN Express 8.4 2 STN-K database enhancements DWPI: Japanese classifications:

More information

SQL Studio (BC) HELP.BCDBADASQL_72. Release 4.6C

SQL Studio (BC) HELP.BCDBADASQL_72. Release 4.6C HELP.BCDBADASQL_72 Release 4.6C SAP AG Copyright Copyright 2001 SAP AG. All rights reserved. No part of this publication may be reproduced or transmitted in any form or for any purpose without the express

More information

Wilson Leung 05/27/2008 A Simple Introduction to NCBI BLAST

Wilson Leung 05/27/2008 A Simple Introduction to NCBI BLAST A Simple Introduction to NCBI BLAST Prerequisites: Detecting and Interpreting Genetic Homology: Lecture Notes on Alignment Resources: The BLAST web server is available at http://www.ncbi.nih.gov/blast/

More information

Recommendation for the Disclosure of Sequence Listings using XML (ST.26) Sue Wolski Office of PCT Legal Administration

Recommendation for the Disclosure of Sequence Listings using XML (ST.26) Sue Wolski Office of PCT Legal Administration Recommendation for the Disclosure of Sequence Listings using XML (ST.26) Sue Wolski Office of PCT Legal Administration 1 Overview Background on revision of ST.25 Transition from ST.25 to ST.26 Request

More information

Apply your Skills - E-seminar for the experienced searcher not familiar with STN. FIZ Karlsruhe

Apply your Skills - E-seminar for the experienced searcher not familiar with STN. FIZ Karlsruhe Apply your Skills - E-seminar for the experienced searcher not familiar with STN FIZ Karlsruhe STN Apply Your Skills You know how to search on other online services and you want to learn how to do similar

More information

CAS / SciFinder Web Basic Training (Eng.)

CAS / SciFinder Web Basic Training (Eng.) A division of the American Chemical Society www.cas.org CAS / SciFinder Web Basic Training (Eng.) 2009.10 Agenda Briefly introduce Explore Reference Explore Substance Explore Reaction 2 SciFinder Web https://scifinder.cas.org

More information

Lab 4: Multiple Sequence Alignment (MSA)

Lab 4: Multiple Sequence Alignment (MSA) Lab 4: Multiple Sequence Alignment (MSA) The objective of this lab is to become familiar with the features of several multiple alignment and visualization tools, including the data input and output, basic

More information

Reporter Tutorial: Intermediate

Reporter Tutorial: Intermediate Reporter Tutorial: Intermediate Refer to the following sections for guidance on using these features of the Reporter: Lesson 1 Data Relationships in Reports Lesson 2 Create Tutorial Training Report Lesson

More information

PowerSchool Handbook Federal Survey Card Report

PowerSchool Handbook Federal Survey Card Report Handbook Federal Survey Card Report Version 1.0 August 9, 2017 Copyright 2017, San Diego Unified School District. All rights reserved. This document may be reproduced internally by San Diego Unified School

More information

Beilstein (Elsevier/MDL CrossFire Commander 7.1)

Beilstein (Elsevier/MDL CrossFire Commander 7.1) Introduction Beilstein (Elsevier/MDL CrossFire Commander 7.1) Beilstein vs. SciFinder Scholar As two of the most important databases in chemistry, Beilstein and SciFinder Scholar serve different needs.

More information

VUEWorks Report Generation Training Packet

VUEWorks Report Generation Training Packet VUEWorks Report Generation Training Packet Thursday, June 21, 2018 Copyright 2017 VUEWorks, LLC. All rights reserved. Page 1 of 53 Table of Contents VUEWorks Reporting Course Description... 3 Generating

More information

How to Save, Print and Export Answers

How to Save, Print and Export Answers How to Save, Print and Export Answers Keep your SciFinder answers for future use Keep answer sets for future use with print, save and export capabilities. To generate a hardcopy of part or all of your

More information

Biostatistics and Bioinformatics Molecular Sequence Databases

Biostatistics and Bioinformatics Molecular Sequence Databases . 1 Description of Module Subject Name Paper Name Module Name/Title 13 03 Dr. Vijaya Khader Dr. MC Varadaraj 2 1. Objectives: In the present module, the students will learn about 1. Encoding linear sequences

More information

Blackbaud StudentInformationSystem. Mail Guide

Blackbaud StudentInformationSystem. Mail Guide Blackbaud StudentInformationSystem Mail Guide 102411 2011 Blackbaud, Inc. This publication, or any part thereof, may not be reproduced or transmitted in any form or by any means, electronic, or mechanical,

More information

The Menu and Toolbar in Excel (see below) look much like the Word tools and most of the tools behave as you would expect.

The Menu and Toolbar in Excel (see below) look much like the Word tools and most of the tools behave as you would expect. Launch the Microsoft Excel Program Click on the program icon in Launcher or the Microsoft Office Shortcut Bar. A worksheet is a grid, made up of columns, which are lettered and rows, and are numbered.

More information

COPYRIGHT 2014 AMERICAN CHEMICAL SOCIETY ALL RIGHTS RESERVED PRINTED IN THE U.S.A.

COPYRIGHT 2014 AMERICAN CHEMICAL SOCIETY ALL RIGHTS RESERVED PRINTED IN THE U.S.A. STNINDEX USER GUIDE COPYRIGHT 2014 AMERICAN CHEMICAL SOCIETY ALL RIGHTS RESERVED PRINTED IN THE USA Quoting or copying of material from this publication for educational purposes is encouraged, provided

More information

2. Take a few minutes to look around the site. The goal is to familiarize yourself with a few key components of the NCBI.

2. Take a few minutes to look around the site. The goal is to familiarize yourself with a few key components of the NCBI. 2 Navigating the NCBI Instructions Aim: To become familiar with the resources available at the National Center for Bioinformatics (NCBI) and the search engine Entrez. Instructions: Write the answers to

More information

Quick Reference Guide

Quick Reference Guide Quick Reference Guide Table of Contents Homepage My Settings Generate a Structure from a Name Reactions Query tab Query tab Add further Search Conditions Results General Overview 7 Results Reactions tab

More information

Multi-file Polymer Searching

Multi-file Polymer Searching Knowledge Services Multi-file Polymer Searching ICIC 2011, Barcelona Ankit Biyani Agenda Introduction to Dolcera Polymer searching Databases Search Strategy REGISTRY CAPLUS IFICDB WPIX Final Results Summary

More information

24 Grundlagen der Bioinformatik, SS 10, D. Huson, April 26, This lecture is based on the following papers, which are all recommended reading:

24 Grundlagen der Bioinformatik, SS 10, D. Huson, April 26, This lecture is based on the following papers, which are all recommended reading: 24 Grundlagen der Bioinformatik, SS 10, D. Huson, April 26, 2010 3 BLAST and FASTA This lecture is based on the following papers, which are all recommended reading: D.J. Lipman and W.R. Pearson, Rapid

More information

Lecture 5 Advanced BLAST

Lecture 5 Advanced BLAST Introduction to Bioinformatics for Medical Research Gideon Greenspan gdg@cs.technion.ac.il Lecture 5 Advanced BLAST BLAST Recap Sequence Alignment Complexity and indexing BLASTN and BLASTP Basic parameters

More information

The beginning of this guide offers a brief introduction to the Protein Data Bank, where users can download structure files.

The beginning of this guide offers a brief introduction to the Protein Data Bank, where users can download structure files. Structure Viewers Take a Class This guide supports the Galter Library class called Structure Viewers. See our Classes schedule for the next available offering. If this class is not on our upcoming schedule,

More information

GROUP CANVAS USER SIDE FUNCTIONS

GROUP CANVAS USER SIDE FUNCTIONS Group Canvas V5.0 17 GROUP CANVAS USER SIDE FUNCTIONS INTRODUCTION Once the template is available on the user side there are a number of functions that the users have access to. This section of the manual

More information

Cited References in CAplus SM. and CA SM

Cited References in CAplus SM. and CA SM AmericaSTNotes Chemical Abstracts Service provides access to STN in North FEBRUARY 2009 No 24 REVISED In response to customer requests for more detailed information on new and enhanced system features,

More information

How to Create a Reference Answer Set

How to Create a Reference Answer Set How to Create a Reference Answer Set Find references quickly and easily In SciFinder, you are searching the world s largest, publicly available reference database for chemistry and related sciences as

More information

Compares a sequence of protein to another sequence or database of a protein, or a sequence of DNA to another sequence or library of DNA.

Compares a sequence of protein to another sequence or database of a protein, or a sequence of DNA to another sequence or library of DNA. Compares a sequence of protein to another sequence or database of a protein, or a sequence of DNA to another sequence or library of DNA. Fasta is used to compare a protein or DNA sequence to all of the

More information

EBI services. Jennifer McDowall EMBL-EBI

EBI services. Jennifer McDowall EMBL-EBI EBI services Jennifer McDowall EMBL-EBI The SLING project is funded by the European Commission within Research Infrastructures of the FP7 Capacities Specific Programme, grant agreement number 226073 (Integrating

More information

COPYRIGHTED MATERIAL. SciFinder : Setting the Scene. 1.1 I Just Want to Do a Quick and Simple Search on...

COPYRIGHTED MATERIAL. SciFinder : Setting the Scene. 1.1 I Just Want to Do a Quick and Simple Search on... 1 SciFinder : Setting the Scene 1.1 I Just Want to Do a Quick and Simple Search on......is sometimes heard in scientific laboratories. It can be achieved, provided the scientist has the background knowledge,

More information

Similarity searches in biological sequence databases

Similarity searches in biological sequence databases Similarity searches in biological sequence databases Volker Flegel september 2004 Page 1 Outline Keyword search in databases General concept Examples SRS Entrez Expasy Similarity searches in databases

More information

Creating and Using Genome Assemblies Tutorial

Creating and Using Genome Assemblies Tutorial Creating and Using Genome Assemblies Tutorial Release 8.1 Golden Helix, Inc. March 18, 2014 Contents 1. Create a Genome Assembly for Danio rerio 2 2. Building Annotation Sources 5 A. Creating a Reference

More information

User Guide. Version Exago Inc. All rights reserved.

User Guide. Version Exago Inc. All rights reserved. User Guide Version 2016.2 2016 Exago Inc. All rights reserved. Exago Reporting is a registered trademark of Exago, Inc. Windows is a registered trademark of Microsoft Corporation in the United States and

More information

INPADOCDB/INPAFAMDB News

INPADOCDB/INPAFAMDB News August 2008 INPADOCDB/INPAFAMDB News All patent publication events including grants are now covered in the INPADOC Legal Status for all authorities In the INPADOCDB and INPAFAMDB files, all patent publication

More information

SDC PLATINUM QUICK START GUIDE

SDC PLATINUM QUICK START GUIDE SDC PLATINUM QUICK START GUIDE Date of issue: 21 May 2012 Contents Contents About this Document... 1 Intended Readership... 1 In this Document... 1 Feedback... 1 Chapter 1 Overview... 2 Get Started...

More information

Computational Molecular Biology

Computational Molecular Biology Computational Molecular Biology Erwin M. Bakker Lecture 3, mainly from material by R. Shamir [2] and H.J. Hoogeboom [4]. 1 Pairwise Sequence Alignment Biological Motivation Algorithmic Aspect Recursive

More information