Semantic Web Analysis Service

CBRC, AIST
Semantic Web Analysis Service User Manual
CBRC 2013/01/15

1. Sync Type Analysis Services
   How to use Sync Type Analysis Services
   Blast: Prepare input RDF, Command for execution, Results
   CentroidFold: Prepare input RDF, Command for execution, Results
   ClustalW: Prepare input RDF, Command for execution, Results
   IPknot: Prepare input RDF, Command for execution, Results
   Mafft: Prepare input RDF, Command for execution, Results
   Psipred: Prepare input RDF, Command for execution, Results
   Raccess: Prepare input RDF, Command for execution, Results
   RactIP: Prepare input RDF, Command for execution, Results
   Wolfpsort: Prepare input RDF, Command for execution, Results
2. AsyncType Analysis Services
   How to use Async Type Analysis Services
   Last: Prepare input RDF, Command for execution, Results
   Modelling: Prepare input RDF, Command for execution, Results
   PoodleL: Prepare input RDF, Command for execution, Results
   PoodleS: Prepare input RDF, Command for execution, Results
   Rassie: Prepare input RDF, Command for execution, Results
Contact

Analysis Service Name    Type
CentroidFold             S
IPknot                   S
Mafft                    S
Psipred                  S
Raccess                  S
RactIP                   S
Wolfpsort                S
Last                     A
Modelling                A
PoodleL                  A
PoodleS                  A
Rassie                   A

Table 1-A OWLClass that defines input RDF: S (Synchronous), A (Asynchronous)

1. Sync Type Analysis Services

1.0. How to use Sync Type Analysis Services

The available SADI services are listed on the SADI service page (Figure 1-1).

Figure 1-1 SADI service

Among these, the Sync type analysis services (Blast, ClustalW, CentroidFold, IPknot, Mafft, Psipred, Raccess, RactIP, Wolfpsort) can be executed with the following curl command.

% curl RDF (refer to Table 1-A) -o outputrdf

For example, output.rdf can be obtained as the output RDF by executing the following command:

% curl -o output.rdf

The input RDF format is as follows:

<cbrc:wolfpsortinput rdf:about="
<cbrc:requireskingdominformationof>animal</cbrc:requireskingdominformationof>
<cbrc:requiresqueryproteinsequence>
MANLGCWMLVLFVATWSDLGLCKKRPKPGGWNTGGSRYPGQGSPGGNRYPPQGGGGWGQPHGGGWGQPHGGGWGQPHGGGWG
QPHGGGWGQGGGTHSQWNKPSKPKTNMKHMAGAAAAGAVVGGLGGYMLGSAMSRPIIHFGSDYEDRYYRENMHRYPNQVYYRPMD
EYSN
</cbrc:requiresqueryproteinsequence>
</cbrc:wolfpsortinput>
</rdf:rdf>

1-1 Input RDF for Wolfpsort

- Letters in black >> start and end tags for RDF (common to each service)
- Letters in green >> specify the namespace (common to each service)
- Letters in red >> OWL class that defines input RDF (refer to Table 1-A); specify the corresponding URI

  <for Wolfpsort>
  OWL class that defines input RDF: cbrc:wolfpsortinput
  Corresponding URI:
- Letters in blue >> specify the triples required for service execution
  <for Wolfpsort (two triples need to be specified)>
  Predicate                       Object
  requireskingdominformationof    Kingdom (animal, plant or fungi)
  requiresqueryproteinsequence    amino acid sequence

A sample input RDF file can be obtained through the SADI service page (Figure 1-2).

Figure 1-2 SADI service page
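The two Wolfpsort triples described above can be assembled programmatically. The following is a minimal Python sketch, not the service's official client. The `CBRC_NS` value and the service URL in the usage note are placeholders, because the real namespace and service URIs are not reproduced in this manual. The element names are written exactly as they appear in 1-1, and the `application/rdf+xml` content type is an assumption about what the server expects.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Standard RDF namespace.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Placeholder (assumption): substitute the real CBRC OWL namespace URI.
CBRC_NS = "http://example.org/cbrc#"


def build_wolfpsort_input(kingdom, sequence, about="wolfpsort.rdf#1",
                          cbrc_ns=CBRC_NS):
    """Return input RDF (bytes) holding the two triples Wolfpsort requires."""
    if kingdom not in ("animal", "plant", "fungi"):
        raise ValueError("kingdom must be animal, plant or fungi")
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("cbrc", cbrc_ns)
    root = ET.Element("{%s}RDF" % RDF_NS)
    # Subject: the OWL class that defines the input RDF.
    node = ET.SubElement(root, "{%s}wolfpsortinput" % cbrc_ns,
                         {"{%s}about" % RDF_NS: about})
    # Predicate/object pairs, spelled as in the sample input RDF.
    ET.SubElement(node, "{%s}requireskingdominformationof" % cbrc_ns).text = kingdom
    ET.SubElement(node, "{%s}requiresqueryproteinsequence" % cbrc_ns).text = sequence
    return ET.tostring(root, encoding="utf-8")


def build_request(service_url, rdf_bytes):
    """Prepare (but do not send) a POST request carrying the input RDF."""
    return urllib.request.Request(
        service_url,
        data=rdf_bytes,
        headers={"Content-Type": "application/rdf+xml"},  # assumed content type
        method="POST",
    )
```

To send the request, pass the object returned by `build_request` to `urllib.request.urlopen` with the actual service URI from Table 1-A, then write the response body to a file, which corresponds to the `-o` option of the curl command.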

The output RDF format is as follows:

<cbrc:wolfpsortoutput rdf:about="
<cbrc:requiresresultintextformat># k used for knn is: 32
queryprotein extr 19, E.R. 4, pero 4, lyso 3, E.R._mito 3, mito_pero 3
</cbrc:requiresresultintextformat>
</cbrc:wolfpsortoutput>
</rdf:rdf>

1-2 Output RDF for Wolfpsort

- Letters in blue >> the execution results of each service are stored in triple format
  <for Wolfpsort>
  Predicate                     Object
  requiresresultintextformat    results in text

* curl (e.g. for Windows) can be downloaded from the site below
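Once the output RDF has been saved, the text result can be pulled out of it with a few lines of code. The sketch below is one possible approach, not part of the service. It matches the predicate by local name and ignores case, which is an assumption made so that it works regardless of the CBRC namespace URI and of how the predicate is capitalized in the actual output.

```python
import xml.etree.ElementTree as ET


def extract_text_result(rdf_xml):
    """Return the text stored under requiresresultintextformat, or None."""
    root = ET.fromstring(rdf_xml)
    for elem in root.iter():
        # Drop the "{namespace}" prefix that ElementTree adds to tag names.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name.lower() == "requiresresultintextformat":
            return (elem.text or "").strip()
    return None
```

The function takes the whole RDF document as a string or bytes, so a saved file such as output.rdf can be read and passed in directly.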

1.1. Blast

Prepare input RDF

Create input RDF for Blast as follows:
- Define the RDF and CBRC OWL vocabularies in the RDF header
  xmlns:rdf xmlns:cbrc >
- The subject is the BlastInput class; the rdf:about attribute is blast.rdf#1
- Define a triple for a query sequence.
  Subject: BlastInput
  Predicate: requiresquerysequence
  Object: query sequence
  <cbrc:requiresquerysequence>query sequence</cbrc:requiresquerysequence>
- Define a triple for the Blast program
  Subject: BlastInput
  Predicate: requiresblastprogramname
  Object: blastp, blastn, blastx, tblastn or tblastx
  <cbrc:requiresblastprogramname>blastp</cbrc:requiresblastprogramname>
- Define a triple for the target database
  Subject: BlastInput
  Predicate: requiresblastdatabase
  Object: SWISS, TREMBL, UNIPROT, PROTEIN, PDB etc.
  <cbrc:requiresblastdatabase>swiss</cbrc:requiresblastdatabase>

Input RDF for Blast based on those definitions would look like 1-3. Letters in red for a query sequence, letters in blue for target database, letters in green for blast program name.

10 <cbrc:blastinput rdf:about=" <cbrc:requiresquerysequence>>test MANLGCWMLVLFVATWSDLGLCKKRPKPGGWNTGGSRYPGQGSPGGNRYPPQGGGGWGQPHGGGWGQPHGGGWGQPHGGGW GQPHGGGWGQGGGTHSQWNKPSKPKTNMKHMAGAAAAGAVVGGLGGYMLGSAMSRPIIHFGSDYEDRYYRENMHRYPNQVYYRPMDE YSNQNNFVHDCVNITIKQHTVTTTTKGENFTETDVKMMERVVEQMCITQYERESQAYYQRGSSMVLFSSPPVILLISFLIFLIVGMA NLGCWMLVLFVATWSDLGLCKKRPKPGGWNTGGSRYPGQGSPGGNRYPPQGGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQPHGGGW GQGGGTHSQWNKPSKPKTNMKHMAGAAAAGAVVGGLGGYMLGSAMSRPIIHFGSDYEDRYYRENMHRYPNQVYYRPMDEYSNQNNFV HDCVNITIKQHTVTTTTKGENFTETDVKMMERVVEQMCITQYERESQAYYQRGSSMVLFSSPPVILLISFLIFLIVGMANLGCWMLV LFVATWSDLGLCKKRPKPGGWNTGGSRYPGQGSPGGNRYPPQGGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQGGGTHS QWNKPSKPKTNMKHMAGAAAAGAVVGGLGGYMLGSAMSRPIIHFGSDYEDRYYRENMHRYPNQVYYRPMDEYSNQNNFVHDCVNITI KQHTVTTTTKGENFTETDVKMMERVVEQMCITQYERESQAYYQRGSSMVLFSSPPVILLISFLIFLIVGMANLGCWMLVLFVATWSD LGLCKKRPKPGGWNTGGSRYPGQGSPGGNRYPPQGGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQGGGTHSQWNKPSKP KTNMKHMAGAAAAGAVVGGLGGYMLGSAMSRPIIHFGSDYEDRYYRENMHRYPNQVYYRPMDEYSNQNNFVHDCVNITIKQHTVTTT TKGENFTETDVKMMERVVEQMCITQYERESQAYYQRGSSMVLFSSPPVILLISFLIFLIVG</cbrc:requiresQuerySequen ce> <cbrc:requiresblastdatabase>swiss</cbrc:requiresblastdatabase> <cbrc:requiresblastprogramname>blastp</cbrc:requiresblastprogramname> </cbrc:blastinput> </rdf:rdf> 1-3 Input RDF for Blast Command for execution Enter the command below. % curl -o outputrdf 6
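Before building the Blast input RDF it can help to check the three values locally. The sketch below is a hypothetical helper, not something provided by the service. It validates the program name against the five values listed in this manual and returns the predicate/value pairs in the same order as the sample input. The query-type table reflects the standard BLAST convention and is included only as a reminder of which kind of sequence each program expects.

```python
# Kind of query sequence each BLAST program expects (standard BLAST convention).
BLAST_QUERY_TYPE = {
    "blastp": "protein",
    "blastn": "nucleotide",
    "blastx": "nucleotide",
    "tblastn": "protein",
    "tblastx": "nucleotide",
}


def blast_triples(query_sequence, program, database):
    """Validate Blast parameters and return (predicate, object) pairs."""
    if program not in BLAST_QUERY_TYPE:
        raise ValueError("unknown blast program: %r" % program)
    if not query_sequence.strip():
        raise ValueError("query sequence is empty")
    if not database.strip():
        raise ValueError("target database is empty")
    # Predicate names are spelled as in the sample input RDF.
    return [
        ("requiresquerysequence", query_sequence),
        ("requiresblastdatabase", database),
        ("requiresblastprogramname", program),
    ]
```

Each returned pair maps to one child element of the BlastInput subject, so the list can be fed to whatever RDF writer is in use.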

Results

Blast results will be written in RDF format defined by the following:
- Define the RDF and CBRC OWL vocabularies in the RDF header
  xmlns:rdf xmlns:cbrc >
- The subject is the BlastOutput class; the rdf:about attribute is blast.rdf#1
- Define a triple for Blast results.
  Subject: BlastOutput
  Predicate: requiresresultintextformat
  Object: Blast results
  <cbrc:requiresresultintextformat>blast results </cbrc:requiresresultintextformat>

An example of Blast results is shown in 1-4.

12 <cbrc:blastoutput rdf:about=" <cbrc:requiresresultintextformat>blastp [Aug ] Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, Query= test (1012 letters) Database: SWISS: SWISS sequence taken from the header [Last update Dec/02/2011] 533,049 sequences; 189,064,225 total letters Searching...done >sp P04156 PRIO_HUMAN RecName: Full=Major prion protein Short=PrP; AltName: Full=ASCR; AltName: Full=PrP27-30; AltName: Full=PrP33-35C; AltName: CD_antigen=CD230; Flags: Precursor; Length = 253 Score = 301 bits (770), Expect = 2e-80, Method: Compositional matrix adjust. Identities = 154/236 (65%), Positives = 154/236 (65%) Query: 1 MANLGCWMLVLFVATWSDLGLCKKRPKPGGWNTGGSRYPGQGSPGGNRYXXXXXXXXXXX 60 MANLGCWMLVLFVATWSDLGLCKKRPKPGGWNTGGSRYPGQGSPGGNRY Sbjct: 1 MANLGCWMLVLFVATWSDLGLCKKRPKPGGWNTGGSRYPGQGSPGGNRYPPQGGGGWGQP 60 X1: 16 ( 7.4 bits) X2: 38 (14.6 bits) X3: 64 (24.7 bits) S1: 41 (21.8 bits) S2: 75 (33.5 bits)</cbrc:requiresresultintextformat> </cbrc:blastoutput> </rdf:rdf> 1-4 Blastresults 8

1.2. CentroidFold

Prepare input RDF

Create input RDF for CentroidFold as follows:
- Define the RDF and CBRC OWL vocabularies in the RDF header
  xmlns:rdf xmlns:cbrc >
- The subject is the CentroidFoldInput class; the rdf:about attribute is
  <cbrc:centroidfoldinput rdf:about= >
- Define a triple for an RNA sequence.
  Subject: CentroidFoldInput
  Predicate: requiresqueryrnasequence
  Object: RNA sequence
  <cbrc:requiresqueryrnasequence>RNA sequence</cbrc:requiresqueryrnasequence>
- Define a triple for a multiple alignment in ClustalW format
  Subject: CentroidFoldInput
  Predicate: requiresclustalwmultiplealignment
  Object: multiple alignment in ClustalW format
  <cbrc:requiresclustalwmultiplealignment> multiple alignment in ClustalW format </cbrc:requiresclustalwmultiplealignment>
- Define a triple for command line options
  Subject: CentroidFoldInput
  Predicate: hasoptions
  Object: options
  <cbrc:hasOptions>-g 4</cbrc:hasOptions>

Input RDF for CentroidFold based on those definitions would look like 1-5. Letters in red for an RNA sequence, letters in blue for command line options.

<cbrc:centroidfoldinput rdf:about="
<cbrc:requiresqueryrnasequence>>af / RF00381;Antizyme_FSE;
UGAUGCCCCUCACCCAUCAGUGAAGAUCCCGGGUGGGCGAGGGAACGGAA
GGGAUC
>AAVX / RF00381;Antizyme_FSE;
UGAUGUCCCUCACCCACCCUUGAAGAUCCCAGGUGGGCGAGGGAAUGGUC
AAAGGGAUC
>BAAE / RF00381;Antizyme_FSE;
UGAUGCCCCUCACCCACAGCUGAAGAUCCCGGGUGGGCGAGGGACUGUCA
GGGAUC
</cbrc:requiresqueryrnasequence>
<cbrc:hasOptions>-g 4</cbrc:hasOptions>
</cbrc:centroidfoldinput>
</rdf:rdf>

1-5 Input RDF for CentroidFold

Command for execution

Enter the command below.

% curl -o outputrdf

Results

CentroidFold results will be written in RDF format defined by the following:
- Define the RDF and CBRC OWL vocabularies in the RDF header
  xmlns:rdf xmlns:cbrc >
- The subject is the CentroidFoldOutput class; the rdf:about attribute is centroid.rdf#1
  <cbrc:centroidfoldoutput rdf:about= >
- Define a triple for the CentroidFold results in text format.
  Subject: CentroidFoldOutput
  Predicate: requiresresultintextformat
  Object: results
  <cbrc:requiresresultintextformat>centroidfold results </cbrc:requiresresultintextformat>
- Define a triple for the CentroidFold graphic, converted to base64.
  Subject: CentroidFoldOutput
  Predicate: requiresresultinbase64binaryformat
  Object: results
  <cbrc:requiresresultinbase64binaryformat>centroidfold results </cbrc:requiresresultinbase64binaryformat>
  * This triple will not be written if noimage is defined for hasoptions in the input RDF.

An example of CentroidFold results is shown in 1-6. Letters in red for the base64-converted CentroidFold graphic, letters in blue for CentroidFold results.

16 <cbrc:centroidfoldoutput rdf:about=" <cbrc:requiresresultinbase64binaryformat>png:1 ivborw0kggoaaaansuheugaadrcaaa63caiaaaddrqitaacaaeleqvr42uzcw3ltxpfauu7h32dq Cv/0iHoMHhzazSNRFJ8A6pWZtVbgU9dE0b6OSmhHvhwAAAAAAAAAAAAAQC0vfgUAAAAAAAAAAAAA UIxKGAAAAAAAAAAAAACqUQkDAAAAAAAAAAAAQDUqYQAAAAAAAAAAAACoRiUMAAAAAAAAAAAAANWo haeaaaaaaaaaaacggpuwaaaaaaaaaaaaafsjegyaaaaaaaaaaacaaltcaaaaaaaaaaaaafcnshga AAAAAAAAAAAAqlEJAwAAAAAAAAAAAEA1KmEAAAAAAAAAAAAAqEYlDAAAAAAAAAAAAADVqIQBAAAA AACAiVTCcyrhzLS/EmEAAAAAAACYwtQNAABgrlQLd1uGwvkpgCXCAAAAAAAAMJ7BGwAAwFwWCauE 30iBJcIAAAAAAAAwktkbAADAdCrhcZVwLYJgfTAAAAAAAAAMYwIHAACwwsJVu41D4YqUwRJhAAAA AAAAGMMQDgAAYJG9FwnHV8J16YMlwgAAAAAAADCAORwAAMA6KuGoSrg6iTAAAAAAAAAQzSgOAABg qr9urmevnvtbaaaaaaaaqbwdoed/t2vhnaaaaaia+rf2nycbd+qaaiarx9csboowaaaaaaaambjs jjufyd3svaaaaabjru5erkjggg== </cbrc:requiresresultinbase64binaryformat> <cbrc:requiresresultintextformat>>af / RF00381;Antizyme_FSE; UGAUGCCCCUCACCCAUCAGUGAAGAUCCCGGGUGGGCGAGGGAACGGAAGGGAUC...(.(((((.((((((.(...)...)))))).)))))..)... (g=4,th=0.2,e=-10.39) >AAVX / RF00381;Antizyme_FSE; UGAUGUCCCUCACCCACCCUUGAAGAUCCCAGGUGGGCGAGGGAAUGGUCAAAGGGAUC ((((.((((((.((((((..((...)))))))).))))))...))))... (g=4,th=0.2,e=-13.76) >BAAE / RF00381;Antizyme_FSE; UGAUGCCCCUCACCCACAGCUGAAGAUCCCGGGUGGGCGAGGGACUGUCAGGGAUC (((((.(((((.(((((..(((...)))))))).)))))..)))))... (g=4,th=0.2,e=-12.51) </cbrc:requiresresultintextformat> </cbrc:centroidfoldoutput> </rdf:rdf> 1-6 CentroidFold results 12
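The base64 triple can be turned back into an image file after the output RDF has been parsed. The following is a minimal sketch under two assumptions: the text passed in is only the base64 payload, possibly wrapped over several lines, and the image is a PNG. The sample in 1-6 begins with a short `png:1` label whose meaning this manual does not explain, so the sketch expects any such label to have been removed beforehand.

```python
import base64

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_png_payload(text):
    """Decode a line-wrapped base64 payload and check that it is a PNG."""
    compact = "".join(text.split())  # remove line breaks and spaces
    data = base64.b64decode(compact, validate=True)
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data is not a PNG image")
    return data
```

The returned bytes can be written to a file opened in binary mode to view the secondary structure graphic.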

1.3. ClustalW

Prepare input RDF

Create input RDF for ClustalW as follows:
- Define the RDF and CBRC OWL vocabularies in the RDF header
  xmlns:rdf xmlns:cbrc >
- The subject is the ClustalWInput class; the rdf:about attribute is clustalw.rdf#1
  <cbrc:clustalwinput rdf:about= >
- Define a triple for a sequence.
  Subject: ClustalWInput
  Predicate: requiressequence
  Object: sequence
  <cbrc:requiressequence>more than 3 sequences </cbrc:requiressequence>
- Define a triple for command line options
  Subject: ClustalWInput
  Predicate: hasoptions
  Object: options
  <cbrc:hasOptions>-gapopen=10 -GAPEXT=0.5</cbrc:hasOptions>

18 An example of ClustalW results are shown in 1-7. Letters in red for a sequence, letters in blue for options. <cbrc:clustalwinput rdf:about=" <cbrc:requiressequence>>1lyla FNDELRNRREKLAALRQQGVAFPNDFRRDHTSDQLHEEFDAKDNQELESLNIEVSVAGRMMTRRIMGKASFVTLQDVGGRI QLYVARDSLPEGVYNDQFKKWDLGDIIGARGTLFKTQTGELSIHCTELRLLTKALRPLPDQEVRYRQRYLDLIANDKSRQTFVVRSK ILAAIRQFMVARGFMEVETPMMQVIPGGASARPFITHHNALDLDMYLRIAPELYLKRLVVGGFERVFEINRNFRNEGISVRHNPEFT MMELYMAYADYHDLIELTESLFRTLAQEVLGTTKVTYGEHVFDFGKPFEKLTMREAIKKYRPETDMADLDNFDAAKALAESIGITVE KSWGLGRIVTEIFDEVAEAHLIQPTFITEYPAEVSPLARRNDVNPEITDRFEFFIGGREIGNGFSELNDAEDQAERFQEQVNAKAAG DDEAMFYDEDYVTALEYGLPPTAGLGIGIDRMIMLFTNSHTIRDVILFPAMRP >1B8AA MYRTHYSSEITEELNGQKVKVAGWVWEVKDLGGIKFLWIRDRDGIVQITAPKKKVDPELFKLIPKLRSEDVVAVEGVVNFT PKAKLGFEILPEKIVVLNRAETPLPLDPTGKVKAELDTRLNNRFMDLRRPEVMAIFKIRSSVFKAVRDFFHENGFIEIHTPKIIATA TEGGTELFPMKYFEEDAFLAESPQLYKEIMMASGLDRVYEIAPIFRAEEHNTTRHLNEAWSIDSEMAFIEDEEEVMSFLERLVAHAI NYVREHNAKELDILNFELEEPKLPFPRVSYDKALEILGDLGKEIPWGEDIDTEGERLLGKYMMENENAPLYFLYQYPSEAKPFYIMK YDNKPEICRAFDLEYRGVEISSGGQREHRHDILVEQIKEKGLNPESFEFYLKAFRYGMPPHGGFGLGAERLIKQMLDLPNIREVILF PRDRRRLTP </cbrc:requiressequence> <cbrc:hasoptions>-gapopen=10 GAPEXT=2</cbrc:hasOptions> </cbrc:clustalwinput> </rdf:rdf> 1-7 Input RDF for ClustalW Command for execution Enter the command below. % curl -o outputrdf 14

Results

ClustalW results will be written in RDF format defined by the following:
- Define the RDF and CBRC OWL vocabularies in the RDF header
  xmlns:rdf xmlns:cbrc >
- The subject is the ClustalWOutput class; the rdf:about attribute is clustalw.rdf#1
  <cbrc:clustalwoutput rdf:about= >
- Define a triple for ClustalW results.
  Subject: ClustalWOutput
  Predicate: requiresresultintextformat
  Object: results
  <cbrc:requiresresultintextformat>clustalw results </cbrc:requiresresultintextformat>

An example of ClustalW results is shown in 1-8. Letters in red for ClustalW results.

20 <cbrc:clustalwoutput rdf:about=" <cbrc:requiresresultintextformat>clustal W (1.83) Multiple Sequence Alignments 1ATIA E64328 B64744 E64454 G ADJA G64930 D SESA A PYSA y Pyrococcus JT0942 1B8AA 1ASZB 1LYLA S WTPPRYFNMMFQDLRGPRGGRGLLAYLRPETAQGIFVNFKNVLDATSRKLGFGIAQIGK ELGEVKKFNLMFVTSIGPGGKR--TGYMRPETAQGIFIQFRRLAQFFRNKLPFGVVQIGK KFRDEVRPRFGVMRSREFLMKDAYSFHTSQESLQETYDAMYAAYSKIFSRMGLDFRAVQA HGGKTQLDVKLALRPTSETPIYYMMK-LWVKVHTDLPIKIYQIVN DHGGREMALRPEMTSPVVRFYLNELKNLQKPL--RLYYFAN DRGGRSLTLRPEGTAAMVRAYLEHGMKVWPQP-VRLWMAGP VDMCRGPHVPNMRFCHHFKLMKTAGAYWRGDSNNKMLQRIYGTAWADKKALNAYLQRLEE KGHPLSELSRKIVAKEEKKEEGEESKFYLLNPETEEIIELNENNINIIKDEELLALAKHE KALGEEAKRLEEALREKEARLEALLLQVPLPPWPGAPVGGE KLGEELDAAKAELDALQAEIRDIALTIPNLPADEVPVGKD EGFRLEGPLGEEVEGRLLLRTHTSPMQVRYMVAHTP LKAIVGVLRKEGWAEVSKTKEGLTLKLSEKGKKAEKRAIDIALEVL RPEMAQRLKTRAKITSLVRRFMDDHGFLDIETPMLTKATPE FTPKAKLGFEILPEKIVVLNRAETPLPLDPTGKVKAELD IVKKVDEPIKSATVQNLEIHITKIYTISETPEALPILLEDASRSEAE DSLPEGVYNDQFKKWDLGDIIGARGTLFKTQTGELSIHCTELRL FETRFVGPGHSQGMNLWLMTSPEYHMKRLLVAGCGPVFQ ( D64449: , G64930: ) : ) : );</cbrc:requiresResultInTextFormat> </cbrc:clustalwoutput> </rdf:rdf> 1-8 ClustalW results 16

1.4. IPknot

Prepare input RDF

Create input RDF for IPknot as follows:
- Define the RDF and CBRC OWL vocabularies in the RDF header
  xmlns:rdf xmlns:cbrc >
- The subject is the IPknotInput class; the rdf:about attribute is ipknot.rdf#1
  <cbrc:ipknotinput rdf:about= >
- Define a triple for an RNA sequence.
  Subject: IPknotInput
  Predicate: requiresqueryrnasequence
  Object: RNA sequence
  <cbrc:requiresqueryrnasequence>RNA sequence</cbrc:requiresqueryrnasequence>
- Define a triple for a multiple alignment in ClustalW format
  Subject: IPknotInput
  Predicate: requiresclustalwmultiplealignment
  Object: multiple alignment in ClustalW format
  <cbrc:requiresclustalwmultiplealignment> multiple alignment in ClustalW format </cbrc:requiresclustalwmultiplealignment>
- Define a triple for command line options
  Subject: IPknotInput
  Predicate: hasoptions
  Object: options
  <cbrc:hasoptions>-i</cbrc:hasoptions>

Input RDF for IPknot based on those definitions would look like 1-9. Letters in red for an RNA sequence, letters in blue for command line options.

<cbrc:ipknotinput rdf:about="
<cbrc:requiresqueryrnasequence>>tomato_mosaic_virus.1
GUGUCUUGGAGCGCGCGGAGUAAACAUAUAUGGUUCAUAUAUGUCCGUAGGCACGUAAAAAAAGCGA
>Tobacco_mosaic_virus.1
GUGUCUUGGAUCGCGCGGGUCAAAUGUAUAUGGUUCAUAUACAUCCGCAGGCACGUAAUAAA-GCGA
>Rehmannia_mosaic_vir.1
GUGUCUUGGUUCGCGCGGGUCAAGUGUAUAUGGUGCAUAUACAUCCGUAGGCACGUAAUAAA-GCGA
>B.pepper.1
GUGUCUUGGAACGCGCGGGUCAAAUAUAAGUGGUUCACUUAUAUCCGUAGGCACGAAAAAUU-GCGU</cbrc:requiresqueryrnasequence>
<cbrc:hasoptions>-i</cbrc:hasoptions>
</cbrc:ipknotinput>
</rdf:rdf>

1-9 Input RDF for IPknot

Command for execution

Enter the following command:

% curl -o outputrdf

Results

IPknot results will be written in RDF format defined by the following:
- Define the RDF and CBRC OWL vocabularies in the RDF header
  xmlns:rdf xmlns:cbrc >
- The subject is the IPknotOutput class; the rdf:about attribute is ipknot.rdf#1
  <cbrc:ipknotoutput rdf:about= >
- Define a triple for IPknot results.
  Subject: IPknotOutput
  Predicate: requiresresultintextformat
  Object: results
  <cbrc:requiresresultintextformat>ipknot results </cbrc:requiresresultintextformat>

An example of IPknot results is shown in 1-10. Letters in red for IPknot results.

<cbrc:ipknotoutput rdf:about="
<cbrc:requiresresultintextformat>>tomato_mosaic_virus.1
GUGUCUUGGAGCGCGCGGAGUAAACAUAUAUGGUUCAUAUAUGUCCGUAGGCACGUAAAAAAAGCGA
((((((...((((...(((((((((...)))))))))))))))))))...
>Rehmannia_mosaic_vir.1
GUGUCUUGGUUCGCGCGGGUCAAGUGUAUAUGGUGCAUAUACAUCCGUAGGCACGUAAUAAA-GCGA
(((((..(((...[[[[)))...((((((((...)))))))).]]]].)))))...
</cbrc:requiresresultintextformat>
</cbrc:ipknotoutput>
</rdf:rdf>

1-10 IPknot results

1.5. Mafft

Prepare input RDF

Create input RDF for Mafft as follows:
- Define the RDF and CBRC OWL vocabularies in the RDF header
  xmlns:rdf xmlns:cbrc >
- The subject is the MafftInput class; the rdf:about attribute is mafft.rdf#1
  <cbrc:mafftinput rdf:about= >
- Define a triple for a sequence.
  Subject: MafftInput
  Predicate: requiressequence
  Object: sequence
  <cbrc:requiressequence>more than 3 sequences </cbrc:requiressequence>
- Define a triple for command line options
  Subject: MafftInput
  Predicate: hasoptions
  Object: options
  <cbrc:hasoptions>--retree 2 --maxiterate 0 --bl 62 --op ep clustalout</cbrc:hasoptions>

Input RDF for Mafft based on those definitions would look like 1-11. Letters in red for a sequence, letters in blue for options.

25 <cbrc:mafftinput rdf:about=" <cbrc:requiressequence>>1lyla FNDELRNRREKLAALRQQGVAFPNDFRRDHTSDQLHEEFDAKDNQELESLNIEVSVAGRM MTRRIMGKASFVTLQDVGGRIQLYVARDSLPEGVYNDQFKKWDLGDIIGARGTLFKTQTG ELSIHCTELRLLTKALRPLPDQEVRYRQRYLDLIANDKSRQTFVVRSKILAAIRQFMVAR EQVNAKAAGDDEAMFYDEDYVTALEYGLPPTAGLGIGIDRMIMLFTNSHTIRDVILFPAM RP >1B8AA MYRTHYSSEITEELNGQKVKVAGWVWEVKDLGGIKFLWIRDRDGIVQITAPKKKVDPELF KLIPKLRSEDVVAVEGVVNFTPKAKLGFEILPEKIVVLNRAETPLPLDPTGKVKAELDTR LNNRFMDLRRPEVMAIFKIRSSVFKAVRDFFHENGFIEIHTPKIIATATEGGTELFPMKY PNIREVILFPRDRRRLTP >1ASZB EDTAKDNYGKLPLIQSRDSDRTGQKRVKFVDLDEAKDSDKEVLFRARVHNTRQQGATLAF LTLRQQASLIQGLVKANKEGTISKNMVKWAGSLNLESIVLVRGIVKKVDEPIKSATVQNL EIHITKIYTISETPEALPILLEDASRSEAEAEAAGLPVVNLDTRLDYRVIDLRTVTNQAI FRIQAGVCELFREYLATKKFTEVHTPKLLGAPSEGGSSVFEVTYFKGKAYLAQSPQFNKQ QLIVADFERVYEIGPVFRAENSNTHRHMTEFTGLDMEMAFEEHYHEVLDTLSELFVFIFS KFLGKLVRDKYDTDFYILDKFPLEIRPFYTMPDPANPKYSNSYDFFMRGEEILSGAQRIH EIYEKLKGKFRVHIDDRDIRPGRKFNDWEIKGVPLRIEVGPKDIENKKITLFRRDTMEKF QVDETQLMEVVEKTLNNIMENIKNRAWEKFENFITILEDINPDEIKNILSEKRGVILVPF KEEIYNEELEEKVEATILGETEYKGNKYIAIAKTY </cbrc:requiressequence> <cbrc:hasoptions></cbrc:hasoptions> </cbrc:mafftinput> </rdf:rdf> 1-11 Input RDF for Mafft 21

Command for execution

Enter the command below.

% curl -o outputrdf

Results

Mafft results will be written in RDF format defined by the following:
- Define the RDF and CBRC OWL vocabularies in the RDF header
  xmlns:rdf xmlns:cbrc >
- The subject is the MafftOutput class; the rdf:about attribute is mafft.rdf#1
- Define a triple for Mafft results
  Subject: MafftOutput
  Predicate: requiresresultintextformat
  Object: Mafft results
  <cbrc:requiresresultintextformat>mafft results </cbrc:requiresresultintextformat>

An example of Mafft results is shown in 1-12. Letters in red for Mafft results.

27 <cbrc:mafftoutput rdf:about=" <cbrc:requiresresultintextformat>clustal format alignment by MAFFT FFT-NS-2 (v6.717b) 1LYLA 1B8AA 1ASZB FNDELRNRREKLAALRQQGVAFPNDFRRDHTSDQLHEEFDAKDNQELESLNIEVSVAGRM MYRTHY------SSEITEELNGQKVKVAGWV EDTAKDNYGKLPLIQSRDSDRTGQKRVKFVD-L 1ADJA PYSA LYLA 1B8AA 1ASZB 1ADJA 1PYSA LVVGGFER------VFEINR-NFRNE MMASGLDR------VYEIAP-IFRAE LIVADFER------VYEIGP-VFRAE YLEHGMKVWPQ-----PVRLWMAGP-MFRAE MVAHTP-----PFRIVVPGR-VFRFE----- * 1LYLA 1B8AA 1ASZB 1ADJA 1PYSA -----AMRP RDRRRLTP RDPKRLRP FLGEDELRAGEVTLKRLATGEQVRLSREEVPGYLLQALG ----KFLEQFKGVL </cbrc:requiresresultintextformat> </cbrc:mafftoutput> </rdf:rdf> 1-12 Mafft results 23

1.6. Psipred

Prepare input RDF

Create input RDF for Psipred as follows:
- Define the RDF and CBRC OWL vocabularies in the RDF header
  xmlns:rdf xmlns:cbrc >
- The subject is the PsiPredInput class; the rdf:about attribute is psipred.rdf#1
  <cbrc:psipredinput rdf:about= >
- Define a triple for an amino acid sequence.
  Subject: PsiPredInput
  Predicate: requiresqueryproteinsequence
  Object: amino acid sequence
  <cbrc:requiresqueryproteinsequence> amino acid sequence </cbrc:requiresqueryproteinsequence>

29 Input RDF for Psipred based on those definitions would look like Letters in red for a sequence. <cbrc:psipredinput rdf:about=" <cbrc:requiresqueryproteinsequence>>test MANLGCWMLVLFVATWSDLGLCKKRPKPGGWNTGGSRYPGQGSPGGNRYPPQGGGGWGQPHGGGWGQPHGGGWGQPHGGGW GQPHGGGWGQGGGTHSQWNKPSKPKTNMKHMAGAAAAGAVVGGLGGYMLGSAMSRPIIHFGSDYEDRYYRENMHRYPNQVYYRPMDE YSNQNNFVHDCVNITIKQHTVTTTTKGENFTETDVKMMERVVEQMCITQYERESQAYYQRGSSMVLFSSPPVILLISFLIFLIVGMA NLGCWMLVLFVATWSDLGLCKKRPKPGGWNTGGSRYPGQGSPGGNRYPPQGGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQPHGGGW GQGGGTHSQWNKPSKPKTNMKHMAGAAAAGAVVGGLGGYMLGSAMSRPIIHFGSDYEDRYYRENMHRYPNQVYYRPMDEYSNQNNFV HDCVNIMAGAAAAGAVVGGLGGYMLGSAMSRPIIHFGSDYEDRYYRENMHRYPNQVYYRPMDEYSNQNNFVHDCVNITIKQHTVTTT TKGENFTETDVKMMERVVEQMCITQYERESQAYYQRGSSMVLFSSPPVILLISFLIFLIVG</cbrc:requiresQueryProtei nsequence> </cbrc:psipredinput> </rdf:rdf> 1-13 Input RDF for Psipred Command for execution Enter the command below. % curl -o outputrdf 25

Results

Psipred results will be written in RDF format defined by the following:
- Define the RDF and CBRC OWL vocabularies in the RDF header
  xmlns:rdf xmlns:cbrc >
- The subject is the PsiPredOutput class; the rdf:about attribute is psipred.rdf#1
  <cbrc:psipredoutput rdf:about= >
- Define a triple for Psipred results.
  Subject: PsiPredOutput
  Predicate: requiresresultintextformat
  Object: results
  <cbrc:requiresresultintextformat>psipred results </cbrc:requiresresultintextformat>

An example of Psipred results is shown in 1-14. Letters in red for Psipred results.

31 <cbrc:psipredoutput rdf:about=" <cbrc:requiresresultintextformat># PSIPRED HFORMAT (PSIPRED V2.5 by David Jones) Conf: Pred: CCCCCHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC AA: MANLGCWMLVLFVATWSDLGLCKKRPKPGGWNTGGSRYPGQGSPGGNRYPPQGGGGWGQP Conf: Pred: CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHCCCHHHHH AA: HGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQGGGTHSQWNKPSKPKTNMKHMAGAAAAGA Conf: Pred: HHHCCHHHHHHHHHCCCEECCCCCCCHHHHHHHHHHCCCCCEECCCHHHCCCCCCEEEEE AA: VVGGLGGYMLGSAMSRPIIHFGSDYEDRYYRENMHRYPNQVYYRPMDEYSNQNNFVHDCV Conf: Pred: HHHHHHHHHHHHHHHHHHHHHHHHHHCCCEEEEECCCCHHHHHHHHHHHCCC AA: DVKMMERVVEQMCITQYERESQAYYQRGSSMVLFSSPPVILLISFLIFLIVG </cbrc:requiresresultintextformat> </cbrc:psipredoutput> </rdf:rdf> 1-14 Psipred results 27

1.7. Raccess

Prepare input RDF

Create input RDF for Raccess as follows:
- Define the RDF and CBRC OWL vocabularies in the RDF header
  xmlns:rdf xmlns:cbrc >
- The subject is the RaccessInput class; the rdf:about attribute is raccess.rdf#1
  <cbrc:raccessinput rdf:about= >
- Define a triple for an RNA sequence.
  Subject: RaccessInput
  Predicate: requiresqueryrnasequence
  Object: sequence
  <cbrc:requiresqueryrnasequence>rna sequence </cbrc:requiresqueryrnasequence>
- Define a triple for command line options
  Subject: RaccessInput
  Predicate: hasoptions
  Object: options
  <cbrc:hasoptions>-access_len=50 </cbrc:hasoptions>

Input RDF for Raccess based on those definitions would look like 1-15. Letters in red for an RNA sequence, letters in blue for command line options.

33 <cbrc:raccessinput rdf:about=" <cbrc:requiresqueryrnasequence>>gi ref NM_ Homo sapiens vasohibin 1 (VASH1), mrna GCCCCTGCGCGCCGCCCGAGCCGGTCCCGCTGAGCCGCGGGCCCCGTGCCCTGCGATGGCTCGGCTGGTG CAGCGCGGCGCCAGGTGCCAGCCGTCCTCCCGCTGAGACGCGCCCGAGTGGGGACCCGCTGGGCCTCGGG GCTCGCAGCCTTCGCCTCCCCGCCGCGCCCGCTCCCTTTCTGGGGACTCCGCCGCTGTTTCTGGGGACGA GGGGACAGGGGACCCAGACAAAGCCCACTTTGTGCAGGGAGTTGGCCGCAGGCGGGGAATGTGCGCGTCG GCGCGCGCCCCCTCCCCGCTCCCGGCCAGCTGCGAGTCTTGGCTCCCGGACTTGTCTCGTCGCGTCGGAG AAATCGCCCCCCAGCGCCGCTCTCCCGCCCGGGGGTCTTGGTTCCGAGCTCGCGCGGCCGGGAGTCGCCT CGGTCTTCCTTGGGGCGCGCGCAGATGTGAGCGTGCGAGAGTTGTGTAGGGGATTTTGTTCCCTCCGAAA CTGAGACCCAGGGCGCCCAGTGGGCACCCGTGCCTTGACTCTGTCCTTTCTGCAGCCGCTGGTCCGAGCT GTCTGGCCTCAGTTTCCCTCCGACTTTTCTCCGCTCTGCCAGCCCTCACTGCTGCCCGTCATTGTTCTCG CAGTTAGATGGGGGTGCTTTGTGACGGCTGCCAAGTTGGGGTGTGTTCTCTTTATTCCGTTTTTCAAACA GAACAAGGCCTCCAAGGCTGACCCCAGACAACCCACCCCCTCGGACCCTAATTCACCTTATTGCACTGAT TTTTTTTATCAAGTCGTATTTTATTGTACAGGAGCCACGCCCTGATTTCTTAAAGGCGCCTTGCACTCTG GCCATGTGTTATCTCTGCAGCCGGTGTGTGGGAGGCCTCTTGTGAGCCAGTTGTTTTCCCGCCTCCACCA CCCCCCTCGAAGATTTAGGGATGCCAGGGGGGAAGAAGGTGGCTGGGGGTGGCAGCAGCGGTGCCACTCC AACGTCCGCTGCGGCCACCGCCCCCTCTGGGGTCAGGCGTTTGGAGACCAGCGAAGGAACCTCAGCCCAG CCAAGCTGCTCTCGCTCCCACTGAGCCAAGCCCCCTAACTTTGGGCCTAGAGGCCGTTAGTAT. </cbrc:requiresqueryrnasequence> <cbrc:hasoptions></cbrc:hasoptions> </cbrc:raccessinput> </rdf:rdf> 1-15 Input RDF for Raccess Command for execution Enter the command below. % curl -o outputrdf 29

Results

Raccess results will be written in RDF format defined by the following:
- Define the RDF and CBRC OWL vocabularies in the RDF header
  xmlns:rdf xmlns:cbrc >
- The subject is the RaccessOutput class; the rdf:about attribute is raccess.rdf#1
  <cbrc:raccessoutput rdf:about= >
- Define a triple for Raccess results.
  Subject: RaccessOutput
  Predicate: requiresresultintextformat
  Object: results
  <cbrc:requiresresultintextformat> Raccess results </cbrc:requiresresultintextformat>

An example of Raccess results is shown in 1-16. Letters in red for Raccess results.

<cbrc:raccessoutput rdf:about="
<cbrc:requiresResultInTextFormat>>gi ref NM_ Homo sapiens vasohibin 1 (VASH1), mrna
</cbrc:requiresResultInTextFormat>
</cbrc:raccessoutput>
</rdf:rdf>

1-16 Raccess results

1.8. RactIP

Prepare input RDF

Create input RDF for RactIP as follows:
- Define the RDF and CBRC OWL vocabularies in the RDF header
  xmlns:rdf xmlns:cbrc >
- The subject is the RactIPInput class; the rdf:about attribute is ractip.rdf#1
  <cbrc:ractipinput rdf:about= >
- Define a triple for an RNA sequence.
  Subject: RactIPInput
  Predicate: requiresqueryrnasequence
  Object: RNA sequence
  <cbrc:requiresqueryrnasequence>RNA sequence</cbrc:requiresqueryrnasequence>
- Define a triple for the target RNA sequence
  Subject: RactIPInput
  Predicate: requiressubjectrnasequence
  Object: sequence
  <cbrc:requiressubjectrnasequence>rna sequence </cbrc:requiressubjectrnasequence>
- Define a triple for command line options
  Subject: RactIPInput
  Predicate: hasoptions
  Object: options
  <cbrc:hasoptions>-i</cbrc:hasoptions>

Input RDF for RactIP based on those definitions would look like 1-17. Letters in red for an RNA sequence, letters in blue for command line options.

<cbrc:ractipinput rdf:about="
<cbrc:requiresQueryRNASequence>>R1inv
GGCAACGGAUGGUUCGUUGCC</cbrc:requiresQueryRNASequence>
<cbrc:requiresSubjectRNASequence>>R2inv
GCACCGAACCAUCCGGUGC</cbrc:requiresSubjectRNASequence>
<cbrc:hasoptions>-i</cbrc:hasoptions>
</cbrc:ractipinput>
</rdf:rdf>

1-17 Input RDF for RactIP

Command for execution

Enter the command below.

% curl -o outputrdf

Results

RactIP results will be written in RDF format defined by the following:
- Define the RDF and CBRC OWL vocabularies in the RDF header
  xmlns:rdf xmlns:cbrc >
- The subject is the RactIPOutput class; the rdf:about attribute is ractip.rdf#1
  <cbrc:ractipoutput rdf:about= >
- Define a triple for RactIP results.
  Subject: RactIPOutput
  Predicate: requiresresultintextformat
  Object: results
  <cbrc:requiresresultintextformat> RactIP results </cbrc:requiresresultintextformat>

An example of RactIP results is shown in 1-18. Letters in red for RactIP results.

<cbrc:ractipoutput rdf:about="
<cbrc:requiresresultintextformat>* 0: objval = e+00 infeas = e+00 (0)
* 36: objval = e+01 infeas = e+00 (0)
OPTIMAL SOLUTION FOUND
Integer optimization begins
: mip = not found yet <= +inf (1; 0)
+ 36: >>>>> e+01 <= e % (1; 0)
+ 36: mip = e+01 <= tree is empty 0.0% (0; 1)
INTEGER OPTIMAL SOLUTION FOUND
>R1inv
GGCAACGGAUGGUUCGUUGCC
((((((([[[[[[[)))))))
>R2inv
GCACCGAACCAUCCGGUGC
((((((]]]]]]]))))))</cbrc:requiresresultintextformat>
</cbrc:ractipoutput>
</rdf:rdf>

1-18 RactIP results

1.9. Wolfpsort

Prepare input RDF

Create input RDF for Wolfpsort as follows:
- Define the RDF and CBRC OWL vocabularies in the RDF header
  xmlns:rdf xmlns:cbrc >
- The subject is the WolfPsortInput class; the rdf:about attribute is wolfpsort.rdf#1
  <cbrc:wolfpsortinput rdf:about= >
- Define a triple for the kingdom.
  Subject: WolfPsortInput
  Predicate: requireskingdominformationof
  Object: animal, plant or fungi
  <cbrc:requireskingdominformationof> string </cbrc:requireskingdominformationof>
- Define a triple for an amino acid sequence
  Subject: WolfPsortInput
  Predicate: requiresqueryproteinsequence
  Object: sequence
  <cbrc:requiresqueryproteinsequence> sequence </cbrc:requiresqueryproteinsequence>

Input RDF for Wolfpsort based on those definitions would look like 1-19. Letters in red for a sequence, letters in blue for a kingdom.

<cbrc:wolfpsortinput rdf:about="
<cbrc:requireskingdominformationof>animal</cbrc:requireskingdominformationof>
<cbrc:requiresQueryProteinSequence>MANLGCWMLVLFVATWSDLGLCKKRPKPGGWNTGGSRYPGQGSPGG
NRYPPQGGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQGGGTHSQWNKPSKPKTNMKHMAGAAAAGAVVGGLGGYMLGSA
MSRPIIHFGSDYEDRYYRENMHRYPNQVYYRPMDEYSNQNNFVHDCVNITIKQHTVTTTTKGENFTETDVKMMERVVEQMCITQYER
ESQAYYQRGSSMVLFSSPPVILLISFLIFLIVG</cbrc:requiresQueryProteinSequence>
</cbrc:wolfpsortinput>
</rdf:rdf>

1-19 Input RDF for Wolfpsort

Command for execution

Enter the command below.

% curl -o outputrdf

Results

Wolfpsort results will be written in RDF format defined by the following:
- Define the RDF and CBRC OWL vocabularies in the RDF header
  xmlns:rdf xmlns:cbrc >
- The subject is the WolfPsortOutput class; the rdf:about attribute is wolfpsort.rdf#1
  <cbrc:wolfpsortoutput rdf:about= >
- Define a triple for Wolfpsort results.
  Subject: WolfPsortOutput
  Predicate: requiresresultintextformat
  Object: results
  <cbrc:requiresresultintextformat>wolfpsort results </cbrc:requiresresultintextformat>

An example of Wolfpsort results is shown in 1-20. Letters in red for Wolfpsort results.

<cbrc:wolfpsortoutput rdf:about="
<cbrc:requiresresultintextformat># k used for knn is: 32
queryprotein extr 19, E.R. 4, pero 4, lyso 3, E.R._mito 3, mito_pero 3
</cbrc:requiresresultintextformat>
</cbrc:wolfpsortoutput>
</rdf:rdf>

1-20 Wolfpsort results

2. AsyncType Analysis Services

2.0. How to use Async Type Analysis Services

Async type services (Last, Modelling, PoodleL, PoodleS, Rassie) can be executed with the following curl commands.

1) Obtain polling URI

% curl (refer to Table 1-A)

For example, input.rdf can be used as the input RDF to execute PoodleL as follows:

% curl

RDF storing the polling URI (in red) will be written to standard output.

<cbrc:poodleloutput rdf:about="
<rdfs:isdefinedby rdf:resource= />
</cbrc:poodleloutput>
</rdf:rdf>

2-1 RDF storing polling URL

2) Polling to SADI server

% curl

For example, PoodleL can be executed as follows:

% curl

The results URL will then be written to standard output upon completion of the service.

% curl
COMPLETE: poodlelresult.rdf

** This standard output is shown only once, so the results cannot be obtained if it (poodlelresult.rdf) is lost. Xxx/yyy indicates the location of the data on the CBRC server.

3) Obtain results

% curl poodlelresult.rdf -o outputrdf

Refer to 1-1 and 1-2 for the input/output RDF format.
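The three steps above (obtain the polling URI, poll, fetch the results) lend themselves to a small loop. The sketch below covers step 2 only and is an illustration rather than the service's client. It assumes that a line starting with `COMPLETE:` marks completion, as in the sample output above, and that any other response means the job is still running. The function names, the polling interval, and the attempt limit are hypothetical choices.

```python
import time
import urllib.request


def http_get(url):
    """Fetch a URL and return the response body as text."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def poll_until_complete(fetch, polling_url, interval=5.0, max_attempts=60,
                        sleep=time.sleep):
    """Poll until a 'COMPLETE:' line appears; return the result location."""
    for _ in range(max_attempts):
        body = fetch(polling_url)
        for line in body.splitlines():
            line = line.strip()
            if line.startswith("COMPLETE:"):
                # The location is reported only once, so return it right away
                # and let the caller store it somewhere safe.
                return line[len("COMPLETE:"):].strip()
        sleep(interval)
    raise TimeoutError("service did not complete within the attempt limit")
```

A call such as `poll_until_complete(http_get, polling_uri)` would use the polling URI taken from the RDF returned in step 1. Because the completion message is shown only once, the returned location should be saved before step 3 is attempted.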

2.1. Last

Prepare input RDF

Create input RDF for Last as follows:

- Define vocabulary RDF and CBRC OWL in RDF header
  xmlns:rdf xmlns:cbrc >
- Subject is LastInput class, rdf:about attribute is last.rdf#1
  <cbrc:lastinput rdf:about= >
- Define a triple for a query sequence.
  Subject: LastInput
  Predicate: requiresquerysequence
  Object: sequence
  <cbrc:requiresquerysequence>sequence</cbrc:requiresquerysequence>
- Define a triple for a subject sequence.
  Subject: LastInput
  Predicate: requiresSubjectSequence
  Object: sequence
  <cbrc:requiresSubjectSequence>subject sequence </cbrc:requiresSubjectSequence>
- Define a triple for Lastdb command line options.
  Subject: LastInput
  Predicate: hasoptionsforlastdb
  Object: options
  <cbrc:hasoptionsforlastdb>-m110 -w1</cbrc:hasoptionsforlastdb>
- Define a triple for Lastal command line options.
  Subject: LastInput
  Predicate: hasoptionsforlastal
  Object: options
  <cbrc:hasoptionsforlastal>-j4 -u0 -m10 -l1 -k1 -w0 -g1.0 -s2 -e30 </cbrc:hasoptionsforlastal>

Input RDF for Last based on those definitions would look like 2-2. The sequences are shown in red and the command line options in blue.

<cbrc:lastinput rdf:about="
<cbrc:requiresquerysequence>>gi ref NC_ Pyrococcus abyssi GE5, complete genome
GGGCTTTAGCCTCCTTCACCGCTTCCACGATTTTCTGCCTGTCAAAGGGCATTCTAGACATCCCTCCTTA
GGTTTTTAATTAAAAATTCAAGGTGGAGTAAAAAGGGATGTTTTTAAATTTTTCTCACTCTTTCTCGGCC
TTCTCAAATAGCTCGTCGTAAACCCCTTCATCTATTTCTCTCTGAACTTCCCTTGGATCCTTGCCTTCGA
CGGTAACTCCCATGCTTAAAGCCGTTCCAATGACTTCCTTGGCGGCAGCCTTAAGAGTCAATGCTAGCAT
CTGGTTTCTCTTCATCTTAGCTATCTTGATAACTTGCTCCATCGTTAAGTTCCCAACGATATTGTGCTTC
CTATCTCGAACTGCTTGGTTACTGGATCTACGATGATCTTCACTGGGACCTGCATCCCAGCGAACTCTTT
AACATCAAGAAGCTGACCTACCACGGCCCTGAACTTCCTAGGATCTCCATGTCCCTCATCCTCTTCTTCA</cbrc:requiresquerysequence>
<cbrc:requiresSubjectSequence>
AGATCCTTAGCCTTGTTGTTCCTTTTCTCAAGGAGCTTTACGCTACCGTCTTCACAGATCTCATAGATCG
CGAAAAACTCTGAATCTCCGTAGTGAGCGTCTATGAGATGTTCATCATCCTCCATTCCAAAGGCTACCTT
GTTCCTGGGACCTAGGTATCTACCGAGGTACCTACCTATCTTGGGCATTAATGGGGCCTCAGCTATGAAG
TAATAACGTCAAGCCCGAGCCTCCTCGCCGCTTCGGCAACTGCACCATCAGCGATGACCGCGATCTTTAC
TCTTTAAGGTTCACTGCCACCTCGACACTCTGTGTGAAGTTACGCGGCTTGGCCC</cbrc:requiresSubjectSequence>
<cbrc:hasoptionsforlastdb>-m110 -w1</cbrc:hasoptionsforlastdb>
<cbrc:hasoptionsforlastal>-j4 -u0 -m10 -l1 -k1 -w0 -g1.0 -s2 -e30</cbrc:hasoptionsforlastal>
</cbrc:lastinput>
</rdf:rdf>

2-2 Input RDF for Last
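The triples listed above can also be assembled programmatically instead of by hand. The following is a minimal sketch using the Python standard library. The CBRC namespace URI is not preserved in this text, so CBRC_NS below is a hypothetical placeholder that must be replaced with the real one. The predicate names are spelled as they appear in this section, and their capitalization in the actual OWL file may differ. The function name build_last_input is an illustrative assumption.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Hypothetical placeholder: the real CBRC OWL namespace is not shown here.
CBRC_NS = "http://example.org/cbrc#"


def build_last_input(about, query, subject, lastdb_opts, lastal_opts,
                     cbrc_ns=CBRC_NS):
    """Return an RDF/XML string with one LastInput subject and four triples."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("cbrc", cbrc_ns)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    node = ET.SubElement(
        root, f"{{{cbrc_ns}}}LastInput", {f"{{{RDF_NS}}}about": about}
    )
    # Predicate spellings follow this section and are assumptions about case.
    triples = (
        ("requiresquerysequence", query),
        ("requiresSubjectSequence", subject),
        ("hasoptionsforlastdb", lastdb_opts),
        ("hasoptionsforlastal", lastal_opts),
    )
    for predicate, value in triples:
        ET.SubElement(node, f"{{{cbrc_ns}}}{predicate}").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_last_input("last.rdf#1", ">query\nACGT", "TTGA",
                           "-m110 -w1", "-j4 -u0 -m10"))
```

Building the document through an XML library has the side benefit that characters such as ">" in a FASTA header are escaped automatically.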

Command for execution

Enter the command below.

% curl

In the case of Last, because communication is asynchronous, RDF corresponding to HTTP response code 202 is returned to standard output (2-3).

<cbrc:lastoutput rdf:about="
<rdfs:isdefinedby rdf:resource= />
</cbrc:lastoutput>
</rdf:rdf>

2-3 RDF corresponding to HTTP response code 202

Polling to the SADI server is performed using the polling URI defined by rdfs:isDefinedBy.

% curl
COMPLETE:

% curl seq1_seq2result.rdf -o outputrdf

Results

Last results will be written in RDF format defined by the following:

- Define vocabulary RDF and CBRC OWL in RDF header
  xmlns:rdf xmlns:cbrc >
- Subject is LastOutput class, rdf:about attribute is last.rdf#1
  <cbrc:lastoutput rdf:about= >
- Define a triple for Last results.
  Subject: LastOutput
  Predicate: requiresresultintextformat
  Object: results
  <cbrc:requiresresultintextformat>last results </cbrc:requiresresultintextformat>
- Define a triple for the Last graphic in base64 encoding.
  Subject: LastOutput
  Predicate: requiresResultInBase64BinaryFormat
  Object: results
  <cbrc:requiresResultInBase64BinaryFormat>last results </cbrc:requiresResultInBase64BinaryFormat>
  * This triple will not be written if noimage is defined for lastaloptions in the input RDF.

An example of Last results is shown in 2-4. The base64-encoded Last graphic is shown in red and the Last text results in blue.

<cbrc:lastoutput rdf:about="
<cbrc:requiresResultInBase64BinaryFormat>ivborw0kggoaaaansuheugaaa8caaapocaaaaadw
ntnlaaanoeleqvr42u3daxijrrzf0dz/zmsj
ohblqllzburnnc73occmgmiw0gp7kvm/rhlsqlrhuwa65k9+dcah37/91/jh/spjw8fhgl/6q42v
/lbjh3/7ya8khfp61+ie7ltqj98b93xph79/5bf/ubjvf/j4/euf/6uv//mifvjo0tgvhcrjpbdz
AAAASUVORK5CYII=</cbrc:requiresResultInBase64BinaryFormat>
<cbrc:requiresresultintextformat># LAST version 58
#
# a=7 b=1 c= e=30 d=18 x=27 y=10
# u=0 s=2 m=10 l=1 k=1 i= w=0 t= g=1 j=4
# seq1
a score=30
s gi ref NC_ GTTGAGAGTTCCAATAAGACTAAAATAGGATTGAAA
s GATCCTTAGCCTTGTTGTTCCTTTTCTCAAGGAGCTTTACGCTACCGTCTTCACAGATCTCATAGATCG GTGGCGAGTTCCAATAAGACTAAAATAGAATTGAAA
p
# CPU time: 0.36 seconds
</cbrc:requiresresultintextformat>
</cbrc:lastoutput>
</rdf:rdf>

2-4 Last results
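The graphic in the output RDF is base64 text and has to be decoded before it can be viewed. The sketch below extracts and decodes it with the Python standard library. It assumes the output is well-formed RDF/XML, that the predicate is spelled requiresResultInBase64BinaryFormat as in the closing tag of example 2-4, and that the caller supplies the real CBRC namespace URI, which is not preserved in this text. The function name extract_last_image is an illustrative assumption.

```python
import base64
import xml.etree.ElementTree as ET


def extract_last_image(rdf_xml, cbrc_ns,
                       predicate="requiresResultInBase64BinaryFormat"):
    """Return the decoded image bytes, or None if the triple is absent.

    The triple is absent when noimage was requested in the input RDF.
    """
    root = ET.fromstring(rdf_xml)
    elem = root.find(f".//{{{cbrc_ns}}}{predicate}")
    if elem is None or not elem.text or not elem.text.strip():
        return None
    # Remove line breaks and spaces that wrap the base64 payload.
    payload = "".join(elem.text.split())
    return base64.b64decode(payload)
```

The returned bytes can be written to a file opened in binary mode. The "ivborw0kggo" prefix in example 2-4 suggests a PNG image whose letter case was lost in this transcription, so the payload printed there will not decode to a valid image.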

2.2. Modelling

Prepare input RDF

Create input RDF for Modelling as follows:

- Define vocabulary RDF and CBRC OWL in RDF header
  xmlns:rdf xmlns:cbrc >
- Subject is ModellingInput class, rdf:about attribute is modelling.rdf#1
  <cbrc:modellinginput rdf:about= >
- Define a triple for an amino acid sequence.
  Subject: ModellingInput
  Predicate: requiresQueryProteinSequence
  Object: sequence
  <cbrc:requiresQueryProteinSequence>sequence </cbrc:requiresQueryProteinSequence>
- Define a triple for BLAST.
  Subject: ModellingInput
  Predicate: requiresBlastProgramName
  Object: BLAST or PSI-BLAST
  <cbrc:requiresBlastProgramName>BLAST or PSI-BLAST</cbrc:requiresBlastProgramName>
- Define a triple for Iteration (PSI-BLAST).
  Subject: ModellingInput
  Predicate: setupiterationnumber
  Object: Iteration
  <cbrc:setupiterationnumber>3</cbrc:setupiterationnumber>
- Define a triple for E-value.
  Subject: ModellingInput
  Predicate: setupevaluethreshold
  Object: E-Value
  <cbrc:setupevaluethreshold> </cbrc:setupevaluethreshold>
- Define a triple for hit region coverage threshold.
  Subject: ModellingInput
  Predicate: setuphitregioncoveragethreshold
  Object: threshold
  <cbrc:setuphitregioncoveragethreshold>60.0 </cbrc:setuphitregioncoveragethreshold>
- Define a triple for hit region identity threshold.
  Subject: ModellingInput
  Predicate: setuphitregionidentitythreshold
  Object: identity
  <cbrc:setuphitregionidentitythreshold>30.0 </cbrc:setuphitregionidentitythreshold>
- Define a triple for minimum sequence length.
  Subject: ModellingInput
  Predicate: setupminimumsequencelength
  Object: length
  <cbrc:setupminimumsequencelength>30 </cbrc:setupminimumsequencelength>
- Define a triple for template coverage threshold.
  Subject: ModellingInput
  Predicate: setuptemplatecoveragethreshold
  Object: threshold
  <cbrc:setuptemplatecoveragethreshold>90.0 </cbrc:setuptemplatecoveragethreshold>
- Define a triple for template identity threshold.
  Subject: ModellingInput
  Predicate: setuptemplateidentitythreshold
  Object: identity
  <cbrc:setuptemplateidentitythreshold>90.0 </cbrc:setuptemplateidentitythreshold>
- Define a triple for MODELLER license key.
  Subject: ModellingInput
  Predicate: requireslicensekey
  Object: key
  <cbrc:requireslicensekey>***</cbrc:requireslicensekey>
  ** Modelling cannot be used without a MODELLER license key. A key must be obtained through
- Define a triple for number of models.
  Subject: ModellingInput
  Predicate: setupmodelnumber
  Object: number
  <cbrc:setupmodelnumber>10</cbrc:setupmodelnumber>

Input RDF for Modelling based on those definitions would look like 2-5. The sequence is shown in red and the command line options in blue.

<cbrc:modellinginput rdf:about="
<cbrc:requiresQueryProteinSequence>>sample
MNGTEGPNFYVPFSNKTGVVRSPFEAPQYYLAEPWQFSMLAAYMFLLIMLGFPINFLTLYVTVQHKKLRTPLNYILLNLAV
ADLFMVFGGFTTTLYTSLHGYFVFGPTGCNLEGFFATLGGEIALWSLVVLAIERYVVVCKPMSNFRFGENHAIMGVAFTWVMALACA
APPLVGWSRYIPEGMQCSCGIDYYTPHEETNNESFVIYMFVVHFIIPLIVIFFCYGQLVFTVKEAAAQQQESATTQKAEKEVTRMVI
IMVIAFLICWLPYAGVAFYIFTHQGSDFGPIFMTIPAFFAKTSAVYNPVIYIMMNKQFRNCMVTTLCCGKNKREIRLMKNREAAREC
RRKKKEYVKCLENRVAVLENQNKTLIEELKTLKDLYSNKMSEEGPQVKIREASKDNVDFILSNVDLAMANSLRRVMIAEIPTLAIDS
VEVETNTTVLADEFIAHRLGLIPLQSMDIEQLEYSRDCFCEDHCDKCSVVLTLQAFGESESTTNVYSKDLVIVSNLMGRNIGHPIIQ
DKEGNGVLICKLRKGQELKLTCVAKKGIAKEHAKWGPAAAIEFEYDPWNKLKHTDYWYEQDSAKEWPQSKNCEYEDPPNEGDPFDYK
AQADTFYMNVESVGSIPVDQVVVRGIDTLQKKVASILLALTQMDQDKVNFASGDNNTASNMLGSNEDVMMTGAEQDPYSNASQMGNT
GSGGYDNAW</cbrc:requiresQueryProteinSequence>
<cbrc:requiresBlastProgramName>blast</cbrc:requiresBlastProgramName>
<cbrc:setupevaluethreshold> </cbrc:setupevaluethreshold>
<cbrc:setuphitregioncoveragethreshold>60.0</cbrc:setuphitregioncoveragethreshold>
<cbrc:setuphitregionidentitythreshold>30.0</cbrc:setuphitregionidentitythreshold>
<cbrc:setupminimumsequencelength>30</cbrc:setupminimumsequencelength>
<cbrc:setuptemplatecoveragethreshold>95.0</cbrc:setuptemplatecoveragethreshold>
<cbrc:setuptemplateidentitythreshold>95.0</cbrc:setuptemplateidentitythreshold>
<cbrc:requireslicensekey>***</cbrc:requireslicensekey>
<cbrc:setupmodelnumber>10</cbrc:setupmodelnumber>
</cbrc:modellinginput>
</rdf:rdf>

2-5 Input RDF for Modelling
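Because Modelling takes many parameters and cannot run without a license key, a quick local check before submitting the input RDF can save a round trip to the server. The following is a minimal sketch, not a rule set defined by the service. It assumes that the three "requires..." predicates must be non-empty and that the coverage and identity thresholds are percentages between 0 and 100. The function name, the dictionary layout, and the spelling of the keys (taken from this section, where letter case may be partly lost) are illustrative assumptions.

```python
# Keys assumed mandatory because their predicate names start with "requires".
REQUIRED_KEYS = (
    "requiresQueryProteinSequence",
    "requiresBlastProgramName",
    "requireslicensekey",
)

# Keys assumed to hold percentages in the range 0 to 100.
PERCENT_KEYS = (
    "setuphitregioncoveragethreshold",
    "setuphitregionidentitythreshold",
    "setuptemplatecoveragethreshold",
    "setuptemplateidentitythreshold",
)


def check_modelling_params(params):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    for key in REQUIRED_KEYS:
        if not str(params.get(key, "")).strip():
            problems.append(f"missing {key}")
    for key in PERCENT_KEYS:
        if key not in params:
            continue  # optional in this sketch
        try:
            value = float(params[key])
        except (TypeError, ValueError):
            problems.append(f"{key} is not a number")
            continue
        if not 0.0 <= value <= 100.0:
            problems.append(f"{key} is outside 0-100")
    return problems
```

An empty return value only means that the sketch found nothing wrong locally; the service may apply further checks of its own.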

Database Searching Using BLAST

Database Searching Using BLAST Mahidol University Objectives SCMI512 Molecular Sequence Analysis Database Searching Using BLAST Lecture 2B After class, students should be able to: explain the FASTA algorithm for database searching explain

More information

Sequence Alignment. GBIO0002 Archana Bhardwaj University of Liege

Sequence Alignment. GBIO0002 Archana Bhardwaj University of Liege Sequence Alignment GBIO0002 Archana Bhardwaj University of Liege 1 What is Sequence Alignment? A sequence alignment is a way of arranging the sequences of DNA, RNA, or protein to identify regions of similarity.

More information

Lecture 5 Advanced BLAST

Lecture 5 Advanced BLAST Introduction to Bioinformatics for Medical Research Gideon Greenspan gdg@cs.technion.ac.il Lecture 5 Advanced BLAST BLAST Recap Sequence Alignment Complexity and indexing BLASTN and BLASTP Basic parameters

More information

Tutorial 4 BLAST Searching the CHO Genome

Tutorial 4 BLAST Searching the CHO Genome Tutorial 4 BLAST Searching the CHO Genome Accessing the CHO Genome BLAST Tool The CHO BLAST server can be accessed by clicking on the BLAST button on the home page or by selecting BLAST from the menu bar

More information

24 Grundlagen der Bioinformatik, SS 10, D. Huson, April 26, This lecture is based on the following papers, which are all recommended reading:

24 Grundlagen der Bioinformatik, SS 10, D. Huson, April 26, This lecture is based on the following papers, which are all recommended reading: 24 Grundlagen der Bioinformatik, SS 10, D. Huson, April 26, 2010 3 BLAST and FASTA This lecture is based on the following papers, which are all recommended reading: D.J. Lipman and W.R. Pearson, Rapid

More information

2) NCBI BLAST tutorial This is a users guide written by the education department at NCBI.

2) NCBI BLAST tutorial   This is a users guide written by the education department at NCBI. Web resources -- Tour. page 1 of 8 This is a guided tour. Any homework is separate. In fact, this exercise is used for multiple classes and is publicly available to everyone. The entire tour will take

More information

Sequence alignment theory and applications Session 3: BLAST algorithm

Sequence alignment theory and applications Session 3: BLAST algorithm Sequence alignment theory and applications Session 3: BLAST algorithm Introduction to Bioinformatics online course : IBT Sonal Henson Learning Objectives Understand the principles of the BLAST algorithm

More information

Wilson Leung 05/27/2008 A Simple Introduction to NCBI BLAST

Wilson Leung 05/27/2008 A Simple Introduction to NCBI BLAST A Simple Introduction to NCBI BLAST Prerequisites: Detecting and Interpreting Genetic Homology: Lecture Notes on Alignment Resources: The BLAST web server is available at http://www.ncbi.nih.gov/blast/

More information

Basic Local Alignment Search Tool (BLAST)

Basic Local Alignment Search Tool (BLAST) BLAST 26.04.2018 Basic Local Alignment Search Tool (BLAST) BLAST (Altshul-1990) is an heuristic Pairwise Alignment composed by six-steps that search for local similarities. The most used access point to

More information

Wilson Leung 01/03/2018 An Introduction to NCBI BLAST. Prerequisites: Detecting and Interpreting Genetic Homology: Lecture Notes on Alignment

Wilson Leung 01/03/2018 An Introduction to NCBI BLAST. Prerequisites: Detecting and Interpreting Genetic Homology: Lecture Notes on Alignment An Introduction to NCBI BLAST Prerequisites: Detecting and Interpreting Genetic Homology: Lecture Notes on Alignment Resources: The BLAST web server is available at https://blast.ncbi.nlm.nih.gov/blast.cgi

More information

Lab 4: Multiple Sequence Alignment (MSA)

Lab 4: Multiple Sequence Alignment (MSA) Lab 4: Multiple Sequence Alignment (MSA) The objective of this lab is to become familiar with the features of several multiple alignment and visualization tools, including the data input and output, basic

More information

INTRODUCTION TO BIOINFORMATICS

INTRODUCTION TO BIOINFORMATICS Molecular Biology-2017 1 INTRODUCTION TO BIOINFORMATICS In this section, we want to provide a simple introduction to using the web site of the National Center for Biotechnology Information NCBI) to obtain

More information

BLAST, Profile, and PSI-BLAST

BLAST, Profile, and PSI-BLAST BLAST, Profile, and PSI-BLAST Jianlin Cheng, PhD School of Electrical Engineering and Computer Science University of Central Florida 26 Free for academic use Copyright @ Jianlin Cheng & original sources

More information

INTRODUCTION TO BIOINFORMATICS

INTRODUCTION TO BIOINFORMATICS Molecular Biology-2019 1 INTRODUCTION TO BIOINFORMATICS In this section, we want to provide a simple introduction to using the web site of the National Center for Biotechnology Information NCBI) to obtain

More information

Bioinformatics explained: BLAST. March 8, 2007

Bioinformatics explained: BLAST. March 8, 2007 Bioinformatics Explained Bioinformatics explained: BLAST March 8, 2007 CLC bio Gustav Wieds Vej 10 8000 Aarhus C Denmark Telephone: +45 70 22 55 09 Fax: +45 70 22 55 19 www.clcbio.com info@clcbio.com Bioinformatics

More information

MetaPhyler Usage Manual

MetaPhyler Usage Manual MetaPhyler Usage Manual Bo Liu boliu@umiacs.umd.edu March 13, 2012 Contents 1 What is MetaPhyler 1 2 Installation 1 3 Quick Start 2 3.1 Taxonomic profiling for metagenomic sequences.............. 2 3.2

More information

SPARQL Query Examples

SPARQL Query Examples SPARQL Query Examples 2015/07/24 Molecular Profiling Research Center for Drug Discovery (MolProf), AIST 1. frnadb As an example, a SPARQL search for frnadb shall be executed by following search frnadb

More information

SADI Semantic Web Services

SADI Semantic Web Services SADI Semantic Web Services London, UK 8 December 8 2011 SADI Semantic Web Services Instructor: Luke McCarthy http:// sadiframework.org/training/ 2 Contents 2.1 Introduction to Semantic Web Services 2.1

More information

As of August 15, 2008, GenBank contained bases from reported sequences. The search procedure should be

As of August 15, 2008, GenBank contained bases from reported sequences. The search procedure should be 48 Bioinformatics I, WS 09-10, S. Henz (script by D. Huson) November 26, 2009 4 BLAST and BLAT Outline of the chapter: 1. Heuristics for the pairwise local alignment of two sequences 2. BLAST: search and

More information

Introduction to Computational Molecular Biology

Introduction to Computational Molecular Biology 18.417 Introduction to Computational Molecular Biology Lecture 13: October 21, 2004 Scribe: Eitan Reich Lecturer: Ross Lippert Editor: Peter Lee 13.1 Introduction We have been looking at algorithms to

More information

Bioinformatics. Sequence alignment BLAST Significance. Next time Protein Structure

Bioinformatics. Sequence alignment BLAST Significance. Next time Protein Structure Bioinformatics Sequence alignment BLAST Significance Next time Protein Structure 1 Experimental origins of sequence data The Sanger dideoxynucleotide method F Each color is one lane of an electrophoresis

More information

Biochemistry 324 Bioinformatics. Multiple Sequence Alignment (MSA)

Biochemistry 324 Bioinformatics. Multiple Sequence Alignment (MSA) Biochemistry 324 Bioinformatics Multiple Sequence Alignment (MSA) Big- Οh notation Greek omicron symbol Ο The Big-Oh notation indicates the complexity of an algorithm in terms of execution speed and storage

More information

BLAST MCDB 187. Friday, February 8, 13

BLAST MCDB 187. Friday, February 8, 13 BLAST MCDB 187 BLAST Basic Local Alignment Sequence Tool Uses shortcut to compute alignments of a sequence against a database very quickly Typically takes about a minute to align a sequence against a database

More information

Alignments BLAST, BLAT

Alignments BLAST, BLAT Alignments BLAST, BLAT Genome Genome Gene vs Built of DNA DNA Describes Organism Protein gene Stored as Circular/ linear Single molecule, or a few of them Both (depending on the species) Part of genome

More information

Pairwise Sequence Alignment. Zhongming Zhao, PhD

Pairwise Sequence Alignment. Zhongming Zhao, PhD Pairwise Sequence Alignment Zhongming Zhao, PhD Email: zhongming.zhao@vanderbilt.edu http://bioinfo.mc.vanderbilt.edu/ Sequence Similarity match mismatch A T T A C G C G T A C C A T A T T A T G C G A T

More information

Similarity Searches on Sequence Databases

Similarity Searches on Sequence Databases Similarity Searches on Sequence Databases Lorenza Bordoli Swiss Institute of Bioinformatics EMBnet Course, Zürich, October 2004 Swiss Institute of Bioinformatics Swiss EMBnet node Outline Importance of

More information

HORIZONTAL GENE TRANSFER DETECTION

HORIZONTAL GENE TRANSFER DETECTION HORIZONTAL GENE TRANSFER DETECTION Sequenzanalyse und Genomik (Modul 10-202-2207) Alejandro Nabor Lozada-Chávez Before start, the user must create a new folder or directory (WORKING DIRECTORY) for all

More information

How to Run NCBI BLAST on zcluster at GACRC

How to Run NCBI BLAST on zcluster at GACRC How to Run NCBI BLAST on zcluster at GACRC BLAST: Basic Local Alignment Search Tool Georgia Advanced Computing Resource Center University of Georgia Suchitra Pakala pakala@uga.edu 1 OVERVIEW What is BLAST?

More information

BIR pipeline steps and subsequent output files description STEP 1: BLAST search

BIR pipeline steps and subsequent output files description STEP 1: BLAST search Lifeportal (Brief description) The Lifeportal at University of Oslo (https://lifeportal.uio.no) is a Galaxy based life sciences portal lifeportal.uio.no under the UiO tools section for phylogenomic analysis,

More information

NCBI BLAST: a better web interface

NCBI BLAST: a better web interface Published online 24 April 2008 Nucleic Acids Research, 2008, Vol. 36, Web Server issue W5 W9 doi:10.1093/nar/gkn201 NCBI BLAST: a better web interface Mark Johnson, Irena Zaretskaya, Yan Raytselis, Yuri

More information

RDF. Mario Arrigoni Neri

RDF. Mario Arrigoni Neri RDF Mario Arrigoni Neri WEB Generations Internet phase 1: static contents HTML pages FTP resources User knows what he needs and where to retrieve it Internet phase 2: web applications Custom presentation

More information

JET 2 User Manual 1 INSTALLATION 2 EXECUTION AND FUNCTIONALITIES. 1.1 Download. 1.2 System requirements. 1.3 How to install JET 2

JET 2 User Manual 1 INSTALLATION 2 EXECUTION AND FUNCTIONALITIES. 1.1 Download. 1.2 System requirements. 1.3 How to install JET 2 JET 2 User Manual 1 INSTALLATION 1.1 Download The JET 2 package is available at www.lcqb.upmc.fr/jet2. 1.2 System requirements JET 2 runs on Linux or Mac OS X. The program requires some external tools

More information

BLAST: Basic Local Alignment Search Tool Altschul et al. J. Mol Bio CS 466 Saurabh Sinha

BLAST: Basic Local Alignment Search Tool Altschul et al. J. Mol Bio CS 466 Saurabh Sinha BLAST: Basic Local Alignment Search Tool Altschul et al. J. Mol Bio. 1990. CS 466 Saurabh Sinha Motivation Sequence homology to a known protein suggest function of newly sequenced protein Bioinformatics

More information

WSSP-10 Chapter 7 BLASTN: DNA vs DNA searches

WSSP-10 Chapter 7 BLASTN: DNA vs DNA searches WSSP-10 Chapter 7 BLASTN: DNA vs DNA searches 4-3 DSAP: BLASTn Page p. 7-1 NCBI BLAST Home Page p. 7-1 NCBI BLASTN search page p. 7-2 Copy sequence from DSAP or wave form program p. 7-2 Choose a database

More information

A Coprocessor Architecture for Fast Protein Structure Prediction

A Coprocessor Architecture for Fast Protein Structure Prediction A Coprocessor Architecture for Fast Protein Structure Prediction M. Marolia, R. Khoja, T. Acharya, C. Chakrabarti Department of Electrical Engineering Arizona State University, Tempe, USA. Abstract Predicting

More information

BLAST - Basic Local Alignment Search Tool

BLAST - Basic Local Alignment Search Tool Lecture for ic Bioinformatics (DD2450) April 11, 2013 Searching 1. Input: Query Sequence 2. Database of sequences 3. Subject Sequence(s) 4. Output: High Segment Pairs (HSPs) Sequence Similarity Measures:

More information

BLAST Exercise 2: Using mrna and EST Evidence in Annotation Adapted by W. Leung and SCR Elgin from Annotation Using mrna and ESTs by Dr. J.

BLAST Exercise 2: Using mrna and EST Evidence in Annotation Adapted by W. Leung and SCR Elgin from Annotation Using mrna and ESTs by Dr. J. BLAST Exercise 2: Using mrna and EST Evidence in Annotation Adapted by W. Leung and SCR Elgin from Annotation Using mrna and ESTs by Dr. J. Buhler Prerequisites: BLAST Exercise: Detecting and Interpreting

More information

Semantics. Matthew J. Graham CACR. Methods of Computational Science Caltech, 2011 May 10. matthew graham

Semantics. Matthew J. Graham CACR. Methods of Computational Science Caltech, 2011 May 10. matthew graham Semantics Matthew J. Graham CACR Methods of Computational Science Caltech, 2011 May 10 semantic web The future of the Internet (Web 3.0) Decentralized platform for distributed knowledge A web of databases

More information

Automatic Hidden-Web Table Interpretation, Conceptualization, and Semantic Annotation

Automatic Hidden-Web Table Interpretation, Conceptualization, and Semantic Annotation Automatic Hidden-Web Table Interpretation, Conceptualization, and Semantic Annotation Cui Tao and David W. Embley Department of Computer Science Brigham Young University, Provo, Utah 84602, U.S.A. Abstract

More information

Chapter 4: Blast. Chaochun Wei Fall 2014

Chapter 4: Blast. Chaochun Wei Fall 2014 Course organization Introduction ( Week 1-2) Course introduction A brief introduction to molecular biology A brief introduction to sequence comparison Part I: Algorithms for Sequence Analysis (Week 3-11)

More information

Similarity searches in biological sequence databases

Similarity searches in biological sequence databases Similarity searches in biological sequence databases Volker Flegel september 2004 Page 1 Outline Keyword search in databases General concept Examples SRS Entrez Expasy Similarity searches in databases

More information

PyMod 2. User s Guide. PyMod 2 Documention (Last updated: 7/11/2016)

PyMod 2. User s Guide. PyMod 2 Documention (Last updated: 7/11/2016) PyMod 2 User s Guide PyMod 2 Documention (Last updated: 7/11/2016) http://schubert.bio.uniroma1.it/pymod/index.html Department of Biochemical Sciences A. Rossi Fanelli, Sapienza University of Rome, Italy

More information

Bioinformatics for Biologists

Bioinformatics for Biologists Bioinformatics for Biologists Sequence Analysis: Part I. Pairwise alignment and database searching Fran Lewitter, Ph.D. Director Bioinformatics & Research Computing Whitehead Institute Topics to Cover

More information

BGGN-213: FOUNDATIONS OF BIOINFORMATICS. The find-a-gene project assignment Dr. Barry Grant Nov 2017

BGGN-213: FOUNDATIONS OF BIOINFORMATICS. The find-a-gene project assignment   Dr. Barry Grant Nov 2017 BGGN-213: FOUNDATIONS OF BIOINFORMATICS The find-a-gene project assignment https://bioboot.github.io/bggn213_f17/ Dr. Barry Grant Nov 2017 Overview: The find-a-gene project is a required assignment for

More information

BLOSUM Trie for Faster Hit Detection in FSA Protein BLAST

BLOSUM Trie for Faster Hit Detection in FSA Protein BLAST BLOSUM Trie for Faster Hit Detection in FSA Protein BLAST M Anuradha Research scholar Department of Computer Science & Systems Engineering, Andhra University Visakhapatnam - 53 3 K Suman Nelson Software

More information

Automatic Hidden-Web Table Interpretation, Conceptualization, and Semantic Annotation

Automatic Hidden-Web Table Interpretation, Conceptualization, and Semantic Annotation Automatic Hidden-Web Table Interpretation, Conceptualization, and Semantic Annotation Cui Tao and David W. Embley Department of Computer Science Brigham Young University, Provo, Utah 84602, U.S.A. Abstract

More information

Chapter 13: Advanced topic 3 Web 3.0

Chapter 13: Advanced topic 3 Web 3.0 Chapter 13: Advanced topic 3 Web 3.0 Contents Web 3.0 Metadata RDF SPARQL OWL Web 3.0 Web 1.0 Website publish information, user read it Ex: Web 2.0 User create content: post information, modify, delete

More information

BGGN 213 Foundations of Bioinformatics Barry Grant

BGGN 213 Foundations of Bioinformatics Barry Grant BGGN 213 Foundations of Bioinformatics Barry Grant http://thegrantlab.org/bggn213 Recap From Last Time: 25 Responses: https://tinyurl.com/bggn213-02-f17 Why ALIGNMENT FOUNDATIONS Why compare biological

More information

Assessing Transcriptome Assembly

Assessing Transcriptome Assembly Assessing Transcriptome Assembly Matt Johnson July 9, 2015 1 Introduction Now that you have assembled a transcriptome, you are probably wondering about the sequence content. Are the sequences from the

More information

Heuristic methods for pairwise alignment:

Heuristic methods for pairwise alignment: Bi03c_1 Unit 03c: Heuristic methods for pairwise alignment: k-tuple-methods k-tuple-methods for alignment of pairs of sequences Bi03c_2 dynamic programming is too slow for large databases Use heuristic

More information

AlignMe Manual. Version 1.1. Rene Staritzbichler, Marcus Stamm, Kamil Khafizov and Lucy R. Forrest

AlignMe Manual. Version 1.1. Rene Staritzbichler, Marcus Stamm, Kamil Khafizov and Lucy R. Forrest AlignMe Manual Version 1.1 Rene Staritzbichler, Marcus Stamm, Kamil Khafizov and Lucy R. Forrest Max Planck Institute of Biophysics Frankfurt am Main 60438 Germany 1) Introduction...3 2) Using AlignMe

More information

CAP BLAST. BIOINFORMATICS Su-Shing Chen CISE. 8/20/2005 Su-Shing Chen, CISE 1

CAP BLAST. BIOINFORMATICS Su-Shing Chen CISE. 8/20/2005 Su-Shing Chen, CISE 1 CAP 5510-6 BLAST BIOINFORMATICS Su-Shing Chen CISE 8/20/2005 Su-Shing Chen, CISE 1 BLAST Basic Local Alignment Prof Search Su-Shing Chen Tool A Fast Pair-wise Alignment and Database Searching Tool 8/20/2005

More information

Linking Data with RDF

Linking Data with RDF Linking Data with RDF Wiltrud Kessler Institut für Maschinelle Sprachverarbeitung Universität Stuttgart Semantic Web Winter 2014/15 This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike

More information

Enabling Semantic Web Programming by Integrating RDF and Common Lisp

Enabling Semantic Web Programming by Integrating RDF and Common Lisp Enabling Semantic Web Programming by Integrating RDF and Common Lisp An Introduction to NRC s Wilbur RDF & DAML Toolkit Ora Lassila Nokia Research Center July 2001 Common Lisp & Frame Systems Traditional

More information

How to use KAIKObase Version 3.1.0

How to use KAIKObase Version 3.1.0 How to use KAIKObase Version 3.1.0 Version3.1.0 29/Nov/2010 http://sgp2010.dna.affrc.go.jp/kaikobase/ Copyright National Institute of Agrobiological Sciences. All rights reserved. Outline 1. System overview

More information

Bioinformatics Ontology: Towards the Automatics Generation of Bioinformatics Workflow for Web Services

Bioinformatics Ontology: Towards the Automatics Generation of Bioinformatics Workflow for Web Services Bioinformatics Ontology: Towards the Automatics Generation of Bioinformatics Workflow for Web Services Konagaya Akihiko Project Director Advanced Genome Information Technology Research Group RIKEN Genomic

More information

CS 284A: Algorithms for Computational Biology Notes on Lecture: BLAST. The statistics of alignment scores.

CS 284A: Algorithms for Computational Biology Notes on Lecture: BLAST. The statistics of alignment scores. CS 284A: Algorithms for Computational Biology Notes on Lecture: BLAST. The statistics of alignment scores. prepared by Oleksii Kuchaiev, based on presentation by Xiaohui Xie on February 20th. 1 Introduction

More information

Bioinformatics Database Worksheet

Bioinformatics Database Worksheet Bioinformatics Database Worksheet (based on http://www.usm.maine.edu/~rhodes/goodies/matics.html) Where are the opsin genes in the human genome? Point your browser to the NCBI Map Viewer at http://www.ncbi.nlm.nih.gov/mapview/.

More information

Speeding up Subset Seed Algorithm for Intensive Protein Sequence Comparison

Speeding up Subset Seed Algorithm for Intensive Protein Sequence Comparison Speeding up Subset Seed Algorithm for Intensive Protein Sequence Comparison Van Hoa NGUYEN IRISA/INRIA Rennes Rennes, France Email: vhnguyen@irisa.fr Dominique LAVENIER CNRS/IRISA Rennes, France Email:

More information

Notes for installing a local blast+ instance of NCBI BLAST F. J. Pineda 09/25/2017

Notes for installing a local blast+ instance of NCBI BLAST F. J. Pineda 09/25/2017 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 Notes for installing a local blast+ instance of NCBI BLAST F. J. Pineda 09/25/2017

More information

RDF. Charlie Abela Department of Artificial Intelligence

RDF. Charlie Abela Department of Artificial Intelligence RDF Charlie Abela Department of Artificial Intelligence charlie.abela@um.edu.mt Last Lecture Introduced XPath and XQuery as languages that allow for accessing and extracting node information from XML Problems?

More information

H1 Spring B. Programmers need to learn the SOAP schema so as to offer and use Web services.

H1 Spring B. Programmers need to learn the SOAP schema so as to offer and use Web services. 1. (24 points) Identify all of the following statements that are true about the basics of services. A. If you know that two parties implement SOAP, then you can safely conclude they will interoperate at

More information
