Analyzer of Bio-resource Citations. World Data Center of Microorganisms(WDCM)

1 Analyzer of Bio-resource Citations World Data Center of Microorganisms(WDCM)

3 Outline
Introduction of ABC
Homepage and functions of ABC
Text mining for microorganisms: classification, clustering, and applications

4 ABC -- Analyzer of Bio-resource Citations
Finds citation information for microorganism strains and culture collections in papers and patents.
Finds sequence information for strains and culture collections.

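5 ABC Data resources
Columns: Resource | Dates | In Resource (Journal, Paper, Patent) | In ABC (Journal, Paper, Patent)
Derwent Innovations Index(SM) | 1963~ | In Resource: patents >3 million | In ABC: ,349
PubMed | 1940~ | In Resource: journals >25,000, papers >22 million, patents --- | In ABC: journals 2,048, 49,
Highwire | 1953~ | In Resource: journals 1,774, papers >2 million, patents --- | In ABC: 1, ,
Web of Science | 1900~ | In Resource: journals ~10,000, papers >40 million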

6 Analyzer of Bio-resource Citations: Data resources
Resources: CGMCC, ATCC
Papers: PubMed, HighWire, PMC
Patents: Derwent Innovations Index
Sequences: Genome, GenBank

7 Introduction of ABC: Paper resources
In total, we gathered metadata for 1,880,853 papers from these sources, covering 437 culture collections. From this set we extracted 128,667 strains, found in 173,901 papers.
Source: Papers
Highwire: 1,412,691
PubMed: 692,807
PMC: 113,038
Highwire & PubMed: 298,523
Highwire & PMC: 31,333
PubMed & PMC: 15,750
Highwire & PubMed & PMC: 7,923
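The per-source counts on this slide can be checked against the stated total of 1,880,853 papers. The sketch below assumes the "&" rows are overlaps (papers found in both or all three sources), which is an interpretation of the slide rather than something it states; under that reading, inclusion-exclusion over the three sources reproduces the total.

```python
# Counts copied from the slide; the "&" rows are assumed to be overlaps.
highwire, pubmed, pmc = 1_412_691, 692_807, 113_038
hw_pm, hw_pmc, pm_pmc = 298_523, 31_333, 15_750
all_three = 7_923


def union_size(a, b, c, ab, ac, bc, abc):
    """Size of the union of three overlapping sets (inclusion-exclusion)."""
    return a + b + c - ab - ac - bc + abc


total = union_size(highwire, pubmed, pmc, hw_pm, hw_pmc, pm_pmc, all_three)
print(total)  # 1880853
```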

8 Literature resource: finding the valuable sources
Workflow: Resource (PubMed, Highwire, PMC); Journal ranking; Classify; Paper PDF retrieval
580 culture collections were searched using the prefixes of their strains; 111,240 strains are described in 144,134 papers. We listed the top 200 journals for culture strains (January 1953 to July 2011). More than 11,000 paper PDFs from 17 journals were downloaded to our servers, and a full-text search engine was built on Lucene.

9 Data mining from public resources
PubMed/PMC: securing data by CGI. A Perl script accesses the remote server and stores the data as XML.
HighWire: securing data from the web. Web crawlers download the data and store it as HTML.
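As a rough illustration of the PubMed step, the sketch below builds a request URL for the NCBI E-utilities ESearch endpoint and parses record IDs out of an XML response. It is written in Python rather than the Perl used by the authors, it makes no network call, and the query term and sample response body are made up for the example.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term, db="pubmed", retmax=20):
    """Build an ESearch request URL for a query such as a strain prefix."""
    return ESEARCH + "?" + urlencode({"db": db, "term": term, "retmax": retmax})


def parse_ids(xml_text):
    """Extract the record IDs from an ESearch XML response."""
    root = ET.fromstring(xml_text)
    return [node.text for node in root.findall("./IdList/Id")]


# Hypothetical response body, trimmed to the elements used here.
sample = "<eSearchResult><Count>2</Count><IdList><Id>111</Id><Id>222</Id></IdList></eSearchResult>"

url = esearch_url("CGMCC")
ids = parse_ids(sample)
```

In a real harvest the URL would be fetched and the XML stored on disk before parsing, as the slide describes.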

10 Data mining in full text: the workflow of Lucene
Flat files on our server; target paper list; build index with Lucene; index files; keyword search with Lucene.
Perl script: query strain and organism in the MySQL database; for other queries, MySQL pre_fix_text.
MySQL table: doc_strain_ref_id, title_hash, strain_org, strain_org_id, match_times
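The index-then-search idea behind this workflow can be sketched with a small inverted index. This is plain Python, not the Lucene API, and the document IDs, texts and strain numbers are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each token to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the IDs of documents that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    hits = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        hits &= index.get(token, set())
    return hits


# Hypothetical flat files, keyed by a made-up paper ID.
docs = {
    "paper1": "Strain CGMCC 1.2345 was isolated from soil.",
    "paper2": "We compared JCM 1002 with CGMCC 1.2345.",
    "paper3": "No culture collection is cited here.",
}
index = build_index(docs)
result = search(index, "CGMCC 1.2345")
```

Here `result` holds the two papers that mention the queried strain; a production engine such as Lucene adds ranking, phrase queries and persistent index files on top of this basic structure.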

11 Quality control for strain citation
Strain counts: Level 1 > Level 2 > Level 3
Level 1: Mine strain IDs using the acronym of the culture collection, e.g. JCM, CBS, CGMCC.
Level 2: Filter strain IDs according to the pattern, e.g. JCM+number; CBS+number; CBS+number+.+number; CGMCC+number; CGMCC+number+.+number.
Level 3: Match strain IDs against the catalogue of the culture collection, e.g. JCM, CBS, CGMCC.
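The three levels can be sketched as successive filters. The regular expressions below are one reading of the patterns on the slide (allowing an optional space between acronym and number is an assumption), and the sample sentence and catalogue are hypothetical.

```python
import re

# Level 2 patterns: acronym + number, with an optional ".number" part
# for CBS and CGMCC. The optional whitespace is an assumption.
PATTERNS = {
    "JCM": re.compile(r"\bJCM\s?(\d+)\b"),
    "CBS": re.compile(r"\bCBS\s?(\d+(?:\.\d+)?)\b"),
    "CGMCC": re.compile(r"\bCGMCC\s?(\d+(?:\.\d+)?)\b"),
}


def level1(text):
    """Level 1: acronyms of culture collections mentioned in the text."""
    return {a for a in PATTERNS if re.search(rf"\b{a}\b", text)}


def level2(text):
    """Level 2: strain IDs whose form matches the collection's pattern."""
    ids = set()
    for acronym, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            ids.add(f"{acronym} {match.group(1)}")
    return ids


def level3(ids, catalogue):
    """Level 3: strain IDs that also appear in the collection catalogue."""
    return ids & catalogue


text = "Strains JCM 1002 and CGMCC 1.2345 were used; CBS is a collection in Utrecht."
catalogue = {"JCM 1002"}  # hypothetical catalogue entries

mentions = level1(text)
candidates = level2(text)
confirmed = level3(candidates, catalogue)
```

On this sample the counts shrink from three acronyms to two well-formed IDs to one catalogue match, mirroring the Level 1 > Level 2 > Level 3 ordering on the slide.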

12 Data filter for strain citation
We completed the strain filter for 24 culture collections, including CBS, JCM, BCC, CCARM, LIPIMC, VTCC, MCC-MNH, ITDI, UL, UPCC, NBRC.
For example: JCM (paper/strain), CBS (paper/strain), CGMCC (paper/strain)
Level / / /504
Level / / /451
Level / / /64

13 New Functions in ABC --Microorganism search

14 New Functions in ABC --Dynamic statistics for each culture collection

16 ABC

17 ABC

18 ABC

19 ABC

20 ABC Users can upload paper information and PDF documents related to their culture collection.

21 ABC

23 Text Mining & Application Preprocessing: To Get Keywords

24 Text Mining & Application Example:

25 Text Mining & Application
Application: Automatic Text Classification
Method:
a. First, define the categories and set up the training set.
b. Second, represent each document as a vector and compute the similarity between vectors.
c. Third, assign each document to the category with the highest similarity value.
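The three steps of the method can be sketched as follows. The slide does not say which vector representation or similarity measure was used, so this example assumes bag-of-words term counts and cosine similarity; the tiny training set is invented for illustration.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Represent a document as a bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def train(training_set):
    """Step a: build one summed vector per category from its training texts."""
    centroids = {}
    for category, texts in training_set.items():
        total = Counter()
        for text in texts:
            total.update(vectorize(text))
        centroids[category] = total
    return centroids


def classify(text, centroids):
    """Steps b and c: pick the category with the highest similarity."""
    vec = vectorize(text)
    return max(centroids, key=lambda c: cosine(vec, centroids[c]))


# Hypothetical training texts for two of the categories.
training_set = {
    "Virology": ["virus capsid replication host", "viral genome infection"],
    "Mycology": ["fungus hyphae spore yeast", "fungal mycelium spore"],
}
centroids = train(training_set)
label = classify("spore formation in yeast", centroids)
```

With real data, term weighting such as TF-IDF and a larger training set per category would normally replace the raw counts used here.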

26 Text Mining & Application
Example: Documents from PubMed Central
Class | number | percent
Bacteriology | | %
Environmental Microbiology | | %
Genetics, Microbial | | %
Industrial Microbiology | | %
Mycology | | %
Plant Pathology | | %
Virology | | %
Other | | %

27 Text Mining & Application Score the documents: For short: ijsem. Extract the terms that are related to strains.

28 Text Mining & Application
Score the documents:
term: weight
strain: 3625
acid: 1741
growth: 1407
genus: 1388
sequence: 1382
fatty: 1302
gene: 1298
type: 1244
medium: 777
analysis: 720
novel: 551
international: 497
journal: 441
temperature: 404
atcc: 393
optimum: 379
microbiology: 370
evolutionary: 370
carbon: 312
bacteria: 303
cell: 302
tree: 290
production: 287
description: 281
family: 254
systematic: 244
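The slide lists term weights but not the formula that turns them into a document score. As one possible reading, the sketch below sums the weight of every weighted term occurrence in a document; that formula is an assumption, and only a handful of the listed weights are included.

```python
import re

# A few term weights copied from the slide.
WEIGHTS = {"strain": 3625, "acid": 1741, "growth": 1407, "genus": 1388, "atcc": 393}


def score(text, weights=WEIGHTS):
    """Sum the weights of all weighted terms in the text (assumed formula)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(weights.get(token, 0) for token in tokens)


s = score("The type strain ATCC 700 shows optimum growth.")
print(s)  # 5425 = 3625 (strain) + 393 (atcc) + 1407 (growth)
```

Documents that mention many strain-related terms score higher and could be ranked first on a result page.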

29 Text Mining & Application Result Page Show:

30 Thanks for your attention!


More information

User Guide How to Conduct an Audit By: PCE Systems January, 2018

User Guide How to Conduct an Audit By: PCE Systems January, 2018 User Guide How to Conduct an Audit By: PCE Systems January, 2018 1 How to Conduct an Audit The following steps will outline how to conduct an audit and manage provider responses. This guide assumes audit

More information

2) NCBI BLAST tutorial This is a users guide written by the education department at NCBI.

2) NCBI BLAST tutorial   This is a users guide written by the education department at NCBI. Web resources -- Tour. page 1 of 8 This is a guided tour. Any homework is separate. In fact, this exercise is used for multiple classes and is publicly available to everyone. The entire tour will take

More information

DALA Project: Digital Archive System for Long Term Access

DALA Project: Digital Archive System for Long Term Access 2010 International Conference on Distributed Framework for Multimedia Applications (DFmA) DALA Project: Digital Archive System for Long Term Access Mardhani Riasetiawan 1,2, Ahmad Kamil Mahmood 2 1 Master

More information

Oxford Journals Collection Online. at

Oxford Journals Collection Online. at Oxford Journals Collection Online at www.oupjournals.org The Oxford Journals Collection Over 180 titles in Science Technical Medical (STM), Social Sciences, Arts & Humanities, and Professional Studies

More information

The LAILAPS Search Engine - A Feature Model for Relevance Ranking in Life Science Databases

The LAILAPS Search Engine - A Feature Model for Relevance Ranking in Life Science Databases International Symposium on Integrative Bioinformatics 2010 The LAILAPS Search Engine - A Feature Model for Relevance Ranking in Life Science Databases M Lange, K Spies, C Colmsee, S Flemming, M Klapperstück,

More information

Bruno Martins. 1 st Semester 2012/2013

Bruno Martins. 1 st Semester 2012/2013 Link Analysis Departamento de Engenharia Informática Instituto Superior Técnico 1 st Semester 2012/2013 Slides baseados nos slides oficiais do livro Mining the Web c Soumen Chakrabarti. Outline 1 2 3 4

More information

Page Images STN AnaVist. All information from Database Summary Sheets Additional subject information Current price list

Page Images STN AnaVist. All information from Database Summary Sheets Additional subject information Current price list Subject Coverage File Type Features Database Description Database Language Database Name Database Producer Display Fields File Data Directory Thesaurus Price List Property Fields Sample Records Search

More information

Database of Curated Mutations (DoCM) ournal/v13/n10/full/nmeth.4000.

Database of Curated Mutations (DoCM)     ournal/v13/n10/full/nmeth.4000. Database of Curated Mutations (DoCM) http://docm.genome.wustl.edu/ http://www.nature.com/nmeth/j ournal/v13/n10/full/nmeth.4000.h tml Home Page Information in DoCM DoCM uses many data sources to compile

More information

Our Task At Hand Aggregate data from every group

Our Task At Hand Aggregate data from every group Where magical things happen Our Task At Hand Aggregate data from every group That s not too bad? Make it accessible to the public Just some basic HTML? Simple enough, right? Our Real Task Manage 1 million+

More information

Environmental Sample Classification E.S.C., Josh Katz and Kurt Zimmer

Environmental Sample Classification E.S.C., Josh Katz and Kurt Zimmer Environmental Sample Classification E.S.C., Josh Katz and Kurt Zimmer Goal: The task we were given for the bioinformatics capstone class was to construct an interface for the Pipas lab that integrated

More information

Outline. Possible solutions. The basic problem. How? How? Relevance Feedback, Query Expansion, and Inputs to Ranking Beyond Similarity

Outline. Possible solutions. The basic problem. How? How? Relevance Feedback, Query Expansion, and Inputs to Ranking Beyond Similarity Outline Relevance Feedback, Query Expansion, and Inputs to Ranking Beyond Similarity Lecture 10 CS 410/510 Information Retrieval on the Internet Query reformulation Sources of relevance for feedback Using

More information

Text Mining. Representation of Text Documents

Text Mining. Representation of Text Documents Data Mining is typically concerned with the detection of patterns in numeric data, but very often important (e.g., critical to business) information is stored in the form of text. Unlike numeric data,

More information

Visitor and Faculty Recruitment. Visitor pages. First page is just flat html code

Visitor and Faculty Recruitment. Visitor pages. First page is just flat html code Visitor pages First page is just flat html code 1 Visitor pages The pages are protected by.htaccess with shared username and password the activities that a hosting faculty member might need and what other

More information

What is Discover! Additional Resources CountryWatch MathSciNet Literature Resource Center ProQuest: Historical Newspapers PAIS

What is Discover! Additional Resources CountryWatch MathSciNet Literature Resource Center ProQuest: Historical Newspapers PAIS What is Discover! A single interface to search our library s content. This platform provides users with an easy, yet powerful means of accessing resources through a single search. What resources are searched

More information

Web of Science 8.0. Alain Frey, Customer Education and Support

Web of Science 8.0. Alain Frey, Customer Education and Support Web of Science 8.0 Alain Frey, Customer Education and Support alain.frey@thomsonreuters.com Introduction Web of Science Web interface to the Science Citation Index Expanded Social Sciences Citation Index

More information

Exploring Automated Patent Search with KNIME Possibilities, Limits, Future

Exploring Automated Patent Search with KNIME Possibilities, Limits, Future Exploring Automated Patent Search with KNIME Possibilities, Limits, Future Alexander Klenner-Bajaja, PhD aklenner@epo.org European Patent Office Offices: Berlin, Vienna, Munich, The Hague (Rijswijk), Brussels

More information

Core Technology Development Team Meeting

Core Technology Development Team Meeting Core Technology Development Team Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Derwent Innovations Index

Derwent Innovations Index ISI WEB OF KNOWLEDGE SM Derwent Innovations Index Quick Reference Card Derwent Innovations Index is a powerful patent research tool, combining Derwent World Patents Index, Patents Citation Index TM, and

More information

Managing big biological sequence data with Biostrings and DECIPHER. Erik Wright University of Wisconsin-Madison

Managing big biological sequence data with Biostrings and DECIPHER. Erik Wright University of Wisconsin-Madison Managing big biological sequence data with Biostrings and DECIPHER Erik Wright University of Wisconsin-Madison What you should learn How to use the Biostrings and DECIPHER packages Creating a database

More information