Semantic Knowledge Discovery OntoChem IT Solutions

1 Semantic Knowledge Discovery, OntoChem IT Solutions. OntoChem IT Solutions GmbH, Blücherstr, Halle (Saale), Germany. Mail: info(at)ontochem.com

2 Get the Gold! Towards automated knowledge mining... OntoChem is creating the equipment!

3 Agenda
Available ontologies & software
Services
OCMiner processing pipeline technology
Project examples:
  Extraction of Structure-Property-Relationship (SPR) data
  Compound bio-profiles and bio-similarity
  Automated toxicology reports
  News and alerts

4 Available Ontologies & Software
Out-of-the-box ontologies, taxonomies, vocabularies:
  Chemistry: chemical classes, chemical groups, MolPuzzler, chemformula, alloys... (>100 million terms)
  Gene products: genes, proteins, cleavage products, peptides, mutants, isoforms (>4 million terms)
  Diseases & Effects: diseases (MeSH, SNOMED, ICD10, MedDRA...), physiological effects, pharmacology, cosmetology, nutrition, health, flavors
  Anatomy, Companies, Geopolitical, Species, Persons, Wikipedia...
More than 40 life science taxonomies available

5 Available Ontologies & Software
Out-of-the-box software:
  Ontology editors: suited for very large ontologies, first chemistry ontology editor
  OCMiner enterprise annotation & search engine: web-browser GUI for semantic search and report generation, company intranet indexing and search service
  OCMiner web service (CTD BioCreative IV 2013 Challenge): OntoChem has the fastest service! Integrated into your IT solution
  OCMiner hot-folder annotation service: for your local PC or workgroup

6 Available Ontologies & Software

7 Services
OntoChem service examples:
  custom topical annotation dictionaries, taxonomies, ontologies (multilingual...)
  high-throughput document normalization and standardization (e.g. Excel 2.0 documents, table and image extraction...)
  high-quality chemistry annotation, extraction and searching (ChemAxon, OPSIN, MolPuzzler, anaphora and table treatment, databases)
  workflows to extract compound-property facts (search engines, document classification, reactions, SAR & SPR extraction...)
  exploring the life science knowledge space (triples, N-tuples, databases, chemical reactions, knowledge inference)
  automated report generation (toxicology reports, compound bio-profiles, concept similarity)
  scientific trend analytics, news and alert systems

9 OCMiner UIMA Pipeline (flowchart; boxes listed as extracted, the exact connections between them are not recoverable)
Inputs: text, PDF, XML doc, Office doc
Readers: PDF reader, XML reader, Office reader
Document preparation: identify document type, document classifier, XML detagger, language detector, normalize text, tokenize text, picture PDF OCR, acronym/abbreviation detector, person annotator, document structure
Annotation: domain annotators (1...n), dictionary cleanup & rule combiner, chemistry annotator, name-2-structure, coordinated entity resolution, formula & MolPuzzler, context handler, class/group resolution, NE confidence, relationship extraction
Consumers: BRAT consumer, index consumer, XML consumer
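
The pipeline idea on this slide (a document passes through a chain of processing stages, each of which adds or cleans up information) can be illustrated with a small sketch. This is not OCMiner or UIMA code: the `Document` class, the stage functions, and the three-term dictionary are hypothetical names invented for the illustration, and only two of the stages named on the slide (text normalization and a dictionary-based domain annotator) are imitated.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Document:
    """Toy stand-in for the shared document object that every stage works on."""
    text: str
    # Each annotation is (start, end, label); offsets index into `text`.
    annotations: List[Tuple[int, int, str]] = field(default_factory=list)


Stage = Callable[[Document], Document]


def normalize_text(doc: Document) -> Document:
    """Collapse whitespace runs. Runs before annotation so offsets stay valid."""
    doc.text = re.sub(r"\s+", " ", doc.text).strip()
    return doc


def make_dictionary_annotator(terms: Dict[str, str]) -> Stage:
    """Build a stage that tags case-insensitive whole-word dictionary hits."""
    def annotate(doc: Document) -> Document:
        for term, label in terms.items():
            pattern = r"\b" + re.escape(term) + r"\b"
            for m in re.finditer(pattern, doc.text, re.IGNORECASE):
                doc.annotations.append((m.start(), m.end(), label))
        doc.annotations.sort()
        return doc
    return annotate


def run_pipeline(doc: Document, stages: List[Stage]) -> Document:
    """Apply the stages in order, passing the document from one to the next."""
    for stage in stages:
        doc = stage(doc)
    return doc


# Hypothetical three-entry dictionary; real vocabularies are far larger.
TOY_TERMS = {"phenol": "CHEMISTRY", "aspirin": "CHEMISTRY", "headache": "DISEASE"}

doc = run_pipeline(
    Document("Aspirin   relieves\nheadache."),
    [normalize_text, make_dictionary_annotator(TOY_TERMS)],
)
for start, end, label in doc.annotations:
    print(doc.text[start:end], label)
```

Keeping each stage as a plain function over one shared document object makes it easy to insert, remove, or reorder stages, which is the property the flowchart appears to rely on.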

10 Running OCMiner

11 Running OCMiner

12 Searching with OCMiner

14 Compound Property Data
Where is the data?
  Most biological data on compounds is found in tables: table normalization / correction needed
  Most physico-chemical data on compounds is in the experimental part: structured understanding of experimental part, chemical reactions
  Most SPR relationships are anaphoric:
    label resolution: ...compound 3a has a melting point of C...
    anaphora resolution: ...this phenol is a strong antibiotic
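
Label resolution, as named on this slide, means linking a later mention such as "compound 3a" back to the compound that the label was assigned to earlier in the document. The sketch below shows the idea under strong simplifying assumptions: labels are introduced as "name (label)", mentions have the form "compound <label>", and the sample sentence is made up. The regular expressions and function names are illustrative and are not how OCMiner does it.

```python
import re

# Assumed definition style: a chemical name directly followed by "(label)".
DEFINITION = re.compile(r"([A-Za-z0-9,\-]+)\s+\((\d+[a-z]?)\)")
# Assumed mention style: the word "compound" followed by the label.
MENTION = re.compile(r"\bcompound\s+(\d+[a-z]?)\b", re.IGNORECASE)


def build_label_table(text: str) -> dict:
    """Map each label (e.g. '3a') to the name it was defined with."""
    return {label: name for name, label in DEFINITION.findall(text)}


def resolve_labels(text: str) -> str:
    """Replace 'compound <label>' by 'name (label)' when the label is known."""
    table = build_label_table(text)

    def substitute(match: re.Match) -> str:
        label = match.group(1)
        name = table.get(label)
        # Unknown labels are left untouched rather than guessed.
        return f"{name} ({label})" if name else match.group(0)

    return MENTION.sub(substitute, text)


# Made-up example sentence, not taken from any real document.
sample = "4-chlorophenol (3a) was isolated. Compound 3a was tested next."
print(resolve_labels(sample))
```

Anaphora resolution ("this phenol") is harder than this, because the referring expression is a class name rather than an explicit label, so the candidate antecedents have to be checked against a chemical class hierarchy.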

15 SAR Data in Tables...

16 Molecule Puzzler Markush from Tables

17 ...Extracted into SDF File or Database: 103 compounds, >4000 data points for upload into database
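
For readers unfamiliar with SDF: each record is a molfile block followed by named data items and a `$$$$` terminator, which is what makes the format convenient for carrying extracted property values next to a structure. The sketch below writes one such record. It assumes the molfile block is produced elsewhere (for example by a chemistry toolkit); the empty placeholder structure, the helper name `sdf_record`, and the property names and values are all invented for the illustration.

```python
def sdf_record(molblock: str, properties: dict) -> str:
    """Return one SDF record: molfile block, data items, '$$$$' terminator."""
    lines = [molblock.rstrip("\n")]
    for name, value in properties.items():
        lines.append(f"> <{name}>")   # data header line
        lines.append(str(value))      # data value (must not contain blank lines)
        lines.append("")              # blank line closes the data item
    lines.append("$$$$")
    return "\n".join(lines) + "\n"


# Placeholder molfile with zero atoms and zero bonds (three header lines,
# counts line, end marker). A real record would carry the actual structure.
EMPTY_MOLBLOCK = (
    "placeholder\n"
    "\n"
    "\n"
    "  0  0  0  0  0  0  0  0  0  0999 V2000\n"
    "M  END"
)

# Illustrative property names and values only.
record = sdf_record(EMPTY_MOLBLOCK, {"LABEL": "3a", "SOURCE": "table 1"})
print(record)
```

Concatenating such records, one per compound, gives a file that most cheminformatics tools and database loaders can read.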

18 ...Semantic Property Fact Extraction

20 Co-Occurrences

21 Knowledge Mining

22 Compound Bio-Profiles

23 Compound Bio-Similarity

25 Automated Tox Reports

27 Example: Vitamin News & Alert System (filtering out negative and positive news for vitamins)

28 Example: Fish Oil Alert System


More information

Semantic Annotation and Linking of Medical Educational Resources

Semantic Annotation and Linking of Medical Educational Resources 5 th European IFMBE MBEC, Budapest, September 14-18, 2011 Semantic Annotation and Linking of Medical Educational Resources N. Dovrolis 1, T. Stefanut 2, S. Dietze 3, H.Q. Yu 3, C. Valentine 3 & E. Kaldoudi

More information

Architecting Knowledge Middleware

Architecting Knowledge Middleware Architecting Knowledge Middleware WWW 2002, Honolulu, May 9, 2002 Alfred Z. Spector Vice President, Services and Software IBM Research Division aspector@us.ibm.com Thomas J. Watson Research Center PO Box

More information

Semi-Supervised Abstraction-Augmented String Kernel for bio-relationship Extraction

Semi-Supervised Abstraction-Augmented String Kernel for bio-relationship Extraction Semi-Supervised Abstraction-Augmented String Kernel for bio-relationship Extraction Pavel P. Kuksa, Rutgers University Yanjun Qi, Bing Bai, Ronan Collobert, NEC Labs Jason Weston, Google Research NY Vladimir

More information

Linking SharePoint Documents with Structured Data. Towards Unified Views of Business-critical Information. Andreas Blumauer Director PoolParty Ltd, UK

Linking SharePoint Documents with Structured Data. Towards Unified Views of Business-critical Information. Andreas Blumauer Director PoolParty Ltd, UK Linking SharePoint Documents with Structured Data Towards Unified Views of Business-critical Information Andreas Blumauer Director PoolParty Ltd, UK 2 Andreas Blumauer serves customers Semantic Web Company

More information

Programming in the Life Sciences

Programming in the Life Sciences Programming in the Life Sciences In the Maastricht Science Programme Open PHACTS Community Workshop London, 26 June 2014 1 Who am I? Teacher at Dept. Bioinformatics BiGCaT, NUTRIM, FHML, UM. http://chem-bla-ics.blogspot.com/

More information

JChem Extensions for KNIME KNIME.com products

JChem Extensions for KNIME KNIME.com products JChem Extensions for KNIME KNIME.com products ChemAxon 2011 US User Group Meeting San Diego, CA Takahiro Ohshima Overview INFOCOM KNIME JChem Extensions Marvin Family Nodes KNIME.com products KNIME Enterprise

More information

EVENT EXTRACTION WITH COMPLEX EVENT CLASSIFICATION USING RICH FEATURES

EVENT EXTRACTION WITH COMPLEX EVENT CLASSIFICATION USING RICH FEATURES Journal of Bioinformatics and Computational Biology Vol. 8, No. 1 (2010) 131 146 c 2010 The Authors DOI: 10.1142/S0219720010004586 EVENT EXTRACTION WITH COMPLEX EVENT CLASSIFICATION USING RICH FEATURES

More information

Enabling Open Science: Data Discoverability, Access and Use. Jo McEntyre Head of Literature Services

Enabling Open Science: Data Discoverability, Access and Use. Jo McEntyre Head of Literature Services Enabling Open Science: Data Discoverability, Access and Use Jo McEntyre Head of Literature Services www.ebi.ac.uk About EMBL-EBI Part of the European Molecular Biology Laboratory International, non-profit

More information

TEXT MINING: THE NEXT DATA FRONTIER

TEXT MINING: THE NEXT DATA FRONTIER TEXT MINING: THE NEXT DATA FRONTIER An Infrastructural Approach Dr. Petr Knoth CORE (core.ac.uk) Knowledge Media institute, The Open University United Kingdom 2 OpenMinTeD Establish an open and sustainable

More information

An UIMA based Tool Suite for Semantic Text Processing

An UIMA based Tool Suite for Semantic Text Processing An UIMA based Tool Suite for Semantic Text Processing Katrin Tomanek, Ekaterina Buyko, Udo Hahn Jena University Language & Information Engineering Lab StemNet Knowledge Management for Immunology in life

More information

PROJECT PERIODIC REPORT

PROJECT PERIODIC REPORT PROJECT PERIODIC REPORT Grant Agreement number: 257403 Project acronym: CUBIST Project title: Combining and Uniting Business Intelligence and Semantic Technologies Funding Scheme: STREP Date of latest

More information

Software review. Biomolecular Interaction Network Database

Software review. Biomolecular Interaction Network Database Biomolecular Interaction Network Database Keywords: protein interactions, visualisation, biology data integration, web access Abstract This software review looks at the utility of the Biomolecular Interaction

More information

Investigating Collaboration Dynamics in Different Ontology Development Environments

Investigating Collaboration Dynamics in Different Ontology Development Environments Investigating Collaboration Dynamics in Different Ontology Development Environments Marco Rospocher DKM, Fondazione Bruno Kessler rospocher@fbk.eu Tania Tudorache and Mark Musen BMIR, Stanford University

More information

Supervised Models for Coreference Resolution [Rahman & Ng, EMNLP09] Running Example. Mention Pair Model. Mention Pair Example

Supervised Models for Coreference Resolution [Rahman & Ng, EMNLP09] Running Example. Mention Pair Model. Mention Pair Example Supervised Models for Coreference Resolution [Rahman & Ng, EMNLP09] Many machine learning models for coreference resolution have been created, using not only different feature sets but also fundamentally

More information

User Manual. Ver. 3.0 March 19, 2012

User Manual. Ver. 3.0 March 19, 2012 User Manual Ver. 3.0 March 19, 2012 Table of Contents 1. Introduction... 2 1.1 Rationale... 2 1.2 Software Work-Flow... 3 1.3 New in GenomeGems 3.0... 4 2. Software Description... 5 2.1 Key Features...

More information

National metadata repository for databases of registers and trials

National metadata repository for databases of registers and trials National metadata repository for databases of registers and trials IBE, Medical Faculty, Ludwig-Maximilians-Universität München, Germany Institute for Medical Informatics (IMISE), Universität Leipzig,

More information

Ontology-based annotation of multiscale imaging data: Utilizing and building the Neuroscience Information Framework. Maryann E.

Ontology-based annotation of multiscale imaging data: Utilizing and building the Neuroscience Information Framework. Maryann E. Ontology-based annotation of multiscale imaging data: Utilizing and building the Neuroscience Information Framework Maryann E. Martone University of California, San Diego What does this mean? 3D Volumes

More information

ANALYSIS OF LARGE GRAPH DATA WITH GRADOOP AND KNIME

ANALYSIS OF LARGE GRAPH DATA WITH GRADOOP AND KNIME ANALYSIS OF LARGE GRAPH DATA WITH GRADOOP AND KNIME ALEXANDER KIPP (ROBERT BOSCH GMBH), STEFFEN DIENST, STEFAN KÜHNE (UNIVERSITÄT LEIPZIG), TOBIAS KÖTTER (KNIME) Bosch Smart Semantics Application fields

More information

SciFinder Training Materials

SciFinder Training Materials SciFinder Training Materials # Contents Page How to Create a Substance Answer Set - Search by chemical structure, molecular formula, and substance identifier How to Work with a Substance Answer Set - Analyze

More information

Semantic MediaWiki A Tool for Collaborative Vocabulary Development Harold Solbrig Division of Biomedical Informatics Mayo Clinic

Semantic MediaWiki A Tool for Collaborative Vocabulary Development Harold Solbrig Division of Biomedical Informatics Mayo Clinic Semantic MediaWiki A Tool for Collaborative Vocabulary Development Harold Solbrig Division of Biomedical Informatics Mayo Clinic Outline MediaWiki what it is, how it works Semantic MediaWiki MediaWiki

More information

How Co-Occurrence can Complement Semantics?

How Co-Occurrence can Complement Semantics? How Co-Occurrence can Complement Semantics? Atanas Kiryakov & Borislav Popov ISWC 2006, Athens, GA Semantic Annotations: 2002 #2 Semantic Annotation: How and Why? Information extraction (text-mining) for

More information