Exploring Automated Patent Search with KNIME: Possibilities, Limits, Future

1 Exploring Automated Patent Search with KNIME: Possibilities, Limits, Future. Alexander Klenner-Bajaja, PhD

2 European Patent Office. Offices: Berlin, Vienna, Munich, The Hague (Rijswijk), Brussels. Staff: over 7000. Mission: granting European patents. New building in Rijswijk, finished approx

3 Why Search? European Patent Convention. Information Management's task: support search

4 Patent Search: filing numbers increase

5 Introduction: What do we want, where are we? Application, Query Formulation, Search, Prior Art (relevant documents)

6 The EPO's tool landscape: Translation API, Boolean Search Engine, Search Evaluation, Lucene Search Engine, Concept Extraction, MetaData, Neo4J, Fulltext, MongoDB, ImageDB, GoldStandard, SQL

10 One platform to allow rapid prototyping and evaluation

11 The current search system: a Lucene/Elasticsearch-based system; documents are returned as ranked lists. (Diagram: search space, Lucene query, ranked results 1 to k)

12 Patent Gold Standards: We have manually curated search reports for about 40 million simple patent families. The relevant documents are mentioned in the search report as X (I, N), A, Y, ... documents. Median: 5 citations per search report.

13 Setting up a benchmarking environment: We need to move away from anecdotal evidence to statistically meaningful facts. (Diagram: TAPAS applications, exploiting real queries, are run against a search index built on the patent corpus; Method 1 and Method 2 are evaluated, with example scores MAP: 0.4 and MAP: 0.2)
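The MAP scores on this slide can be illustrated with a minimal sketch of mean average precision, computed against gold-standard citations. The document IDs, the two toy methods, and the function names are hypothetical; the slides do not show how the EPO benchmark computes its scores.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against its gold-standard citations."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant hit
    return precision_sum / len(relevant)


def mean_average_precision(runs: Dict[str, List[str]],
                           gold: Dict[str, Set[str]]) -> float:
    """Mean of the per-query average precision over all gold-standard queries."""
    if not gold:
        return 0.0
    total = sum(average_precision(runs.get(query, []), relevant)
                for query, relevant in gold.items())
    return total / len(gold)


# Hypothetical toy data: two applications with their cited documents.
gold = {"app1": {"D1", "D4"}, "app2": {"D7"}}
method_1 = {"app1": ["D1", "D2", "D4"], "app2": ["D7", "D9"]}
method_2 = {"app1": ["D2", "D3", "D1"], "app2": ["D9", "D7"]}

map_1 = mean_average_precision(method_1, gold)
map_2 = mean_average_precision(method_2, gold)
print(f"Method 1 MAP: {map_1:.3f}")
print(f"Method 2 MAP: {map_2:.3f}")
```

Averaging over many real applications, rather than inspecting a handful of searches by hand, is what turns a comparison of two methods into a statistically meaningful statement.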

14 Using KNIME to enhance automated search: Evaluation and Feedback. Application, Query Formulation, Search, Prior Art (relevant documents)

15 Use Case: Does translation help to find relevant prior art?

16 Evaluating the results: Does translation help to find relevant prior art? (Figure: % relevant retrieved, up to 100, plotted against queries; series: Relevant, Combined, Translated, Original, Expected results)
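The "% relevant retrieved" measure in this figure can be sketched as follows for original, translated, and combined queries. The slide does not say how the combined result is formed; merging the two result lists as a union is an assumption here, and all document IDs are hypothetical.

```python
from typing import Iterable, List, Set


def percent_relevant_retrieved(retrieved: Iterable[str], relevant: Set[str]) -> float:
    """Share (in percent) of the gold-standard documents found in the results."""
    if not relevant:
        return 0.0
    found = relevant.intersection(retrieved)
    return 100.0 * len(found) / len(relevant)


def combine(original: List[str], translated: List[str]) -> List[str]:
    """Union of two result lists in first-seen order (assumed merge strategy)."""
    seen = set()
    merged = []
    for doc_id in original + translated:
        if doc_id not in seen:
            seen.add(doc_id)
            merged.append(doc_id)
    return merged


# Hypothetical gold standard and result lists for one application.
relevant = {"EP1", "EP2", "DE3", "FR4"}
original_hits = ["EP1", "US9", "EP2"]
translated_hits = ["DE3", "EP1", "JP8"]
combined_hits = combine(original_hits, translated_hits)

scores = {
    "original": percent_relevant_retrieved(original_hits, relevant),
    "translated": percent_relevant_retrieved(translated_hits, relevant),
    "combined": percent_relevant_retrieved(combined_hits, relevant),
}
for name, value in scores.items():
    print(f"{name}: {value:.1f} % relevant retrieved")
```

A union can only match or exceed the recall of either input list, so a fair comparison would also fix the number of results an examiner is expected to inspect.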

17 Patents have multi-modal information content. Images: chemical formulas, flow diagrams, circuits, technical drawings

18 Image search the Google way: Standard Google image search (very strong on real-world images) is currently not suited for technical drawings. An extremely complicated endeavour. (Figure: Google results)

19 Use Case: Image Search. Query; Filtering and Visualisation; Search Space; state-of-the-art image processing

20 Image Search: Adaptive Hierarchical Density Histogram (AHDH). Exploits the distribution of image points. (Figure: new results, AHDH closest match)
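The idea of exploiting the distribution of image points can be sketched in a heavily simplified form. This is not the published AHDH descriptor: it only splits the black pixels of a binary drawing recursively at their centroid and records what share of the points falls into each quadrant. All function names and the toy image are assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[int, int]  # (row, column) of a black pixel


def black_points(image: List[List[int]]) -> List[Point]:
    """Collect coordinates of all non-zero (black) pixels."""
    return [(r, c) for r, row in enumerate(image) for c, v in enumerate(row) if v]


def split_at_centroid(points: List[Point]) -> List[List[Point]]:
    """Partition points into four quadrants around their centroid."""
    centre_r = sum(r for r, _ in points) / len(points)
    centre_c = sum(c for _, c in points) / len(points)
    quadrants: List[List[Point]] = [[], [], [], []]
    for r, c in points:
        index = (2 if r >= centre_r else 0) + (1 if c >= centre_c else 0)
        quadrants[index].append((r, c))
    return quadrants


def density_histogram(image: List[List[int]], levels: int = 2) -> List[float]:
    """Concatenate per-quadrant point shares over several refinement levels."""
    regions = [black_points(image)]
    features: List[float] = []
    for _ in range(levels):
        next_regions: List[List[Point]] = []
        for points in regions:
            if points:
                quadrants = split_at_centroid(points)
                features.extend(len(q) / len(points) for q in quadrants)
            else:
                quadrants = [[], [], [], []]  # empty region stays empty
                features.extend([0.0] * 4)
            next_regions.extend(quadrants)
        regions = next_regions
    return features


def l1_distance(a: List[float], b: List[float]) -> float:
    """Smaller distance means more similar point distributions."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical 4x4 binary drawing with a black pixel in each corner.
corners = [[1, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0], [1, 0, 0, 1]]
features = density_histogram(corners, levels=2)
print(len(features), features[:4])
```

A closest match would then be the indexed drawing whose feature vector has the smallest distance to the query's vector.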

21 Can we optimize our search tool parameters?

22 Use Case: Parameter Optimisation. Swarm optimization
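The slide names only "swarm optimization". Assuming this means particle swarm optimisation, a minimal sketch of tuning two search parameters could look like the following. The parameter names (`title_boost`, `claims_boost`) and the toy loss are hypothetical; in a real benchmark the objective would more plausibly be something like 1 minus MAP on the gold standard.

```python
import random
from typing import Callable, List, Sequence, Tuple


def particle_swarm_minimize(
    objective: Callable[[Sequence[float]], float],
    bounds: List[Tuple[float, float]],
    n_particles: int = 20,
    n_iterations: int = 100,
    inertia: float = 0.7,
    cognitive: float = 1.5,
    social: float = 1.5,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Basic global-best particle swarm optimisation within box bounds."""
    rng = random.Random(seed)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    velocities = [[0.0] * len(bounds) for _ in range(n_particles)]
    best_positions = [p[:] for p in positions]
    best_scores = [objective(p) for p in positions]
    g = min(range(n_particles), key=best_scores.__getitem__)
    global_best, global_score = best_positions[g][:], best_scores[g]

    for _ in range(n_iterations):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Pull towards the particle's own best and the swarm's best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + cognitive * r1 * (best_positions[i][d] - positions[i][d])
                    + social * r2 * (global_best[d] - positions[i][d])
                )
                # Move and clamp to the allowed parameter range.
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))
            score = objective(positions[i])
            if score < best_scores[i]:
                best_positions[i], best_scores[i] = positions[i][:], score
                if score < global_score:
                    global_best, global_score = positions[i][:], score
    return global_best, global_score


def toy_loss(params: Sequence[float]) -> float:
    """Hypothetical stand-in for a search-quality loss with optimum at (2.0, 0.5)."""
    title_boost, claims_boost = params
    return (title_boost - 2.0) ** 2 + (claims_boost - 0.5) ** 2


best_params, best_loss = particle_swarm_minimize(toy_loss, [(0.0, 5.0), (0.0, 5.0)])
print(best_params, best_loss)
```

Because each objective evaluation would mean re-running the benchmark queries, the number of particles and iterations is the main cost driver in such a setup.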

23 Can we optimize our search tool parameters? (Figure: Start, End)

24 What are the possibilities? We were able to create a system in which all components share the same (table-based) interface. We make full use of many existing components and combine them successfully with additional internal and external developments: text mining, image similarity, machine-learning-based document similarity, influence of translation on patent search, rapid prototyping, evaluation and feedback.

25 What are the limits? Very practical issues such as memory freezes of the Eclipse environment. Sometimes extreme overhead is introduced, e.g. by text mining nodes (String to Document). In our experience, not suitable for full-corpus analytics (100 million full-text documents and more). Getting lost in yellow nodes (ungroup, pivot, regroup, filter, missing value, ...).

26 Future: Exposing stable services as web services through the KNIME Server web interface. Exposing workflow creation to a wider range of users (right now it's IM only). Connecting more services. Using streaming to overcome some of the limits.

27 Thank you for your attention

29 Supplement

Effect of log-based Query Term Expansion on Retrieval Effectiveness in Patent Searching

Effect of log-based Query Term Expansion on Retrieval Effectiveness in Patent Searching Effect of log-based Query Term Expansion on Retrieval Effectiveness in Patent Searching Wolfgang Tannebaum, Parvaz Madabi and Andreas Rauber Institute of Software Technology and Interactive Systems, Vienna

More information

Conference of Directors of National Libraries in Asia and Oceania. Hanoi, 20 April 2009

Conference of Directors of National Libraries in Asia and Oceania. Hanoi, 20 April 2009 Conference of Directors of National Libraries in Asia and Oceania Hanoi, 20 April 2009 Use of Open Source Software at the National Library of Australia Jan Fullerton Director-General National Library of

More information

Patent Terminlogy Analysis: Passage Retrieval Experiments for the Intellecutal Property Track at CLEF

Patent Terminlogy Analysis: Passage Retrieval Experiments for the Intellecutal Property Track at CLEF Patent Terminlogy Analysis: Passage Retrieval Experiments for the Intellecutal Property Track at CLEF Julia Jürgens, Sebastian Kastner, Christa Womser-Hacker, and Thomas Mandl University of Hildesheim,

More information

Knowledge Retrieval. Franz J. Kurfess. Computer Science Department California Polytechnic State University San Luis Obispo, CA, U.S.A.

Knowledge Retrieval. Franz J. Kurfess. Computer Science Department California Polytechnic State University San Luis Obispo, CA, U.S.A. Knowledge Retrieval Franz J. Kurfess Computer Science Department California Polytechnic State University San Luis Obispo, CA, U.S.A. 1 Acknowledgements This lecture series has been sponsored by the European

More information

Unit 10 Databases. Computer Concepts Unit Contents. 10 Operational and Analytical Databases. 10 Section A: Database Basics

Unit 10 Databases. Computer Concepts Unit Contents. 10 Operational and Analytical Databases. 10 Section A: Database Basics Unit 10 Databases Computer Concepts 2016 ENHANCED EDITION 10 Unit Contents Section A: Database Basics Section B: Database Tools Section C: Database Design Section D: SQL Section E: Big Data Unit 10: Databases

More information

Large Scale Graph Solutions: Use-cases And Lessons Learnt

Large Scale Graph Solutions: Use-cases And Lessons Learnt Large Scale Graph Solutions: Use-cases And Lessons Learnt Principal Engineer, AI/Cloud Platforms Abstraction Is The Art Euler s Bridges - Seven Bridges of Königsberg G = (V, E); V(id, attr1, attr2,..);

More information

Giving Your Headings Meaningful Names (Desktop and Plus) p. 158 Rearranging the Order of the Output p. 160 Formatting Data p. 163 Formatting Columns

Giving Your Headings Meaningful Names (Desktop and Plus) p. 158 Rearranging the Order of the Output p. 160 Formatting Data p. 163 Formatting Columns Acknowledgments p. xxi Introduction p. xxiii Getting Started with Discoverer An Overview of Discoverer p. 3 Business Intelligence and Your Organization p. 4 Business Intelligence and Trends p. 5 Discoverer's

More information

Europeana Core Service Platform

Europeana Core Service Platform Europeana Core Service Platform DELIVERABLE D7.1: Strategic Development Plan, Architectural Planning Revision Final Date of submission 30 October 2015 Author(s) Marcin Werla, PSNC Pavel Kats, Europeana

More information

An overview of Graph Categories and Graph Primitives

An overview of Graph Categories and Graph Primitives An overview of Graph Categories and Graph Primitives Dino Ienco (dino.ienco@irstea.fr) https://sites.google.com/site/dinoienco/ Topics I m interested in: Graph Database and Graph Data Mining Social Network

More information

Can you name one application that does not need any data? Can you name one application that does not need organized data?

Can you name one application that does not need any data? Can you name one application that does not need organized data? Introduction Why Databases? Can you name one application that does not need any data? No, a program itself is data Can you name one application that does not need organized data? No, programs = algorithms

More information

Invenio: A Modern Digital Library for Grey Literature

Invenio: A Modern Digital Library for Grey Literature Invenio: A Modern Digital Library for Grey Literature Jérôme Caffaro, CERN Samuele Kaplun, CERN November 25, 2010 Abstract Grey literature has historically played a key role for researchers in the field

More information

Sustaining Personalization

Sustaining Personalization Sustaining Personalization How to Run a Successful Personalization Program John Berndt CEO and Chief Strategist, The Berndt Group Ed Kapuscisnki Senior Engineer, The Berndt Group www.berndtrgroup.net Lets

More information

Open Source Search. Andreas Pesenhofer. max.recall information systems GmbH Künstlergasse 11/1 A-1150 Wien Austria

Open Source Search. Andreas Pesenhofer. max.recall information systems GmbH Künstlergasse 11/1 A-1150 Wien Austria Open Source Search Andreas Pesenhofer max.recall information systems GmbH Künstlergasse 11/1 A-1150 Wien Austria max.recall information systems max.recall is a software and consulting company enabling

More information

Core Technology Development Team Meeting

Core Technology Development Team Meeting Core Technology Development Team Meeting To hear the meeting, you must call in Toll-free phone number: 1-866-740-1260 Access Code: 2201876 For international call in numbers, please visit: https://www.readytalk.com/account-administration/international-numbers

More information

Web of Science. Platform Release Nina Chang Product Release Date: March 25, 2018 EXTERNAL RELEASE DOCUMENTATION

Web of Science. Platform Release Nina Chang Product Release Date: March 25, 2018 EXTERNAL RELEASE DOCUMENTATION Web of Science EXTERNAL RELEASE DOCUMENTATION Platform Release 5.28 Nina Chang Product Release Date: March 25, 2018 Document Version: 1.0 Date of issue: March 22, 2018 RELEASE OVERVIEW The following features

More information

Knowledge Engineering and Data Mining. Knowledge engineering has 6 basic phases:

Knowledge Engineering and Data Mining. Knowledge engineering has 6 basic phases: Knowledge Engineering and Data Mining Knowledge Engineering The process of building intelligent knowledge based systems is called knowledge engineering Knowledge engineering has 6 basic phases: 1. Problem

More information

Make the most of your access to ScienceDirect

Make the most of your access to ScienceDirect 1 Make the most of your access to ScienceDirect Present Future 2 ScienceDirect Training Deck We re here to help you make the most of your access to ScienceDirect. ScienceDirect offers researchers the latest

More information

Powering Knowledge Discovery. Insights from big data with Linguamatics I2E

Powering Knowledge Discovery. Insights from big data with Linguamatics I2E Powering Knowledge Discovery Insights from big data with Linguamatics I2E Gain actionable insights from unstructured data The world now generates an overwhelming amount of data, most of it written in natural

More information

Clustering. Robert M. Haralick. Computer Science, Graduate Center City University of New York

Clustering. Robert M. Haralick. Computer Science, Graduate Center City University of New York Clustering Robert M. Haralick Computer Science, Graduate Center City University of New York Outline K-means 1 K-means 2 3 4 5 Clustering K-means The purpose of clustering is to determine the similarity

More information

1

1 1 2 3 6 7 8 9 10 Storage & IO Benchmarking Primer Running sysbench and preparing data Use the prepare option to generate the data. Experiments Run sysbench with different storage systems and instance

More information

Distributed Databases: SQL vs NoSQL

Distributed Databases: SQL vs NoSQL Distributed Databases: SQL vs NoSQL Seda Unal, Yuchen Zheng April 23, 2017 1 Introduction Distributed databases have become increasingly popular in the era of big data because of their advantages over

More information

Creating a Recommender System. An Elasticsearch & Apache Spark approach

Creating a Recommender System. An Elasticsearch & Apache Spark approach Creating a Recommender System An Elasticsearch & Apache Spark approach My Profile SKILLS Álvaro Santos Andrés Big Data & Analytics Solution Architect in Ericsson with more than 12 years of experience focused

More information

Big Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara

Big Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara Big Data Technology Ecosystem Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara Agenda End-to-End Data Delivery Platform Ecosystem of Data Technologies Mapping an End-to-End Solution Case

More information

PROJECT PERIODIC REPORT

PROJECT PERIODIC REPORT PROJECT PERIODIC REPORT Grant Agreement number: 257403 Project acronym: CUBIST Project title: Combining and Uniting Business Intelligence and Semantic Technologies Funding Scheme: STREP Date of latest

More information

Automatically Generating Queries for Prior Art Search

Automatically Generating Queries for Prior Art Search Automatically Generating Queries for Prior Art Search Erik Graf, Leif Azzopardi, Keith van Rijsbergen University of Glasgow {graf,leif,keith}@dcs.gla.ac.uk Abstract This report outlines our participation

More information

Cross Reference Strategies for Cooperative Modalities

Cross Reference Strategies for Cooperative Modalities Cross Reference Strategies for Cooperative Modalities D.SRIKAR*1 CH.S.V.V.S.N.MURTHY*2 Department of Computer Science and Engineering, Sri Sai Aditya institute of Science and Technology Department of Information

More information

A Process-Centric Data Mining and Visual Analytic Tool for Exploring Complex Social Networks

A Process-Centric Data Mining and Visual Analytic Tool for Exploring Complex Social Networks KDD 2013 Workshop on Interactive Data Exploration and Analytics (IDEA) A Process-Centric Data Mining and Visual Analytic Tool for Exploring Complex Social Networks Denis Dimitrov, Georgetown University,

More information

CPC Update - EPO CPC Annual Meeting for National Offices. Pierre Held Georg Schiwy Geneva, 19 February 2019

CPC Update - EPO CPC Annual Meeting for National Offices. Pierre Held Georg Schiwy Geneva, 19 February 2019 CPC Update - EPO CPC Annual Meeting for National Offices Pierre Held Georg Schiwy Geneva, 19 February 2019 Agenda: CPC implementation at National Offices IT Matters CPC International project: impact on

More information

Computational Databases: Inspirations from Statistical Software. Linnea Passing, Technical University of Munich

Computational Databases: Inspirations from Statistical Software. Linnea Passing, Technical University of Munich Computational Databases: Inspirations from Statistical Software Linnea Passing, linnea.passing@tum.de Technical University of Munich Data Science Meets Databases Data Cleansing Pipelines Fuzzy joins Data

More information

Gale Digital Scholar Lab Getting Started Walkthrough Guide

Gale Digital Scholar Lab Getting Started Walkthrough Guide Getting Started Logging In Your library or institution will provide you with your login link. You will have the option to sign in with a Google or Microsoft Account, this is so you have a personal account

More information

Data Management Glossary

Data Management Glossary Data Management Glossary A Access path: The route through a system by which data is found, accessed and retrieved Agile methodology: An approach to software development which takes incremental, iterative

More information

Word Disambiguation in Web Search

Word Disambiguation in Web Search Word Disambiguation in Web Search Rekha Jain Computer Science, Banasthali University, Rajasthan, India Email: rekha_leo2003@rediffmail.com G.N. Purohit Computer Science, Banasthali University, Rajasthan,

More information

Shaping Business Strategy Through Competitive Intelligence: strategic use of Intellectual Property Information

Shaping Business Strategy Through Competitive Intelligence: strategic use of Intellectual Property Information Topic 7 Shaping Business Strategy Through Competitive Intelligence: strategic use of Intellectual Property Information Training of Trainer s Program, Teheran 9 June2015 By Matthias Kuhn, MBA University

More information

Metadata Ingestion and Processinng

Metadata Ingestion and Processinng biomedical and healthcare Data Discovery Index Ecosystem Ingestion and Processinng Jeffrey S. Grethe, Ph.D. 2017 BioCADDIE All Hands Meeting prototype Ingestion Indexing Repositories Ingestion ElasticSearch

More information

Why NoSQL? Why Riak?

Why NoSQL? Why Riak? Why NoSQL? Why Riak? Justin Sheehy justin@basho.com 1 What's all of this NoSQL nonsense? Riak Voldemort HBase MongoDB Neo4j Cassandra CouchDB Membase Redis (and the list goes on...) 2 What went wrong with

More information

Annotating Spatio-Temporal Information in Documents

Annotating Spatio-Temporal Information in Documents Annotating Spatio-Temporal Information in Documents Jannik Strötgen University of Heidelberg Institute of Computer Science Database Systems Research Group http://dbs.ifi.uni-heidelberg.de stroetgen@uni-hd.de

More information

Chapter 5. Database Processing

Chapter 5. Database Processing Chapter 5 Database Processing No, Drew, You Don t Know Anything About Creating Queries." AllRoad Parts operational database used to determine which parts to consider for 3D printing. If Addison and Drew

More information

Search Engine Architecture. Hongning Wang

Search Engine Architecture. Hongning Wang Search Engine Architecture Hongning Wang CS@UVa CS@UVa CS4501: Information Retrieval 2 Document Analyzer Classical search engine architecture The Anatomy of a Large-Scale Hypertextual Web Search Engine

More information

SAP Analytics Cloud Best Practices for BI Platform Live Universes

SAP Analytics Cloud Best Practices for BI Platform Live Universes NOTE: Delete the yellow stickers when finished. See the SAP Image Library for other available images. Once the custom image is inserted, click Format Send Backward Send to Back, so the motion band is on

More information

Multi-modal Information Retrieval experiences from Context-Aware Image Management, CAIM

Multi-modal Information Retrieval experiences from Context-Aware Image Management, CAIM Multi-modal Information Retrieval experiences from Context-Aware Image Management, CAIM Joan Nordbotten Dept. Of Information and Media Science University of Bergen, Norway 1 Outline Multi-modal Information

More information

TEXT MINING: THE NEXT DATA FRONTIER

TEXT MINING: THE NEXT DATA FRONTIER TEXT MINING: THE NEXT DATA FRONTIER An Infrastructural Approach Dr. Petr Knoth CORE (core.ac.uk) Knowledge Media institute, The Open University United Kingdom 2 OpenMinTeD Establish an open and sustainable

More information

Combining Information Retrieval and Relevance Feedback for Concept Location

Combining Information Retrieval and Relevance Feedback for Concept Location Combining Information Retrieval and Relevance Feedback for Concept Location Sonia Haiduc - WSU CS Graduate Seminar - Jan 19, 2010 1 Software changes Software maintenance: 50-90% of the global costs of

More information

CONCLUSIONS AND RECOMMENDATIONS

CONCLUSIONS AND RECOMMENDATIONS Chapter 4 CONCLUSIONS AND RECOMMENDATIONS UNDP and the Special Unit have considerable experience in South-South cooperation and are well positioned to play a more active and effective role in supporting

More information

Social Media Intelligence Text and Network Mining combined. Dr. Rosaria Silipo

Social Media Intelligence Text and Network Mining combined. Dr. Rosaria Silipo Social Media Intelligence Text and Network Mining combined Dr. Rosaria Silipo rosariasilipo@yahoo.com Previously on PAW... PAW San Francisco 2012 2 Social Media Analysis Water Water Everywhere, and not

More information

Web of Science. Platform Release Nina Chang Product Release Date: December 10, 2017 EXTERNAL RELEASE DOCUMENTATION

Web of Science. Platform Release Nina Chang Product Release Date: December 10, 2017 EXTERNAL RELEASE DOCUMENTATION Web of Science EXTERNAL RELEASE DOCUMENTATION Platform Release 5.27 Nina Chang Product Release Date: December 10, 2017 Document Version: 1.0 Date of issue: December 7, 2017 RELEASE OVERVIEW The following

More information

Patent Image Retrieval

Patent Image Retrieval Patent Image Retrieval Stefanos Vrochidis IRF Symposium 2008 Vienna, November 6, 2008 Aristotle University of Thessaloniki Overview 1. Introduction 2. Related Work in Patent Image Retrieval 3. Patent Image

More information

One Search Many Answers

One Search Many Answers One Search Many Answers Bringing together results from multiple databases through the DiscoveryGate Platform Carmen Nitsche, VP Content Fall 2009 ACS Meeting Washington, D.C. Information Driven R&D Is

More information

A B2B Search Engine. Abstract. Motivation. Challenges. Technical Report

A B2B Search Engine. Abstract. Motivation. Challenges. Technical Report Technical Report A B2B Search Engine Abstract In this report, we describe a business-to-business search engine that allows searching for potential customers with highly-specific queries. Currently over

More information

NOMAD Metadata for all

NOMAD Metadata for all EMMC Workshop on Interoperability NOMAD Metadata for all Cambridge, 8 Nov 2017 Fawzi Mohamed FHI Berlin NOMAD Center of excellence goals 200,000 materials known to exist basic properties for very few highly

More information

Electronic Records Management System (ERMS)

Electronic Records Management System (ERMS) Electronic Records Management System (ERMS) ERMS Folder Creation and Titling Conventions 2015 Contents 1. Introduction... 3 2. ERMS Folders... 3 3. Creating an ERMS Folder... 4 4. ERMS Folder Creation

More information

A data-driven framework for archiving and exploring social media data

A data-driven framework for archiving and exploring social media data A data-driven framework for archiving and exploring social media data Qunying Huang and Chen Xu Yongqi An, 20599957 Oct 18, 2016 Introduction Social media applications are widely deployed in various platforms

More information

Semantic Web Company. PoolParty - Server. PoolParty - Technical White Paper.

Semantic Web Company. PoolParty - Server. PoolParty - Technical White Paper. Semantic Web Company PoolParty - Server PoolParty - Technical White Paper http://www.poolparty.biz Table of Contents Introduction... 3 PoolParty Technical Overview... 3 PoolParty Components Overview...

More information

Extracting reproducible simulation studies from model repositories using the CombineArchive Toolkit

Extracting reproducible simulation studies from model repositories using the CombineArchive Toolkit Extracting reproducible simulation studies from model repositories using the CombineArchive Toolkit Martin Scharm, Dagmar Waltemath Department of Systems Biology and Bioinformatics University of Rostock

More information

Linked Data in Archives

Linked Data in Archives Linked Data in Archives Publish, Enrich, Refine, Reconcile, Relate Presented 2012-08-23 SAA 2012, Linking Data Across Libraries, Archives, and Museums Corey A Harper Semantic Web TBL s original vision

More information

Rapid Application Development using InforSense Open Workflow and Oracle Chemistry Cartridge Technologies

Rapid Application Development using InforSense Open Workflow and Oracle Chemistry Cartridge Technologies Rapid Application Development using InforSense Open Workflow and Oracle Chemistry Cartridge Technologies Anthony C. Arvanites Lead Discovery Informatics Company Introduction Founded: 1999 Platform: Combining

More information

Bringing Europeana and CLARIN together: Dissemination and exploitation of cultural heritage data in a research infrastructure

Bringing Europeana and CLARIN together: Dissemination and exploitation of cultural heritage data in a research infrastructure Bringing Europeana and CLARIN together: Dissemination and exploitation of cultural heritage data in a research infrastructure Twan Goosen 1 (CLARIN ERIC), Nuno Freire 2, Clemens Neudecker 3, Maria Eskevich

More information

Spotfire and Tableau Positioning. Summary

Spotfire and Tableau Positioning. Summary Licensed for distribution Summary So how do the products compare? In a nutshell Spotfire is the more sophisticated and better performing visual analytics platform, and this would be true of comparisons

More information

Certification. F. Genova (thanks to I. Dillo and Hervé L Hours)

Certification. F. Genova (thanks to I. Dillo and Hervé L Hours) Certification F. Genova (thanks to I. Dillo and Hervé L Hours) Perhaps the biggest challenge in sharing data is trust: how do you create a system robust enough for scientists to trust that, if they share,

More information

CORE: Improving access and enabling re-use of open access content using aggregations

CORE: Improving access and enabling re-use of open access content using aggregations CORE: Improving access and enabling re-use of open access content using aggregations Petr Knoth CORE (Connecting REpositories) Knowledge Media institute The Open University @petrknoth 1/39 Outline 1. The

More information

The application of Randomized HITS algorithm in the fund trading network

The application of Randomized HITS algorithm in the fund trading network The application of Randomized HITS algorithm in the fund trading network Xingyu Xu 1, Zhen Wang 1,Chunhe Tao 1,Haifeng He 1 1 The Third Research Institute of Ministry of Public Security,China Abstract.

More information

ELENA: Creating a Smart Space for Learning. Zoltán Miklós (presenter) Bernd Simon Vienna University of Economics

ELENA: Creating a Smart Space for Learning. Zoltán Miklós (presenter) Bernd Simon Vienna University of Economics ELENA: Creating a Smart Space for Learning Zoltán Miklós (presenter) Bernd Simon Vienna University of Economics Overview Motivation, goals Architecture, implementation Interoperability: Querying resources

More information

Welcome to EPO online services

Welcome to EPO online services Welcome to EPO online services Welcome to EPO online services! With EPO online services you can interact with the European Patent Office electronically in a tailor-made, state-of-the-art secure environment,

More information

BIG DATA COURSE CONTENT

BIG DATA COURSE CONTENT BIG DATA COURSE CONTENT [I] Get Started with Big Data Microsoft Professional Orientation: Big Data Duration: 12 hrs Course Content: Introduction Course Introduction Data Fundamentals Introduction to Data

More information

Visualization and text mining of patent and non-patent data

Visualization and text mining of patent and non-patent data of patent and non-patent data Anton Heijs Information Solutions Delft, The Netherlands http://www.treparel.com/ ICIC conference, Nice, France, 2008 Outline Introduction Applications on patent and non-patent

More information

Analyzer of Bio-resource Citations. World Data Center of Microorganisms(WDCM)

Analyzer of Bio-resource Citations. World Data Center of Microorganisms(WDCM) Analyzer of Bio-resource Citations World Data Center of Microorganisms(WDCM) http://abc.wdcm.org/ Outlines Introduction of ABC Homepage and function of ABC Text mining for microorganism : classification,

More information

Multiple Retrieval Models and Regression Models for Prior Art Search

Multiple Retrieval Models and Regression Models for Prior Art Search Multiple Retrieval Models and Regression Models for Prior Art Search Patrice Lopez and Laurent Romary Humboldt Universität zu Berlin - Institut für Deutsche Sprache und Linguistik patrice lopez@hotmail.com

More information

Information Retrieval

Information Retrieval Multimedia Computing: Algorithms, Systems, and Applications: Information Retrieval and Search Engine By Dr. Yu Cao Department of Computer Science The University of Massachusetts Lowell Lowell, MA 01854,

More information

17/05/2017. What we ll cover. Who is Greg? Why PaaS and SaaS? What we re not discussing: IaaS

17/05/2017. What we ll cover. Who is Greg? Why PaaS and SaaS? What we re not discussing: IaaS What are all those Azure* and Power* services and why do I want them? Dr Greg Low SQL Down Under greg@sqldownunder.com Who is Greg? CEO and Principal Mentor at SDU Data Platform MVP Microsoft Regional

More information

Matrex Table of Contents

Matrex Table of Contents Matrex Table of Contents Matrex...1 What is the equivalent of a spreadsheet in Matrex?...2 Why I should use Matrex instead of a spreadsheet application?...3 Concepts...4 System architecture in the future

More information

Semantic Annotation of Web Resources Using IdentityRank and Wikipedia

Semantic Annotation of Web Resources Using IdentityRank and Wikipedia Semantic Annotation of Web Resources Using IdentityRank and Wikipedia Norberto Fernández, José M.Blázquez, Luis Sánchez, and Vicente Luque Telematic Engineering Department. Carlos III University of Madrid

More information

WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS

WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS 1 WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS BRUCE CROFT NSF Center for Intelligent Information Retrieval, Computer Science Department, University of Massachusetts,

More information

Real-time Fraud Detection with Innovative Big Graph Feature. Gaurav Deshpande, VP Marketing, TigerGraph; Mingxi Wu, VP Engineering, TigerGraph
