UWE has obtained warranties from all depositors as to their title in the material deposited and as to their right to deposit such material.

Size: px
Start display at page:

Download "UWE has obtained warranties from all depositors as to their title in the material deposited and as to their right to deposit such material."

Transcription

1 Vlachidis, A., Tudhope, D. and Binding, C. (2013) Information extraction techniques for the purposes of semantic indexing of archaeological resources. In: Digital Heritage 2013: Interfaces with the Past, York, UK, 6 July Available from: We recommend you cite the published version. The publisher s URL is: Refereed: Yes (no note) Disclaimer UWE has obtained warranties from all depositors as to their title in the material deposited and as to their right to deposit such material. UWE makes no representation or warranties of commercial utility, title, or fitness for a particular purpose or any other warranty, express or implied in respect of any material deposited. UWE makes no representation that the use of the materials will not infringe any patent, copyright, trademark or other property or proprietary rights. UWE accepts no liability for any infringement of intellectual property rights in any material deposited but will remove such material from public view pending investigation in the event of an allegation of any such infringement. PLEASE SCROLL DOWN FOR TEXT.

2 1 Information Extraction Techniques for the Purposes of Semantic Indexing of Archaeological Resources Andreas Vlachidis, Ceri Binding, Douglas Tudhope Abstract The paper describes the use of Information Extraction (IE), a Natural Language Processing (NLP) technique to assist rich semantic indexing of diverse archaeological text resources. Such unpublished online documents are often referred to as Grey Literature. Established document indexing techniques are not sufficient to satisfy user information needs that expand beyond the limits of a simple term matching search. The focus of the research is to direct a semantic-aware 'rich' indexing of diverse natural language resources with properties capable of satisfying information retrieval from on-line publications and datasets associated with the Semantic Technologies for Archaeological Resources (STAR) project in the UoG Hypermedia Research Unit. The study proposes the use of knowledge resources and conceptual models to assist an Information Extraction process able to provide rich semantic indexing of archaeological documents capable of resolving linguistic ambiguities of indexed terms. CRM CIDOC-EH, a standard core ontology in cultural heritage, and the English Heritage (EH) Thesauri for archaeological concepts are employed to drive the Information Extraction process and to support the aims of a semantic framework in which indexed terms are capable of supporting semantic-aware access to on-line resources. The paper describes the process of semantic indexing of archaeological concepts (periods and finds) in a corpus of 535 grey literature documents using a rule based Information Extraction technique facilitated by the General Architecture of Text Engineering (GATE) toolkit and expressed by Java Annotation Pattern Engine (JAPE) rules. Illustrative examples demonstrate the different stages of the process. Initial results suggest that the combination of information extraction with knowledge resources and standard core conceptual models is capable of supporting semantic aware and linguistically disambiguate term indexing. Index Terms Natural Language Processing, Ontology Based Information Extraction, Semantic Annotations, CIDOC, Conceptual Reference Model. I I.INTRODUCTION t is said that we live in the Information Society. Today more people than ever before manage an escalating volume of information, which in turn creates an even more complex, and increasing volume of electronic environments aimed to satisfy a wide range of information needs [1][2]. Considering the exponential growth of the Web over the last decades and an ever increasing load of on-line information which has been made available, it wouldn't be an exaggeration to say, that the Web today is the primary source of information for many people, if not for most of us [3]. Information needs on the Web are expressed in words in the form of query terms which are intentionally selected by the users and are expected to occur in the retrieved set of documents result. Based on this fundamental principle, contemporary retrieval systems like Web Search Engines have managed to provide wide access to information while being restrained to operate on the word level. The fundamental argument against the above principle is that not all documents use the same words to refer to the same concept and not all users will use the same words to seek for the same concept. Therefore, contemporary search engines systems are ill-equipped to satisfy information needs that are expressed in a conceptual level beyond the limits of words [4]. Information seeking is a complex human activity that can be studied by various theoretical frameworks but in the everyday 'on-line' practise, users are prompted to satisfy their information needs by simply submitting words in a search box. A comparison study over nine search engine transaction logs revealed that the average number of terms that web users employ to express a single query is 2.2 terms [3]. Considering also that only 2% of the users are using operators to explicate their query and that eight out of ten users simply ignore the results displayed beyond the first page, it is not hard to realise the importance individual words have in query formulation. The Semantic Web is proposed to add logic to the Web for improving user experience and information seeking activities and so to use rules, to make inferences and to choose courses of action that are defined by the meaning of information and not just by a string of words. It is not proposed to be an alternative or separate web but instead the Semantic Web is envisaged as an extension of the current Web [4]. Supporting access to diverse and distributed collections of information the Semantic Web can enable sharing of information and to provide a coherent view to resources where for example 'zip code' and postal code' are defined as resources carrying the same type of information. In addition, dealing with polysemy; same word carrying different meaning in different contexts (i.e. 'bar') and synonymy different words having the same meaning (i.e. car automobile) can significantly improve user information seeking activities. It is said that the Semantic Web, when properly designed 'can assist the evolution of human knowledge as a whole' [5].

3 2 The Semantic Technologies for Archaeological Resources (STAR) project is aligned to the above views and aims to develop new methods for linking digital archive databases, vocabularies and associated unpublished on-line documents, often referred to as Grey Literature. The project aims to support the considerable efforts of English Heritage in trying to integrate the data from various archaeological projects and their associated activities, and seeks to exploit the potential of semantic technologies and natural language processing techniques, for enabling complex and semantically defined queries over archaeological digital resources. To achieve semantic interoperability over diverse information resources the STAR project adopted the English Heritage extension of the CIDOC Conceptual Reference Model (CRM-EH). The CIDOC CRM is an ISO standard core ontology for cultural heritage information aimed to enable information exchange between heterogeneous resources providing the required semantic definitions and clarifications [6]. In addition, archaeological grey literature documents from the OASIS corpus (Online AccesS to the Index of archaeological investigations) constitute a valued resource for the aims of the STAR project for enabling access to diverse archaeological resources. Grey literature documents hold information relative to archaeological datasets that have been produced during archaeological excavations and quite frequently summarise sampling data and excavation activities that occurred during and after major archaeological fieldwork. The purpose of excavating grey literature documents' is to produce semantic-aware 'rich' indices of archaeological texts that comply with the ontological definitions of CRM- EH. To achieve this study explores the potential of Natural Language Techniques and more specifically the use of Information Extraction for identifying textual representations which are capable to support the population of rich semantic indices. The study is directed in the incorporation of knowledge resources (EH thesauri) and the ontological model (CRM-EH) to assist the information extraction process. The adopted information extraction technique is influenced from the notion of Object Based Information Extraction (OBIE) while the potential of semantic-aware 'rich' indices is addressed with the use of semantic annotations. The following paragraphs present the method and the results of an Information Extraction exercise which aimed to extract and to relate textual snippets from grey literature documents with the ontological model CRM- EH. The discussion reveals the method and presents the results of an initial exercise which, achieved to identify and to link to their semantic representations, textual snippets of information relating to two CRM-EH ontological entities E49.Time Appellation and E19. Physical Object corresponding to archaeological periods and object finds respectively. II.METHOD A.Excavating Grey Literature Documents The current method of excavating grey literature documents is based on the use of Information Extraction techniques which incorporate knowledge resources and ontological references to support a rule based Information Extraction approach, capable of providing semantic annotations that comply with the conceptual reference model CRM-EH. The framework employed to support extraction of information snippets and assignment of semantic annotations to documents is GATE (General Architecture for Text Engineering). At the core of the extraction mechanism are JAPE (Java Annotation Pattern Engine) rules which encapsulate the logic arguments of the extraction method and express natural language matching patterns responsible for extracting desirable snippets of information. The method is incorporating the ontological reference model for assisting the semantic interoperability of the produced semantic annotations. The level of involvement of the ontological structure in rules formulation describe an Ontology Oriented Information Extraction method, since the exploitation of ontological structure in the extraction process is such that can not be described as OBIE. The method also makes use of the EH Thesauri for Periods and Objects/Finds to support term extraction from texts. Overall 2980 thesaurus terms contributed in an information extraction exercise which performed over a corpus of 535 grey literature documents populating an index of approximately individual annotations covering the scope of two ontological entities for archaeological periods and finds. B.Information Extraction The potential of Information Extraction IR, a form of NLP technique aimed to extract snippets from documents, is suggested to enable richer forms of document indexing. Smeaton recognizes the advances Named Entity Recognition NER, can have in identifying index terms for document representation [7]. Similarly Marie-Francine Moens reveals that the idea of using semantic information when building indexing representations is not new, and actually has been expressed by Zellig Haris back in 1959 [8]. Information Extraction (IE) is not information retrieval; IE tasks do not involve finding relevant documents from a collection but they are rather specific text analysis tasks aimed to extract specific information snippets from documents. The fundamentally different role of IE does not compete with IR, on the contrary the potential combination of the two technologies promises the creation of new powerful tools in text processing [7][8][9][10]. C.Ontology Ontologies can be understood as conceptual structures that formally describe a given domain by defining classes and sub-classes of interest and by imposing rules and relationships among them to determine a formal structure of 'things' [5][11]. Ontologies can be employed to advance the operation of information retrieval systems beyond the limits of words to the level of concepts. Ontological concepts can enrich information retrieval tasks by facilitating rich, semantic information seeking activities both during query formulation and during retrieval selection. Inferences across diverse sources supported by ontological structures are capable to enhance information seeking activities and to mediate retrieval from heterogeneous data resources [11]. D.Semantic Annotations Semantic Annotations refers to a specific metadata generation and usage schema aimed to automate identification of concepts and their relationships in documents [12]. These annotations enrich documents with semantic information, while enabling access and presentation on the basis of a conceptual structure, providing

4 3 smooth traversal between unstructured text and ontologies. In addition, they can aid information retrieval tasks to make inferences from heterogeneous data sources by exploiting a given ontology and allowing users to search across textual resources for entities and relations instead of words [13]. Ideally the users can search for the term 'Paris' and a semantic annotation mechanism can relate the term with the abstract concept of 'city' and also provide a link to the term 'France' which relates to the abstract concept 'country'. In another case employing a different ontological schema the same term 'Paris' can be related with the concept of 'mythical hero' linked with city of Troy from Homer's epic poem The Iliad. Semantic Annotations carry the critical task to formally annotate textual parts of documents with respect to ontological entities and relations. Such annotations carry the potential to describe indices of semantic attributes which are capable of supporting information retrieval tasks with respect to a given ontology. E.GATE Application Environment GATE is the application environment that enables the information extraction exercise to be performed over a corpus of grey literature documents. Described as an infrastructure for processing human language, GATE provides architecture, a framework and a development environment for developing and deploying natural language software components. Offering a rich graphical user interface it provides easy access to language, processing and visual resources that help scientists and developers to produce applications that process human language. At the core of the extraction technique are the JAPE rules which are using regular expressions to recognise textual snippets that conform to particular pattern rules, enabling a cascading mechanism of finite state transducers over annotations. Moreover, a range of available utilities support additional processing activities such as exporting the produced annotations to XML file structures, manipulation of Gazetteers entries and data management of the annotations in relational databases. [10] III.PROTOTYPE DEVELOPMENT The task of identifying and presenting textual representations extracted from a corpus of grey literature documents, constitutes the early form of a semantic indexing effort which has been carried out in three stages. The first stage pre-processing selected and prepared the corpus documents for the second stage extraction-phase which produced the annotations while the third post-processing stage presented the produced semantic representations of documents in form of web-page. The experiment has managed to annotate a corpus of 535 archaeological documents with semantic attributes connected with two ontological entities of the CIDOC-EH model; E49:Time Appellation for archaeological periods such as medieval, prehistoric etc, and E19:Physical Object for archaeological objects/finds such as axe, flint, wall etc. Initial results are encouraging and reveal the potential of the method to enable linking between ontological entities and their textual representations. A.Pre-processing. During the pre-processing stage the 535 documents of the corpus collection have been transformed from either pdf or msword files to simple text files encoded with character set Latin-1 (ISO ). The transformation of files performed using custom shell scripts and the open source applications pdf2text and antiword running on a Ubuntu (hardy) platform. The newly created text files were lacking any style and presentation to enable optimum execution of JAPE rules. It has been noticed in earlier experiments the ability of GATE in processing a wide range of file formats not only plain text files, but rules' performance were noticed to be influenced from space and line break criteria that were not uniform across the corpus collection. The use of plain text files with simple line break statements has been adopted to assist consistent execution of JAPE rules among corpus documents. B.Extraction-Phase The main stage of the experiment was dedicated to information extraction and semantic annotation and has been developed in GATE using a number of available natural language processing resources, knowledge based resources, user-defined JAPE transducers and the CRM-EH ontological model. Two separate extraction techniques were applied during the experiment; a small scale exercise that incorporated the ontological model during the annotation process, and a large scale exercise which incorporated knowledge resources in the form of gazetteers supporting the formulation of complex JAPE rules. A range of language processing resources available from the GATE application environment have been used in both experiments to enable the execution of a cascading pipeline of successive processes aiming to annotate particular textual evidence. The processing resources Tokenizer, Sentence Splitter and Flexible Exporter have been used in both exercises to provide the smallest granules of text (tokens), to define stop words and sentences, and to export the resulted annotations in XML format respectively. The knowledge resources (EH Thesauri) have been transformed to gazetteer lists capable of being processed in the GATE environment, covering approximately three thousand terms and grouped into two types; archaeological periods and archaeological object types (finds). User defined JAPE rules have also been used in both exercises to extract and to annotate information from documents that conformed to the exercise objectives. A small scale exercise explored the potential of incorporating an event based ontological model (CRM-EH) in a simple information extraction process, for the annotation of terms of archaeological periods. The Ontogazetteer GATE utility has been employed to map gazetteers terms, originating from the EH Thesaurus of Archaeological Periods, to the CRM-EH entity E49: Time Appellation. A simple JAPE rule used for fetching the gazetteer entries and for producing annotations relevant to the selected ontological entity has been invoked. The annotations were assigned the specific URI (Universal Resource Identifier) of the ontological class to enable semantic interoperability of the annotated terms. The event based nature of the CRM-EH structure offered limited support when JAPE rules attempted to exploit the ontological relations of the model. OBIE systems seem to be capable of exploiting better the hierarchical ontological structures expressed rather than the event-based ontological models. Hence further investigation is required for revealing the potential of event-based ontological models in the use of semantic aware language processing techniques.

5 4 A large scale exercise aimed to identify ontological entities in texts but did not make implicit use of the ontological structure per se. Instead the ANNIE Gazetteer utility of GATE has been employed to accommodate the volume of the available EH thesauri terms. Based on their origin gazetteers terms have been assigned a major type attribute; Time_Appellation or Object_Type, corresponding to the ontological classes E49.Time Appellation and E19. Physical Object respectively. In addition, a minor type attribute has been assigned to all gazetteer terms, corresponding to the unique identifier of each individual EH Thesaurus term that is accommodated in the gazetteer resource. Exploitation of major and minor types allowed JAPE rule to annotate textual inputs with respect to ontological entities enabling semantic interoperability of annotations based on attributes that link textual representations to knowledge structures such as the EH Thesauri. Several JAPE rules have been constructed during the large scale exercise, expressed in the form of patterns and targeted to particular information extraction cases. A simple negation detection rule has been employed initially to match textual entries that are relevant to the ontological entities but bearing a negative meaning such as 'no prehistoric evidence'. Dedicated rules have been employed for matching textual entries to the two ontological classes for periods and physical objects by exploiting the major and minor types of the gazetteers resource. Extended rules have made use of additional gazetteer terms beyond the scope of EH-Thesauri expanding the matching capability to phrases like 'earlier Roman period'. Based on the period (E:49) annotations more complex JAPE rules have been made to annotate compound phrases such as 'from late Roman to early Medieval' and to relate them to the ontological class E52: Time Span. Last but not least, rules have successfully managed to annotate textual phrases that included both periods and physical objects as for example the phrase 'Roman Coin' or the phrase 'Burnt flint dating to the late Bronze Age', describing a complex pattern matching mechanism that builds on top of earlier produced annotations. C. Post-Processing The extraction phase has produced a set of XML files containing the grey literature contents and the semantic annotations that have been created during processing of the corpus collection. The objective of the third stage was to use the resultant XML files for making the semantic annotations available in simple form HTML hypertext documents. The server side technology PHP has been employed to handle the annotations from the XML files and to generate the relevant web pages. The resultant pages have been organised under a portal given the name 'Andronikos' which presents the annotations of documents and employs AJAX scripts to link annotations to their semantic definitions. The portal is making use of the DOM XML for processing the XML files and revealing the annotations of documents while it is making use of a MySQL database server to store relevant thesauri structures. Andronikos* portal has been developed to assist the evaluation of the extraction phase by making available the annotations in an easy to follow human readable format and by demonstrating the capability of semantic annotations to link textual representations to their semantic definitions. *( restricted access) IV.RESULTS The experiment resulted in the production of approximately individual semantic annotations distributed over 535 grey literature documents. Formal evaluation methods and measurements on precision and recall rates have only been applied against a single document where human annotators defined the goldstandards for evaluation. Since this is an early experiment the process of defining the gold-standards to conduct a formal evaluation method is under development. Early evaluation attempts have revealed encouraging results with JAPE rules in some cases outperforming human annotators in recall rates. Competing with a machine is hard when it comes to matching word instances in documents which can be overlooked by humans. On the other hand, human annotators presented better precision rates revealing the ability of humans to comprehend content and to suggest rich and elaborate annotations that are hard to match by a rule based logic. Another early evaluation and visual inspection mechanism have been deployed in Andronikos web portal. A search engine indexing algorithm provided by the open source FDSE project has been deployed in the portal to index the web-pages of the semantic annotations and the full text version pages stored in the XML files. The search engine is then used to retrieve results from both indexes to visually inspect their ability to respond for the same search terms. It is anticipated as the study progresses further that formal evaluation methods will be applied to test the efficiency of the annotation mechanism. V.DISCUSSION The experiment has revealed the potential of rule-based information extraction techniques to provide semantic annotations to grey literature documents from the archaeology domain. The use of knowledge resources such as thesauri and conceptual structures such as ontologies evidently can assist the formulation of sophisticated rules capable of assigning semantic representations to textual instances. Semantic annotations in the form of XML tags can be manipulated by web applications that make use of server side scripting technologies. This initial experiment represents an early attempt for creating semantic annotations that comply with a given ontological structure. The method has revealed the potential of ontology oriented information extraction techniques in identifying textual parts and linking them to their semantic representations while revealing a number of issues that relate to the capabilities and limitations of the method. Future developments should seek to overcome current restrictions imposed by the event-based model in order to enable exploitation of the model relations and entities. The rule based mechanisms can be elaborated and assisted by POS (Part of Speech) tagger. It is planned that the next phase of the experiment will be to incorporate POS inputs in JAPE rules to increase the efficiency of the method and to enable reasoning on the syntactical attributes of natural language text. It will involve the exploitation of the CRM-

6 5 EH ontological model to advance the experiment to the next phase widening its scope and including additional ontological entities and more sophisticated rule definitions. Expansion of the experiment towards inclusion of additional knowledge resources in the form of glossaries, thesauri and gazetteers is required to enable the expansion of the method to additional ontological entities. Further utilization of the produced annotations is also much desired to enable contribution of semantic annotations to information retrieval tasks. Andronikos portal incorporates semantic attributes of annotations to simply display links to semantic definition of terms extracted from grey literature documents. It is within the immediate future plans of the study to investigate the method for transforming the produced XML semantic annotation tags to RDF triple statements. Such RDF resources can be used to introduce the semantic annotations of documents to the semantic retrieval mechanism of the STAR project which uses the semantic technologies SPARQL and JSON for querying RDF triples [14]. VI.CONCLUSION Today available semantic technologies promise to close the gap between formal knowledge structures and textual representations enabling new access methods to information [4][5]. Sustainable efforts from the digital archaeology domain have been directed towards enabling semantic interoperability of available digital resources. The provision of semantic annotations to grey literature documents is a challenging task, aimed to enable access of documents on a semantic - conceptual level. The available language processing technologies make it possible today for scientists and developers to produce software applications capable to reveal the semantic attributes of textual elements and to associate them to conceptual structures. The study has attempted to provide semantic annotations to grey literature documents of the archaeology domain, following established information extraction techniques (OOIE - OBIE) and using standard tools (GATE). The initial experiment has revealed that available tools and methods are capable to assist the process of semantic annotations with promising results. The incorporation of ontologies and knowledge resources (gazetteer, thesauri, glossaries) in a rule-based information extraction technique promises to enable rich semantic indexing of grey literature documents. Additional efforts required for further exploitation of the technique and adoption of formal evaluation methods for assessing the performance of the method in measurable terms. [6] May K, Binding C, Tudhope D. (2008) A STAR is born: some emerging Semantic Technologies for Archaeological Resources. Proceedings Computer Applications and Quantitative Methods in Archaeology (CAA2008) [7] Smeaton A.F., (1997). Information Retrieval: Still Butting Heads with Natural Language Processing? In Alan Smeaton's Online Publications.[Online]Available at: [accessed 20 January 2009]. [8] Moens M.F., (2006) Information Extraction Algorithms and Prospects in a Retrieval Context. Dordrecht: Springer Gaizauskas R. & Wilks Y., 1998 Information extraction: beyond document retrieval. Journal of Documentation 54(1) p [9] Gaizauskas R. & Wilks Y., 1998 Information extraction: beyond document retrieval. Journal of Documentation 54(1) p [10] Cunningham H. (2005) Information Extraction, Automatic. Encyclopedia of Language and Linguistics, 2nd Edition, Elsevier. [11] Kiryakov A, Popov B, Terziev I, Manov D, Ognyanoff D (2004) Semantic annotation, indexing, and retrieval. Web Semantics: Science, Services and Agents on the World Wide Web 2(1):49 79 [12] Uren V, Cimiano P, Iria J, Handschuh S, Vargas-Vera M, Motta E, Ciravegna F (2006) Semantic annotation for knowledge management: Requirements and a survey of the state of the art. Web Semantics: Science, Services and Agents on the World Wide Web 4(1):14 28 [13] Bontcheva K, Cunningham H, Kiryakov A, Tablan V. (2006) Semantic Annotation and Human Language Technology. Semantic Web Technology: Trends and Research in Ontology Based Systems John Wiley and Sons Ltd. [14] Binding C, Tudhope D, May K. (2008) Semantic Interoperability in Archaeological Datasets: Data Mapping and Extraction via the CIDOC CRM. Proceedings (ECDL 2008) 12th European Conference on Research and Advanced Technology for Digita Libraries REFERENCES [1] Marchionini G, 1995 Information Seeking in Electronic Environments Cambridge: Cambridge University Press, NY, USA [2] Smeaton A.F., (1997.). An Overview of Information Retrieval In: M. Agosti (ed),information retrieval and hypertext, Kluwer Academic Publishers, p.3-20 [3] Jansen B, Spink A. (2006) How are we searching the world wide web? A comparison of nine search engine transaction logs. Information Processing and Management 42(1):pp [4] Bontcheva K, Duke T, Glover N, Kings I. (2006) Semantic Information Access. In Semantic Web Semantic Web Technology: Trends and Research in Ontology Based Systems John Wiley and Sons Ltd. [5] Lee B, Hendler J, Lassila O. (2001) The Semantic Web. Scientific American 284(5):28 37

Exploring the Use of Semantic Technologies for Cross-Search of Archaeological Grey Literature and Data

Exploring the Use of Semantic Technologies for Cross-Search of Archaeological Grey Literature and Data Exploring the Use of Semantic Technologies for Cross-Search of Archaeological Grey Literature and Data Presented by Keith May @keith_may Based on the work of Andreas Vlachidis, Ceri Binding, Keith May,

More information

D4.6 Data Value Chain Database v2

D4.6 Data Value Chain Database v2 D4.6 Data Value Chain Database v2 Coordinator: Fabrizio Orlandi (Fraunhofer) With contributions from: Isaiah Mulang Onando (Fraunhofer), Luis-Daniel Ibáñez (SOTON) Reviewer: Ryan Goodman (ODI) Deliverable

More information

Implementing a Variety of Linguistic Annotations

Implementing a Variety of Linguistic Annotations Implementing a Variety of Linguistic Annotations through a Common Web-Service Interface Adam Funk, Ian Roberts, Wim Peters University of Sheffield 18 May 2010 Adam Funk, Ian Roberts, Wim Peters Implementing

More information

On a Java based implementation of ontology evolution processes based on Natural Language Processing

On a Java based implementation of ontology evolution processes based on Natural Language Processing ITALIAN NATIONAL RESEARCH COUNCIL NELLO CARRARA INSTITUTE FOR APPLIED PHYSICS CNR FLORENCE RESEARCH AREA Italy TECHNICAL, SCIENTIFIC AND RESEARCH REPORTS Vol. 2 - n. 65-8 (2010) Francesco Gabbanini On

More information

The Semantic Planetary Data System

The Semantic Planetary Data System The Semantic Planetary Data System J. Steven Hughes 1, Daniel J. Crichton 1, Sean Kelly 1, and Chris Mattmann 1 1 Jet Propulsion Laboratory 4800 Oak Grove Drive Pasadena, CA 91109 USA {steve.hughes, dan.crichton,

More information

GoNTogle: A Tool for Semantic Annotation and Search

GoNTogle: A Tool for Semantic Annotation and Search GoNTogle: A Tool for Semantic Annotation and Search Giorgos Giannopoulos 1,2, Nikos Bikakis 1,2, Theodore Dalamagas 2, and Timos Sellis 1,2 1 KDBSL Lab, School of ECE, Nat. Tech. Univ. of Athens, Greece

More information

Information Retrieval (IR) through Semantic Web (SW): An Overview

Information Retrieval (IR) through Semantic Web (SW): An Overview Information Retrieval (IR) through Semantic Web (SW): An Overview Gagandeep Singh 1, Vishal Jain 2 1 B.Tech (CSE) VI Sem, GuruTegh Bahadur Institute of Technology, GGS Indraprastha University, Delhi 2

More information

CSC 5930/9010: Text Mining GATE Developer Overview

CSC 5930/9010: Text Mining GATE Developer Overview 1 CSC 5930/9010: Text Mining GATE Developer Overview Dr. Paula Matuszek Paula.Matuszek@villanova.edu Paula.Matuszek@gmail.com (610) 647-9789 GATE Components 2 We will deal primarily with GATE Developer:

More information

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK REVIEW PAPER ON IMPLEMENTATION OF DOCUMENT ANNOTATION USING CONTENT AND QUERYING

More information

A model of information searching behaviour to facilitate end-user support in KOS-enhanced systems

A model of information searching behaviour to facilitate end-user support in KOS-enhanced systems A model of information searching behaviour to facilitate end-user support in KOS-enhanced systems Dorothee Blocks Hypermedia Research Unit School of Computing University of Glamorgan, UK NKOS workshop

More information

A service based on Linked Data to classify Web resources using a Knowledge Organisation System

A service based on Linked Data to classify Web resources using a Knowledge Organisation System A service based on Linked Data to classify Web resources using a Knowledge Organisation System A proof of concept in the Open Educational Resources domain Abstract One of the reasons why Web resources

More information

Parmenides. Semi-automatic. Ontology. construction and maintenance. Ontology. Document convertor/basic processing. Linguistic. Background knowledge

Parmenides. Semi-automatic. Ontology. construction and maintenance. Ontology. Document convertor/basic processing. Linguistic. Background knowledge Discover hidden information from your texts! Information overload is a well known issue in the knowledge industry. At the same time most of this information becomes available in natural language which

More information

Semantic Web Company. PoolParty - Server. PoolParty - Technical White Paper.

Semantic Web Company. PoolParty - Server. PoolParty - Technical White Paper. Semantic Web Company PoolParty - Server PoolParty - Technical White Paper http://www.poolparty.biz Table of Contents Introduction... 3 PoolParty Technical Overview... 3 PoolParty Components Overview...

More information

Incorporating ontological background knowledge into Information Extraction

Incorporating ontological background knowledge into Information Extraction Incorporating ontological background knowledge into Information Extraction Benjamin Adrian Knowledge Management Department, DFKI, Kaiserslautern, Germany benjamin.adrian@dfki.de Abstract. In my PhD work

More information

Semantic Exploitation of Engineering Models: An Application to Oilfield Models

Semantic Exploitation of Engineering Models: An Application to Oilfield Models Semantic Exploitation of Engineering Models: An Application to Oilfield Models Laura Silveira Mastella 1,YamineAït-Ameur 2,Stéphane Jean 2, Michel Perrin 1, and Jean-François Rainaud 3 1 Ecole des Mines

More information

Using idocument for Document Categorization in Nepomuk Social Semantic Desktop

Using idocument for Document Categorization in Nepomuk Social Semantic Desktop Using idocument for Document Categorization in Nepomuk Social Semantic Desktop Benjamin Adrian, Martin Klinkigt,2 Heiko Maus, Andreas Dengel,2 ( Knowledge-Based Systems Group, Department of Computer Science

More information

Shrey Patel B.E. Computer Engineering, Gujarat Technological University, Ahmedabad, Gujarat, India

Shrey Patel B.E. Computer Engineering, Gujarat Technological University, Ahmedabad, Gujarat, India International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2018 IJSRCSEIT Volume 3 Issue 3 ISSN : 2456-3307 Some Issues in Application of NLP to Intelligent

More information

Named Entity Identification / Disambiguation

Named Entity Identification / Disambiguation Intelligent information access to linked data weaving the cultural heritage web Research Archive for Ancient Sculpture Universität of Cologne, Germany 18. September 2007 Outline Digital Scholarship A model

More information

Generation of Semantic Clouds Based on Linked Data for Efficient Multimedia Semantic Annotation

Generation of Semantic Clouds Based on Linked Data for Efficient Multimedia Semantic Annotation Generation of Semantic Clouds Based on Linked Data for Efficient Multimedia Semantic Annotation Han-Gyu Ko and In-Young Ko Department of Computer Science, Korea Advanced Institute of Science and Technology,

More information

It Is What It Does: The Pragmatics of Ontology for Knowledge Sharing

It Is What It Does: The Pragmatics of Ontology for Knowledge Sharing It Is What It Does: The Pragmatics of Ontology for Knowledge Sharing Tom Gruber Founder and CTO, Intraspect Software Formerly at Stanford University tomgruber.org What is this talk about? What are ontologies?

More information

Semantic Searching. John Winder CMSC 676 Spring 2015

Semantic Searching. John Winder CMSC 676 Spring 2015 Semantic Searching John Winder CMSC 676 Spring 2015 Semantic Searching searching and retrieving documents by their semantic, conceptual, and contextual meanings Motivations: to do disambiguation to improve

More information

A Survey Of Different Text Mining Techniques Varsha C. Pande 1 and Dr. A.S. Khandelwal 2

A Survey Of Different Text Mining Techniques Varsha C. Pande 1 and Dr. A.S. Khandelwal 2 A Survey Of Different Text Mining Techniques Varsha C. Pande 1 and Dr. A.S. Khandelwal 2 1 Department of Electronics & Comp. Sc, RTMNU, Nagpur, India 2 Department of Computer Science, Hislop College, Nagpur,

More information

THE GETTY VOCABULARIES TECHNICAL UPDATE

THE GETTY VOCABULARIES TECHNICAL UPDATE AAT TGN ULAN CONA THE GETTY VOCABULARIES TECHNICAL UPDATE International Working Group Meetings January 7-10, 2013 Joan Cobb Gregg Garcia Information Technology Services J. Paul Getty Trust International

More information

INFORMATION RETRIEVAL SYSTEM: CONCEPT AND SCOPE

INFORMATION RETRIEVAL SYSTEM: CONCEPT AND SCOPE 15 : CONCEPT AND SCOPE 15.1 INTRODUCTION Information is communicated or received knowledge concerning a particular fact or circumstance. Retrieval refers to searching through stored information to find

More information

Semantic Cloud Generation based on Linked Data for Efficient Semantic Annotation

Semantic Cloud Generation based on Linked Data for Efficient Semantic Annotation Semantic Cloud Generation based on Linked Data for Efficient Semantic Annotation - Korea-Germany Joint Workshop for LOD2 2011 - Han-Gyu Ko Dept. of Computer Science, KAIST Korea Advanced Institute of Science

More information

A Study of Future Internet Applications based on Semantic Web Technology Configuration Model

A Study of Future Internet Applications based on Semantic Web Technology Configuration Model Indian Journal of Science and Technology, Vol 8(20), DOI:10.17485/ijst/2015/v8i20/79311, August 2015 ISSN (Print) : 0974-6846 ISSN (Online) : 0974-5645 A Study of Future Internet Applications based on

More information

GoNTogle: A Tool for Semantic Annotation and Search

GoNTogle: A Tool for Semantic Annotation and Search GoNTogle: A Tool for Semantic Annotation and Search Giorgos Giannopoulos 1,2, Nikos Bikakis 1, Theodore Dalamagas 2 and Timos Sellis 1,2 1 KDBS Lab, School of ECE, NTU Athens, Greece. {giann@dblab.ntua.gr,

More information

Annotating Spatio-Temporal Information in Documents

Annotating Spatio-Temporal Information in Documents Annotating Spatio-Temporal Information in Documents Jannik Strötgen University of Heidelberg Institute of Computer Science Database Systems Research Group http://dbs.ifi.uni-heidelberg.de stroetgen@uni-hd.de

More information

Towards KOS-based semantic services

Towards KOS-based semantic services Towards KOS-based semantic services Doug Tudhope Hypermedia Research Unit University of Glamorgan Helsinki, November 30, 2007 Presentation Possible approaches to interoperability (from Nov 29) Standards

More information

Archives in a Networked Information Society: The Problem of Sustainability in the Digital Information Environment

Archives in a Networked Information Society: The Problem of Sustainability in the Digital Information Environment Archives in a Networked Information Society: The Problem of Sustainability in the Digital Information Environment Shigeo Sugimoto Research Center for Knowledge Communities Graduate School of Library, Information

More information

An aggregation system for cultural heritage content

An aggregation system for cultural heritage content An aggregation system for cultural heritage content Nasos Drosopoulos, Vassilis Tzouvaras, Nikolaos Simou, Anna Christaki, Arne Stabenau, Kostas Pardalis, Fotis Xenikoudakis, Eleni Tsalapati and Stefanos

More information

Metadata Workshop 3 March 2006 Part 1

Metadata Workshop 3 March 2006 Part 1 Metadata Workshop 3 March 2006 Part 1 Metadata overview and guidelines Amelia Breytenbach Ria Groenewald What metadata is Overview Types of metadata and their importance How metadata is stored, what metadata

More information

Semantic Data Extraction for B2B Integration

Semantic Data Extraction for B2B Integration Silva, B., Cardoso, J., Semantic Data Extraction for B2B Integration, International Workshop on Dynamic Distributed Systems (IWDDS), In conjunction with the ICDCS 2006, The 26th International Conference

More information

Precise Medication Extraction using Agile Text Mining

Precise Medication Extraction using Agile Text Mining Precise Medication Extraction using Agile Text Mining Chaitanya Shivade *, James Cormack, David Milward * The Ohio State University, Columbus, Ohio, USA Linguamatics Ltd, Cambridge, UK shivade@cse.ohio-state.edu,

More information

Automatic Metadata Extraction for Archival Description and Access

Automatic Metadata Extraction for Archival Description and Access Automatic Metadata Extraction for Archival Description and Access WILLIAM UNDERWOOD Georgia Tech Research Institute Abstract: The objective of the research reported is this paper is to develop techniques

More information

clarin:el an infrastructure for documenting, sharing and processing language data

clarin:el an infrastructure for documenting, sharing and processing language data clarin:el an infrastructure for documenting, sharing and processing language data Stelios Piperidis, Penny Labropoulou, Maria Gavrilidou (Athena RC / ILSP) the problem 19/9/2015 ICGL12, FU-Berlin 2 use

More information

University of Bath. Publication date: Document Version Publisher's PDF, also known as Version of record. Link to publication

University of Bath. Publication date: Document Version Publisher's PDF, also known as Version of record. Link to publication Citation for published version: Patel, M & Duke, M 2004, 'Knowledge Discovery in an Agents Environment' Paper presented at European Semantic Web Symposium 2004, Heraklion, Crete, UK United Kingdom, 9/05/04-11/05/04,.

More information

ISO 2146 INTERNATIONAL STANDARD. Information and documentation Registry services for libraries and related organizations

ISO 2146 INTERNATIONAL STANDARD. Information and documentation Registry services for libraries and related organizations INTERNATIONAL STANDARD ISO 2146 Third edition 2010-04-15 Information and documentation Registry services for libraries and related organizations Information et documentation Services de registre pour les

More information

Extracting knowledge from Ontology using Jena for Semantic Web

Extracting knowledge from Ontology using Jena for Semantic Web Extracting knowledge from Ontology using Jena for Semantic Web Ayesha Ameen I.T Department Deccan College of Engineering and Technology Hyderabad A.P, India ameenayesha@gmail.com Khaleel Ur Rahman Khan

More information

Semantic Annotation and Linking of Medical Educational Resources

Semantic Annotation and Linking of Medical Educational Resources 5 th European IFMBE MBEC, Budapest, September 14-18, 2011 Semantic Annotation and Linking of Medical Educational Resources N. Dovrolis 1, T. Stefanut 2, S. Dietze 3, H.Q. Yu 3, C. Valentine 3 & E. Kaldoudi

More information

New Approach to Graph Databases

New Approach to Graph Databases Paper PP05 New Approach to Graph Databases Anna Berg, Capish, Malmö, Sweden Henrik Drews, Capish, Malmö, Sweden Catharina Dahlbo, Capish, Malmö, Sweden ABSTRACT Graph databases have, during the past few

More information

WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS

WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS 1 WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS BRUCE CROFT NSF Center for Intelligent Information Retrieval, Computer Science Department, University of Massachusetts,

More information

An Annotation Tool for Semantic Documents

An Annotation Tool for Semantic Documents An Annotation Tool for Semantic Documents (System Description) Henrik Eriksson Dept. of Computer and Information Science Linköping University SE-581 83 Linköping, Sweden her@ida.liu.se Abstract. Document

More information

The Semantic Web: A New Opportunity and Challenge for Human Language Technology

The Semantic Web: A New Opportunity and Challenge for Human Language Technology The Semantic Web: A New Opportunity and Challenge for Human Language Technology Kalina Bontcheva and Hamish Cunningham Department of Computer Science, University of Sheffield 211 Portobello St, Sheffield,

More information

Exploring the Generation and Integration of Publishable Scientific Facts Using the Concept of Nano-publications

Exploring the Generation and Integration of Publishable Scientific Facts Using the Concept of Nano-publications Exploring the Generation and Integration of Publishable Scientific Facts Using the Concept of Nano-publications Amanda Clare 1,3, Samuel Croset 2,3 (croset@ebi.ac.uk), Christoph Grabmueller 2,3, Senay

More information

Annotation Science From Theory to Practice and Use Introduction A bit of history

Annotation Science From Theory to Practice and Use Introduction A bit of history Annotation Science From Theory to Practice and Use Nancy Ide Department of Computer Science Vassar College Poughkeepsie, New York 12604 USA ide@cs.vassar.edu Introduction Linguistically-annotated corpora

More information

Semantically Enhanced Hypermedia: A First Step

Semantically Enhanced Hypermedia: A First Step Semantically Enhanced Hypermedia: A First Step I. Alfaro, M. Zancanaro, A. Cappelletti, M. Nardon, A. Guerzoni ITC-irst Via Sommarive 18, Povo TN 38050, Italy {alfaro, zancana, cappelle, nardon, annaguer}@itc.it

More information

SEMANTIC WEB POWERED PORTAL INFRASTRUCTURE

SEMANTIC WEB POWERED PORTAL INFRASTRUCTURE SEMANTIC WEB POWERED PORTAL INFRASTRUCTURE YING DING 1 Digital Enterprise Research Institute Leopold-Franzens Universität Innsbruck Austria DIETER FENSEL Digital Enterprise Research Institute National

More information

97 Information Technology with Audiovisual and Multimedia and National Libraries (part 2) No

97 Information Technology with Audiovisual and Multimedia and National Libraries (part 2) No Date : 25/05/2006 Towards Constructing a Chinese Information Extraction System to Support Innovations in Library Services Zhang Zhixiong, Li Sa, Wu Zhengxin, Lin Ying The library of Chinese Academy of

More information

Linked Open Data: a short introduction

Linked Open Data: a short introduction International Workshop Linked Open Data & the Jewish Cultural Heritage Rome, 20 th January 2015 Linked Open Data: a short introduction Oreste Signore (W3C Italy) Slides at: http://www.w3c.it/talks/2015/lodjch/

More information

Information mining and information retrieval : methods and applications

Information mining and information retrieval : methods and applications Information mining and information retrieval : methods and applications J. Mothe, C. Chrisment Institut de Recherche en Informatique de Toulouse Université Paul Sabatier, 118 Route de Narbonne, 31062 Toulouse

More information

Adaptable and Adaptive Web Information Systems. Lecture 1: Introduction

Adaptable and Adaptive Web Information Systems. Lecture 1: Introduction Adaptable and Adaptive Web Information Systems School of Computer Science and Information Systems Birkbeck College University of London Lecture 1: Introduction George Magoulas gmagoulas@dcs.bbk.ac.uk October

More information

Making Sense Out of the Web

Making Sense Out of the Web Making Sense Out of the Web Rada Mihalcea University of North Texas Department of Computer Science rada@cs.unt.edu Abstract. In the past few years, we have witnessed a tremendous growth of the World Wide

More information

Maximizing the Value of STM Content through Semantic Enrichment. Frank Stumpf December 1, 2009

Maximizing the Value of STM Content through Semantic Enrichment. Frank Stumpf December 1, 2009 Maximizing the Value of STM Content through Semantic Enrichment Frank Stumpf December 1, 2009 What is Semantics and Semantic Processing? Content Knowledge Framework Technology Framework Search Text Images

More information

The Semantic Web DEFINITIONS & APPLICATIONS

The Semantic Web DEFINITIONS & APPLICATIONS The Semantic Web DEFINITIONS & APPLICATIONS Data on the Web There are more an more data on the Web Government data, health related data, general knowledge, company information, flight information, restaurants,

More information

Proposal for Implementing Linked Open Data on Libraries Catalogue

Proposal for Implementing Linked Open Data on Libraries Catalogue Submitted on: 16.07.2018 Proposal for Implementing Linked Open Data on Libraries Catalogue Esraa Elsayed Abdelaziz Computer Science, Arab Academy for Science and Technology, Alexandria, Egypt. E-mail address:

More information

Security in the Web Services Framework

Security in the Web Services Framework Security in the Web Services Framework Chen Li and Claus Pahl Dublin City University School of Computing Dublin 9 Ireland Abstract The Web Services Framework provides techniques to enable the application-toapplication

More information

Semantic Web. Tahani Aljehani

Semantic Web. Tahani Aljehani Semantic Web Tahani Aljehani Motivation: Example 1 You are interested in SOAP Web architecture Use your favorite search engine to find the articles about SOAP Keywords-based search You'll get lots of information,

More information

Linking library data: contributions and role of subject data. Nuno Freire The European Library

Linking library data: contributions and role of subject data. Nuno Freire The European Library Linking library data: contributions and role of subject data Nuno Freire The European Library Outline Introduction to The European Library Motivation for Linked Library Data The European Library Open Dataset

More information

Two interrelated objectives of the ARIADNE project, are the. Training for Innovation: Data and Multimedia Visualization

Two interrelated objectives of the ARIADNE project, are the. Training for Innovation: Data and Multimedia Visualization Training for Innovation: Data and Multimedia Visualization Matteo Dellepiane and Roberto Scopigno CNR-ISTI Two interrelated objectives of the ARIADNE project, are the design of new services (or the integration

More information

QAKiS: an Open Domain QA System based on Relational Patterns

QAKiS: an Open Domain QA System based on Relational Patterns QAKiS: an Open Domain QA System based on Relational Patterns Elena Cabrio, Julien Cojan, Alessio Palmero Aprosio, Bernardo Magnini, Alberto Lavelli, Fabien Gandon To cite this version: Elena Cabrio, Julien

More information

Powering Knowledge Discovery. Insights from big data with Linguamatics I2E

Powering Knowledge Discovery. Insights from big data with Linguamatics I2E Powering Knowledge Discovery Insights from big data with Linguamatics I2E Gain actionable insights from unstructured data The world now generates an overwhelming amount of data, most of it written in natural

More information

Agent-Enabling Transformation of E-Commerce Portals with Web Services

Agent-Enabling Transformation of E-Commerce Portals with Web Services Agent-Enabling Transformation of E-Commerce Portals with Web Services Dr. David B. Ulmer CTO Sotheby s New York, NY 10021, USA Dr. Lixin Tao Professor Pace University Pleasantville, NY 10570, USA Abstract:

More information

Knowledge Retrieval. Franz J. Kurfess. Computer Science Department California Polytechnic State University San Luis Obispo, CA, U.S.A.

Knowledge Retrieval. Franz J. Kurfess. Computer Science Department California Polytechnic State University San Luis Obispo, CA, U.S.A. Knowledge Retrieval Franz J. Kurfess Computer Science Department California Polytechnic State University San Luis Obispo, CA, U.S.A. 1 Acknowledgements This lecture series has been sponsored by the European

More information

Customisable Curation Workflows in Argo

Customisable Curation Workflows in Argo Customisable Curation Workflows in Argo Rafal Rak*, Riza Batista-Navarro, Andrew Rowley, Jacob Carter and Sophia Ananiadou National Centre for Text Mining, University of Manchester, UK *Corresponding author:

More information

3 Publishing Technique

3 Publishing Technique Publishing Tool 32 3 Publishing Technique As discussed in Chapter 2, annotations can be extracted from audio, text, and visual features. The extraction of text features from the audio layer is the approach

More information

The Documentalist Support System: a Web-Services based Tool for Semantic Annotation and Browsing

The Documentalist Support System: a Web-Services based Tool for Semantic Annotation and Browsing The Documentalist Support System: a Web-Services based Tool for Semantic Annotation and Browsing Hennie Brugman 1, Véronique Malaisé 2, Luit Gazendam 3, and Guus Schreiber 2 1 MPI for Psycholinguistics,

More information

The Semantic Web & Ontologies

The Semantic Web & Ontologies The Semantic Web & Ontologies Kwenton Bellette The semantic web is an extension of the current web that will allow users to find, share and combine information more easily (Berners-Lee, 2001, p.34) This

More information

XML in the bipharmaceutical

XML in the bipharmaceutical XML in the bipharmaceutical sector XML holds out the opportunity to integrate data across both the enterprise and the network of biopharmaceutical alliances - with little technological dislocation and

More information

STS Infrastructural considerations. Christian Chiarcos

STS Infrastructural considerations. Christian Chiarcos STS Infrastructural considerations Christian Chiarcos chiarcos@uni-potsdam.de Infrastructure Requirements Candidates standoff-based architecture (Stede et al. 2006, 2010) UiMA (Ferrucci and Lally 2004)

More information

Towards Semantic Data Mining

Towards Semantic Data Mining Towards Semantic Data Mining Haishan Liu Department of Computer and Information Science, University of Oregon, Eugene, OR, 97401, USA ahoyleo@cs.uoregon.edu Abstract. Incorporating domain knowledge is

More information

Semantic Web Mining and its application in Human Resource Management

Semantic Web Mining and its application in Human Resource Management International Journal of Computer Science & Management Studies, Vol. 11, Issue 02, August 2011 60 Semantic Web Mining and its application in Human Resource Management Ridhika Malik 1, Kunjana Vasudev 2

More information

International Jmynal of Intellectual Advancements and Research in Engineering Computations

International Jmynal of Intellectual Advancements and Research in Engineering Computations www.ijiarec.com ISSN:2348-2079 DEC-2015 International Jmynal of Intellectual Advancements and Research in Engineering Computations VIRTUALIZATION OF DISTIRIBUTED DATABASES USING XML 1 M.Ramu ABSTRACT Objective

More information

Enterprise Multimedia Integration and Search

Enterprise Multimedia Integration and Search Enterprise Multimedia Integration and Search José-Manuel López-Cobo 1 and Katharina Siorpaes 1,2 1 playence, Austria, 2 STI Innsbruck, University of Innsbruck, Austria {ozelin.lopez, katharina.siorpaes}@playence.com

More information

Joining the BRICKS Network - A Piece of Cake

Joining the BRICKS Network - A Piece of Cake Joining the BRICKS Network - A Piece of Cake Robert Hecht and Bernhard Haslhofer 1 ARC Seibersdorf research - Research Studios Studio Digital Memory Engineering Thurngasse 8, A-1090 Wien, Austria {robert.hecht

More information

A tutorial report for SENG Agent Based Software Engineering. Course Instructor: Dr. Behrouz H. Far. XML Tutorial.

A tutorial report for SENG Agent Based Software Engineering. Course Instructor: Dr. Behrouz H. Far. XML Tutorial. A tutorial report for SENG 609.22 Agent Based Software Engineering Course Instructor: Dr. Behrouz H. Far XML Tutorial Yanan Zhang Department of Electrical and Computer Engineering University of Calgary

More information

- What we actually mean by documents (the FRBR hierarchy) - What are the components of documents

- What we actually mean by documents (the FRBR hierarchy) - What are the components of documents Purpose of these slides Introduction to XML for parliamentary documents (and all other kinds of documents, actually) Prof. Fabio Vitali University of Bologna Part 1 Introduce the principal aspects of electronic

More information

What you have learned so far. Interoperability. Ontology heterogeneity. Being serious about the semantic web

What you have learned so far. Interoperability. Ontology heterogeneity. Being serious about the semantic web What you have learned so far Interoperability Introduction to the Semantic Web Tutorial at ISWC 2010 Jérôme Euzenat Data can be expressed in RDF Linked through URIs Modelled with OWL ontologies & Retrieved

More information

Using Data-Extraction Ontologies to Foster Automating Semantic Annotation

Using Data-Extraction Ontologies to Foster Automating Semantic Annotation Using Data-Extraction Ontologies to Foster Automating Semantic Annotation Yihong Ding Department of Computer Science Brigham Young University Provo, Utah 84602 ding@cs.byu.edu David W. Embley Department

More information

Semantic Web and Natural Language Processing

Semantic Web and Natural Language Processing Semantic Web and Natural Language Processing Wiltrud Kessler Institut für Maschinelle Sprachverarbeitung Universität Stuttgart Semantic Web Winter 2014/2015 This work is licensed under a Creative Commons

More information

Development of an Ontology-Based Portal for Digital Archive Services

Development of an Ontology-Based Portal for Digital Archive Services Development of an Ontology-Based Portal for Digital Archive Services Ching-Long Yeh Department of Computer Science and Engineering Tatung University 40 Chungshan N. Rd. 3rd Sec. Taipei, 104, Taiwan chingyeh@cse.ttu.edu.tw

More information

SKOS. COMP62342 Sean Bechhofer

SKOS. COMP62342 Sean Bechhofer SKOS COMP62342 Sean Bechhofer sean.bechhofer@manchester.ac.uk Ontologies Metadata Resources marked-up with descriptions of their content. No good unless everyone speaks the same language; Terminologies

More information

Tokenization and Sentence Segmentation. Yan Shao Department of Linguistics and Philology, Uppsala University 29 March 2017

Tokenization and Sentence Segmentation. Yan Shao Department of Linguistics and Philology, Uppsala University 29 March 2017 Tokenization and Sentence Segmentation Yan Shao Department of Linguistics and Philology, Uppsala University 29 March 2017 Outline 1 Tokenization Introduction Exercise Evaluation Summary 2 Sentence segmentation

More information

Knowledge Engineering with Semantic Web Technologies

Knowledge Engineering with Semantic Web Technologies This file is licensed under the Creative Commons Attribution-NonCommercial 3.0 (CC BY-NC 3.0) Knowledge Engineering with Semantic Web Technologies Lecture 5: Ontological Engineering 5.3 Ontology Learning

More information

USE OF ONTOLOGIES IN INFORMATION EXTRACTION

USE OF ONTOLOGIES IN INFORMATION EXTRACTION USE OF ONTOLOGIES IN INFORMATION EXTRACTION by DAYA CHINTHANA WIMALASURIYA A DISSERTATION Presented to the Department of Computer and Information Science and the Graduate School of the University of Oregon

More information

Setting up a CIDOC CRM Adoption and Use Strategy CIDOC CRM: Success Stories, Challenges and New Perspective

Setting up a CIDOC CRM Adoption and Use Strategy CIDOC CRM: Success Stories, Challenges and New Perspective Setting up a CIDOC CRM Adoption and Use Strategy CIDOC CRM: Success Stories, Challenges and New Perspective George Bruseker CIDOC 2017 Tblisi, Georgia 27/09/2017 Researcher, Interpreter Goal: A Semantic

More information

CEN MetaLex. Facilitating Interchange in E- Government. Alexander Boer

CEN MetaLex. Facilitating Interchange in E- Government. Alexander Boer CEN MetaLex Facilitating Interchange in E- Government Alexander Boer aboer@uva.nl MetaLex Initiative taken by us in 2002 Workshop on an open XML interchange format for legal and legislative resources www.metalex.eu

More information

Semantic web. Tapas Kumar Mishra 11CS60R32

Semantic web. Tapas Kumar Mishra 11CS60R32 Semantic web Tapas Kumar Mishra 11CS60R32 1 Agenda Introduction What is semantic web Issues with traditional web search The Technology Stack Architecture of semantic web Meta Data Main Tasks Knowledge

More information

Envisioning Semantic Web Technology Solutions for the Arts

Envisioning Semantic Web Technology Solutions for the Arts Information Integration Intelligence Solutions Envisioning Semantic Web Technology Solutions for the Arts Semantic Web and CIDOC CRM Workshop Ralph Hodgson, CTO, TopQuadrant National Museum of the American

More information

structure of the presentation Frame Semantics knowledge-representation in larger-scale structures the concept of frame

structure of the presentation Frame Semantics knowledge-representation in larger-scale structures the concept of frame structure of the presentation Frame Semantics semantic characterisation of situations or states of affairs 1. introduction (partially taken from a presentation of Markus Egg): i. what is a frame supposed

More information

Ontologies SKOS. COMP62342 Sean Bechhofer

Ontologies SKOS. COMP62342 Sean Bechhofer Ontologies SKOS COMP62342 Sean Bechhofer sean.bechhofer@manchester.ac.uk Metadata Resources marked-up with descriptions of their content. No good unless everyone speaks the same language; Terminologies

More information

Semantic Technologies and CDISC Standards. Frederik Malfait, Information Architect, IMOS Consulting Scott Bahlavooni, Independent

Semantic Technologies and CDISC Standards. Frederik Malfait, Information Architect, IMOS Consulting Scott Bahlavooni, Independent Semantic Technologies and CDISC Standards Frederik Malfait, Information Architect, IMOS Consulting Scott Bahlavooni, Independent Part I Introduction to Semantic Technology Resource Description Framework

More information

What is Text Mining? Sophia Ananiadou National Centre for Text Mining University of Manchester

What is Text Mining? Sophia Ananiadou National Centre for Text Mining   University of Manchester National Centre for Text Mining www.nactem.ac.uk University of Manchester Outline Aims of text mining Text Mining steps Text Mining uses Applications 2 Aims Extract and discover knowledge hidden in text

More information

Business Rules in the Semantic Web, are there any or are they different?

Business Rules in the Semantic Web, are there any or are they different? Business Rules in the Semantic Web, are there any or are they different? Silvie Spreeuwenberg, Rik Gerrits LibRT, Silodam 364, 1013 AW Amsterdam, Netherlands {silvie@librt.com, Rik@LibRT.com} http://www.librt.com

More information

B2FIND and Metadata Quality

B2FIND and Metadata Quality B2FIND and Metadata Quality 3 rd EUDAT Conference 25 September 2014 Heinrich Widmann and B2FIND team 1 Outline B2FIND the EUDAT Metadata Service Semantic Mapping of Metadata Quality of Metadata Summary

More information

A conceptual model of trademark retrieval based on conceptual similarity

A conceptual model of trademark retrieval based on conceptual similarity Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 22 (2013 ) 450 459 17 th International Conference on Knowledge Based and Intelligent Information and Engineering Systems

More information

Opus: University of Bath Online Publication Store

Opus: University of Bath Online Publication Store Patel, M. (2004) Semantic Interoperability in Digital Library Systems. In: WP5 Forum Workshop: Semantic Interoperability in Digital Library Systems, DELOS Network of Excellence in Digital Libraries, 2004-09-16-2004-09-16,

More information

Domain-specific Concept-based Information Retrieval System

Domain-specific Concept-based Information Retrieval System Domain-specific Concept-based Information Retrieval System L. Shen 1, Y. K. Lim 1, H. T. Loh 2 1 Design Technology Institute Ltd, National University of Singapore, Singapore 2 Department of Mechanical

More information

Ontology-based Navigation of Bibliographic Metadata: Example from the Food, Nutrition and Agriculture Journal

Ontology-based Navigation of Bibliographic Metadata: Example from the Food, Nutrition and Agriculture Journal Ontology-based Navigation of Bibliographic Metadata: Example from the Food, Nutrition and Agriculture Journal Margherita Sini 1, Gauri Salokhe 1, Christopher Pardy 1, Janice Albert 1, Johannes Keizer 1,

More information

The Virtual Observatory and the IVOA

The Virtual Observatory and the IVOA The Virtual Observatory and the IVOA The Virtual Observatory Emergence of the Virtual Observatory concept by 2000 Concerns about the data avalanche, with in mind in particular very large surveys such as

More information