Semi-automatic creation of domain ontologies with centroid based crawlers. Carel Fenijn


Semi-automatic creation of domain ontologies with centroid based crawlers
Carel Fenijn
Graduate Thesis Doctoraal Linguistics
Utrecht University, December 2007

Contents

1 Introduction
    The World Wide Web
    The Semantic Web
    From World Wide Web to Semantic Web
2 Ontology Engineering
    Ontology Definitions
    Types of Ontologies
    Classification of Ontologies
    Ontology Languages
    Ontology Design
3 Ontology Learning
    Ontology Learning Techniques
    Ontology Editors and Engineering Tools
    Ontology Learning Approaches
    Assessment of Ontology Learning Approaches
4 Information Retrieval: Focused Crawling
    Definition Focused Crawling
    Focused Crawling Techniques
    Focused Crawling Approaches
    Assessment of Focused Crawling Approaches
5 OntoSpider
    The Ontology Engineering Component of OntoSpider
    The IR Component of OntoSpider
    Assessment of OntoSpider
Conclusion and Further Research
    Some Notes on Methodology
    Future Research
Bibliography

List of Figures

1.1 Layered Stack of the Semantic Web, from
2.1 An Ontology Scale by Lassila and McGuinness, 2001
2.2 An Ontology Scale by Daconta et al., 2003
Opening screen of Protege with the OntoLT plug-in marked in red
Tabs of OntoLT
Mapping Rule for Head Nouns and Modifiers
Above Rule in an older version of OntoLT
Simplified Possible View of OntoSpider
OntoSpider with OntoLT as the Ontology Learning Component
IR Component of OntoSpider
Rich output of OntoSpider

Abstract

Various approaches exist for the semi-automatic creation of ontologies from text. This thesis shows how centroid based focused crawlers can be used for this purpose, specifically with domain ontologies in specialized fields like linguistics as the target. The approach is highly modular: the highly specialized output corpus of its Information Retrieval component is the input to its Ontology Engineering component, which creates the ontologies. With this approach, domain ontologies can be created for subjects like natural language morphology. The overall approach combines techniques from Information Retrieval and Ontology Engineering. Some systems that could form the Ontology Engineering part of the approach are discussed. This study examines whether the use of centroid based focused crawlers can help in the semi-automatic creation of ontologies from text. More specifically, two types of focused crawlers are compared: a general purpose centroid based focused crawler and a literature crawler. The approach that is proposed here is called OntoSpider.


Acknowledgements

I would like to thank Dr. Paola Monachesi at Utrecht University, who supervised this thesis with good advice and much patience, as it was often difficult to make progress in this research in combination with a full-time job. Thanks also to my former boss at Demon, Jim Segrave, for allowing me to shift work hours in order to follow some courses that were relevant for this research. An interesting course in Information Retrieval by Maarten de Rijke and Valentin Jijkoun at the University of Amsterdam set me off in the direction of this research, and material from that course was used.


Chapter 1 Introduction

1.1 The World Wide Web

Information on the current Web, the World Wide Web (WWW), is stored in a decentralized way and may be available in many formats, traditionally mostly embedded in the relatively loose format HTML and increasingly in the more rigid XHTML, and it is often represented in a natural language. Metadata is only sparsely present, in the form of keywords and DTDs. If DTDs are available, they are often so generic that no really specific semantic data can be derived from them. For agents like web crawlers, it is often difficult to extract the right information from the WWW because agents do not understand natural languages. Web pages are mostly created for human consumption only. Finding specific information on the WWW is often a very time consuming endeavour, with limited results.

Because of the overload of information that is present on the World Wide Web, and the noise that accompanies it, much research takes place on Information Retrieval from the World Wide Web. For search engines like Google and Yahoo!, software agents called web crawlers or spiders crawl the Web to gather and index information and make it available. Typically, such search engines try to cover a very large part of the Web, and general purpose crawlers may be used for this. In order to keep the task manageable, efficient algorithms have been developed, like the PageRank algorithm [55].

One type of crawler that has been developed is the focused crawler or topic-oriented crawler [17]. Focused crawling approaches try to deal with the enormous mass of information that is contained in the Web in a more efficient way than general purpose crawlers do, and offer ways to extract very specific on-topic data from it, by selectively crawling the Web. This saves network traffic and processor time, as only smaller subsets of the Web are crawled. Apart from their more limited use of resources, focused crawlers may also yield better results for specific domains than general purpose ones. A centroid based focused crawler is a special type of focused crawler that makes use of a centroid, which is a representation of highly on-topic information. Focused crawlers in general, and centroid based ones in particular, will be described in the chapter on focused crawling.

1.2 The Semantic Web

The Semantic Web as presented in e.g. Tim Berners-Lee et al. ([7], 2001) is a vision of Tim Berners-Lee, the inventor of the current World Wide Web (WWW) and director of the World Wide Web Consortium (W3C). In Berners-Lee's vision, the future web, the Semantic Web, will also contain metadata. This metadata will enable agents to extract highly specific information from web pages and act intelligently on this information using logical inferences. Apart from the addition of metadata, other problems of the current Web will be addressed in the Semantic Web. One such problem is that of trust and reliability of data. In a layered model of the design of the Semantic Web, trust is even at the highest layer of the stack (see Figure 1.1), based on proof and cryptography. If intelligent agents can make logical inferences based on explicit formal metadata, they can also account for these inferences, which can be requested and verified by humans if necessary. The interpretation of metadata will be based on information that is present in ontologies. As on the current World Wide Web, information on the Semantic Web will be stored in a decentralized way.

Ontologies are central to the implementation of the Semantic Web. They contain domain knowledge, specific data regarding a certain subject field, in a very structured way.
Semantic Web Agents will be able to interpret information that is found in Web pages using these ontologies, as they give the agents precise information on those pages. Apart from that, based on ontologies, such agents will be able to communicate with each other, as the ontologies provide a shared understanding of a given domain. The development and maintenance of ontologies is part of Ontology Engineering. For the Semantic Web to work, very many ontologies will need to be available. In the literature, this is referred to as the bootstrapping problem of the Semantic Web, which is a kind of chicken and egg problem: ontologies are a necessary prerequisite for a working Semantic Web, but as long as there is no Semantic Web yet, many people are not interested in producing ontologies. One way of overcoming this problem may be to (semi-)automatically create a lot of ontologies from existing resources, like knowledge bases and the World Wide Web. As Ontology Engineering is such a central part of this study, it will be treated in more detail in a separate chapter.

Figure 1.1: Layered Stack of the Semantic Web, from

1.3 From World Wide Web to Semantic Web

The (semi-)automatic creation of ontologies has received quite some attention in recent years. The main reason for this is that manual creation of ontologies is tedious and costly work. On the one hand, we know that very many ontologies will need to be made in the bootstrapping phase and later phases of the Semantic Web, and much research has been carried out in this direction. On the other hand, in the Information Retrieval community, much work has been done on improving approaches and algorithms for efficient and effective IR on the current World Wide Web. One of the approaches in this field is that of focused crawlers, and here a centroid based approach is one of the options. We will investigate how these two research areas can be combined.

For domains that have already been covered by ontologies, the creation of such ontologies from scratch might be less useful, unless this is for the purpose of evaluating an Ontology Engineering approach. Enriching such existing ontologies makes more sense. The approach may be particularly useful in case no such ontologies exist yet and repositories of specialized research papers on a subject are available. Even once the Semantic Web is in a very advanced stage, there may still be highly specialized subjects on which no ontologies are available at all, and for which data on the World Wide Web can be of use.

The reason why not much other research combines the two fields of Information Retrieval and Ontology Engineering may be that the semi-automatic creation of ontologies is by itself already a very difficult research field with many unsolved problems, like NLP problems and knowledge engineering problems, and most researchers concentrate on just that, or even on sub-problems of it. For many specific purposes, it often suffices to select the set of relevant documents on which ontologies are based in a different way, for instance with clustering techniques. Also, the field of Information Retrieval has its own issues that need to be resolved, both for IR on the Web and on the Semantic Web.

Yet, a motivation for the combination of the two fields will be presented here. The World Wide Web, with all the shortcomings it has compared to the Semantic Web, does contain a huge wealth of information. As mentioned, focused crawlers can extract highly relevant and specialized information from this World Wide Web. The centroid that is used by centroid based focused crawlers and the set of downloaded pages should contain very specialized data. Usually the downloaded pages will be in a rough format like free text embedded in HTML. As was mentioned above, ontologies also contain very specific data. Clearly, the raw specialized data that is gathered by focused crawlers is a far cry from the richly structured data that is contained in ontologies. Still, it is interesting to study the hypothesis that a combination of a focused crawler approach and an approach of semi-automatic creation of ontologies from text can be fruitful.
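To make the notion of a simple vector based centroid more concrete, the following is a minimal sketch in Python, not the OntoSpider implementation. It assumes that documents are represented as plain term-frequency vectors, that the centroid is the average of the vectors of a few on-topic seed texts, and that candidate pages are scored by cosine similarity to that centroid. The seed texts and all function names are hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term-frequency vector for a text (whitespace tokens)."""
    return Counter(text.lower().split())


def centroid(vectors):
    """Average the term-frequency vectors of the on-topic seed documents."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    n = len(vectors)
    return {term: count / n for term, count in total.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical seed documents on natural language morphology.
seeds = [
    "inflection and derivation are morphological processes",
    "a morpheme is the smallest unit of morphological analysis",
]
topic = centroid([term_vector(s) for s in seeds])

# Score two candidate pages against the centroid.
score_on = cosine(topic, term_vector("derivation adds a morpheme to a stem"))
score_off = cosine(topic, term_vector("cheap flights hotel deals"))
```

Under these assumptions, a crawler would follow links from pages that score above some threshold and ignore the rest. Real systems usually weight terms (for example with tf-idf) and apply stopword removal first, which this sketch leaves out.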
A study which seems to confirm this hypothesis is Ehrig ([31], 2002). His approach, which will be described in some more detail below, also combines a focused crawler with ontology creation. However, it uses ontological metadata for the enhancement of focused crawls, instead of simple vector based centroids. The study at hand will do the opposite: examine how focused crawlers may help in the semi-automatic creation of ontologies from text. Another, more recent, study that combines focused crawling with ontology learning is Su et al. ([67], 2004). They also use ontologies for improving focused crawls. As a side-effect, the ontologies that are used are enriched in an automatic way. Some more detail on their approach will also follow in chapter 3.

Note that the definition of Ontology that is adopted here does not mention the Semantic Web at all. Even though ontologies play a crucial part in the emergence of the Semantic Web, the use of ontologies is more universal than that. In research projects, company intranets, etcetera, ontologies may play an important role as well, as part of Knowledge Management.

In general, one of the first stages in Ontology Engineering processes, like the manual construction of ontologies, is the enumeration of terms that will be part of the ontology. Noy et al. ([52], 2001) describe this as step 3, after determining the domain and scope of the ontology (step 1) and considering reusing existing ontologies (step 2). It may be interesting to see whether the set of terms in the centroid of the focused crawler might be a good starting point for this third step in manual ontology creation as well.

Now that a motivation for combining a focused crawler with the semi-automatic creation of ontologies from text has been presented, the central areas of this approach of combining Ontology Engineering and centroid based focused crawling will be described in more detail in the following chapters. Chapter 2 presents the field of Ontology Engineering, describing the main concepts of this field. Chapter 3 is on Ontology Learning. This chapter mainly presents concepts, techniques and approaches that are specific to the (semi-)automatic creation of ontologies. In the last chapter, chapter 5, the approach itself, OntoSpider, will be presented. This approach uses centroid based focused crawlers to semi-automatically create domain ontologies based on data that is available on the World Wide Web. More specifically, the results of a General Purpose Focused Crawler will be compared with those of a Literature Crawler, from an Ontology Engineering point of view. For this purpose, hypotheses will be proposed.


Chapter 2 Ontology Engineering

The field of Ontology Engineering studies the theory and practice of how Ontologies are designed and created. An overview of a recent state of the art in Ontology Engineering can be found in Gómez-Pérez et al. ([34], 2004). Much of this chapter is based on information from their book.

2.1 Ontology Definitions

Traditionally, Ontology is a branch of Philosophy that studies the being of things: their essence, existence, properties, nature, classification, etcetera. Ancient Greek philosophers like Parmenides and Aristotle made important contributions to this discipline, and philosophers have continued to study it up to modern times. More recently, the term Ontology has been adopted within a Knowledge Engineering setting. One very frequently cited definition is that of Gruber ([36], 1993): "An ontology is an explicit specification of a conceptualization." Clearly, this definition is rather vague, and other researchers have proposed definitions that are based on Gruber's, but that are more precise. Studer et al. (1998), as cited in [34], define an ontology as "a formal explicit specification of a shared conceptualization". Conceptualization refers to an abstract model of some phenomenon in the world by having identified the relevant concepts of that phenomenon. Explicit means that the type of concepts used, and the constraints on their use, are explicitly defined. Formal refers to the fact that the ontology should be machine-readable and processable. Shared reflects the notion that an ontology captures consensual knowledge, that is, it is not private of some individual, but accepted by a group. For the purpose of this study, this definition will be adopted. Ontology specifications are formulated in ontology languages, and various of these have been developed in recent years.

2.2 Types of Ontologies

In the literature, various types of ontologies have been proposed. Gómez-Pérez et al. ([34], 2004) mention Top-Level Ontologies, which mainly deal with universal abstract categories; General or Common Ontologies, which contain common sense information; Knowledge Representation Ontologies, for which a KR paradigm is characteristic; Task Ontologies, which are focused on a task or activity; Method Ontologies, which center around some method; and Application Ontologies, which are made for a specific application. The type of Ontology that this study is concerned with is the Domain Ontology. Characteristic of a Domain Ontology is that a specific domain, like a scientific discipline or a specific business, is the subject of the Ontology, and that the Ontology therefore typically uses a more specialized vocabulary.

2.3 Classification of Ontologies

It is very common to distinguish between lightweight and heavyweight ontologies. Scales or hierarchies have been proposed that place ontologies between shallow and heavyweight ones, which also reflects the expressiveness of the formalisms that are used for these ontologies. Such ontology scales can help to understand the differences, commonalities and relationships between e.g. semantic networks, thesauri, taxonomies, catalogs, ontologies, relational databases, UML, logics and the Object Oriented paradigm. One classification is that of Lassila and McGuinness ([41], 2001). In their paper, they argue that the RDF formalism can be seen as a frame based formalism, and that frame-based representation is a suitable paradigm for ontology creation.
They point out the connection between frame-based systems, object oriented programming and description logics, and argue that even catalogs, glossaries and controlled vocabularies could be seen as potential ontology specifications. The classification, which they present in the paper as An Ontology Spectrum, ranges from such catalogs and glossaries on one end to systems with general logical constraints on the other end. There is a clear line between systems with informal is-a relations and those with formal is-a relations.

Figure 2.1: An Ontology Scale by Lassila and McGuinness, 2001

Figure 2.2: An Ontology Scale by Daconta et al., 2003

While explaining the characteristics of taxonomies and thesauri and the difference between taxonomies and ontologies, and as a basis for their definition of ontologies, Daconta et al. ([23], 2003) propose an Ontology Spectrum with weak semantics on the lower end of the scale and strong semantics at the higher end. The scale ranges from the weakest, the Relational Model, via Taxonomy, Schema, ER, Thesaurus, Extended ER, XTM, RDF/S, Conceptual Model, UML, DAML+OIL, Description Logic, Local Domain Theory and First Order Logic to Modal Logic, which is at the high end of the spectrum. They describe at great length what taxonomies and ontologies are, and hold that the main difference between the two is that taxonomies lack the rigorous logic on which machines can base inferences, whereas ontologies do have such rigorous logic.

2.4 Ontology Languages

Ontology languages are the formal languages in which ontologies are defined. In this section, only some of the important characteristics of what are currently the main ontology languages will be described in a general, informal way. For formal specifications, there is ample literature available.

XML/XML Schema

XML (Extensible Markup Language) is a formal language that conforms with the SGML specifications. It can be seen as a subset of SGML, which is simpler and more practical in its use than SGML. Because XML, and XHTML which was derived from it, are more rigid in their definition than HTML, it is easier to process XML and XHTML automatically in a consistent way than HTML. One of the reasons for the W3C to develop XML was to deal with the shortcomings of HTML. In HTML, the representation of data and its presentation are mixed and messy; in XML they are strictly separated, enabling clear, unambiguous data representation with well-defined syntactic means. However, the use of XML is far broader and more far-reaching than just applications on the World Wide Web. At the time of writing, XML is the most common standard in use for Business to Business (B2B) information interchange.
XML Schema and its formal language XML Schema Definition (XSD) allow one to create data models and to specify data types and criteria by which XML documents are valid or not. Thus XML documents can be syntactically correct according to the XML specifications, but invalid given a specific XML Schema specification. An older schema language that was in common use for HTML and XML is DTD. DTDs are increasingly making way for XML Schema, but for historical reasons they are still in wide use. Unlike DTDs, XML Schema itself conforms to the XML specifications.
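The difference between a document that is syntactically correct and one that is valid against a schema can be illustrated with a small Python sketch. The well-formedness check uses the standard library parser. The validity check is only a toy stand-in for XML Schema validation: the `ALLOWED_CHILDREN` table and the element names are hypothetical, and a real XSD expresses far richer constraints.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """True if the text parses as XML at all (syntactic correctness)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical toy "schema": which child elements each element may contain.
ALLOWED_CHILDREN = {
    "person": {"name", "birthyear"},
    "name": set(),
    "birthyear": set(),
}


def is_valid(xml_text, allowed=ALLOWED_CHILDREN):
    """True if well-formed and every element has only permitted children."""
    if not is_well_formed(xml_text):
        return False
    root = ET.fromstring(xml_text)
    for element in root.iter():
        if element.tag not in allowed:
            return False
        if any(child.tag not in allowed[element.tag] for child in element):
            return False
    return True


good = "<person><name>John</name><birthyear>1950</birthyear></person>"
odd = "<person><name>John</name><shoesize>44</shoesize></person>"
broken = "<person><name>John</person>"
```

Here `odd` is well-formed XML but is rejected by the toy schema, while `broken` already fails at the syntactic level because its tags do not nest properly.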

In and of themselves, XML and XML Schema do not suffice as ontology languages, for only the correctness and validity of the syntax of XML documents can be determined, not the semantics of these documents. For example, the XML markup <dictator>john</dictator> and <gardiner>john</gardiner> do not in themselves mean anything different to an XML parser, even though humans who choose or read these tags will most likely assign a certain meaning to them. However, fully-fledged ontology languages which are capable of expressing complex meaning have been formulated fully in XML and XML Schema, which is the reason for mentioning XML here.

RDF(S)

One of the many formal languages that have been constructed in accordance with the XML specifications is the Resource Description Framework (RDF) [3]. It was developed by the W3C to provide a solid formal basis for ontology languages, expressing meaning with RDF triples. RDF triples are sets of three identifiers, resources, one of which intuitively functions as a subject, one as an object, and one as a predicate or relation between subject and object, much like meaning can be represented in many natural languages and in First Order Predicate Logic (FOL). A triple like (a, R, b) could be represented in FOL with the two place predicate R like so: Rab or R(a,b). The identifiers of RDF triples are often URIs; for the subject and the relation or predicate, this is always the case. The object can be either a URI or a literal. The URIs, which are often URLs on the Web, ensure explicitness and precision of data representation. For example, thousands of different entities called John can all have their own URL disambiguating them. Even though RDF's data model with RDF triples is simple, it is very expressive. Many RDF triples can combine into complicated webs of knowledge that are equivalent to semantic nets.
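The triple data model can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triples are modelled as plain tuples of strings, the `http://example.org/` namespace and all resource names are hypothetical, and no RDF library or serialization format is involved.

```python
EX = "http://example.org/"  # hypothetical namespace for illustration

# Each triple is (subject, predicate, object). Subject and predicate are
# URIs; the object is either a URI or a literal such as "John".
triples = {
    (EX + "john", EX + "worksAt", EX + "utrecht-university"),
    (EX + "john", EX + "name", "John"),
    (EX + "utrecht-university", EX + "locatedIn", EX + "utrecht"),
}


def match(store, subject=None, predicate=None, obj=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


def as_fol(triple):
    """Render a triple (a, R, b) in the FOL notation R(a, b)."""
    subject, predicate, obj = triple
    return f"{predicate}({subject}, {obj})"


about_john = match(triples, subject=EX + "john")
located = match(triples, predicate=EX + "locatedIn")
```

Because the object of one triple can be the subject of another, as with the hypothetical `utrecht-university` resource above, a set of such tuples already forms a small directed labeled graph.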
Even though FOL predicates with more than two places cannot be represented with a single RDF triple, they can be represented indirectly with multiple RDF triples. Also, reification is part of RDF, so it is possible to make statements about RDF statements in this data model. Furthermore, RDF containers are part of the RDF data model, with groups of resources like bags (unordered sets) and sequences (ordered sets). RDF has been extensively documented by the W3C, and all specifications are open in the RDF Concepts and Abstract Syntax document, the RDF Semantics document ([57]) and other documents. Also, a document like the RDF Primer ([56]) makes the technology accessible to the public. RDF does not necessarily have to be represented in XML. Shorthand notations exist like N3, and tuple notations like subject, predicate, object and <subject> <predicate> <object> are in use, as well as graphical representations with directed labeled graphs. Although RDF graphs are easy for humans to read, it is more efficient to serialize the data in XML format so that it is easy for computer programs to process it.

Unlike XML Schema in relation to XML, RDF Schema is not a schema language in which valid RDF representations are defined. RDF Schema was built on top of RDF, and can be seen as a limited, lightweight ontology language, in that the class and subclass relations are defined in it in a formal way, and RDF vocabularies can be formulated in which classes and properties are distinguished.

OWL

The constraints that RDFS imposes on RDF are quite limited. Other ontology languages were developed, which impose more and more precise constraints and allow the formulation of heavyweight ontologies. One such language is the Web Ontology Language (OWL), which exists in three types: OWL Lite, OWL DL and OWL Full. Like RDF, OWL is well documented with extensive open documentation, like the OWL Web Ontology Language Reference ([54]). Historically it descends from an earlier ontology language, DAML+OIL, which like OWL itself was based on RDF. In OWL Lite, relatively lightweight ontologies like taxonomies can be formulated; in OWL DL, which is more expressive, more heavyweight ones; and in OWL Full, which is the most expressive of the three, any ontologies that the RDF formalism allows for can be formulated. The choice of the type of OWL can depend on the purpose of a project: if one only needs to produce a taxonomy, OWL Lite can be the evident choice, also for reasons of decidability and efficiency. A quick overview of the OWL specifications can be found in the OWL Web Ontology Language Overview ([53]).

2.5 Ontology Design

Various very specific methodologies for ontology design have been proposed in the literature. They are applicable both to manual and to (semi-)automatic Ontology Design.
The main methodologies for defining a classification of classes or concepts in ontologies are the Top-Down one, which goes from general to specific, and the Bottom-Up one, which goes in the opposite direction, from specific to more general classes or concepts.

The Top-Down methodology departs from general concepts, going to specific ones. According to Uschold and Gruninger ([71], 1996), the amount of detail of the ontology is better controlled with this methodology than with the Bottom-Up methodology. A disadvantage of this methodology, however, is that it can become arbitrary which high-level concepts get a place in the ontology. In the Top-Down approach, the high-level concepts do not follow from the lower level concepts themselves. Therefore, the ontology could become less stable and the process may require more effort and re-work.

The Bottom-Up methodology goes from detailed and specific concepts to more general ones. Uschold and Gruninger ([71], 1996) maintain that this approach may also result in more effort and re-work, but for different reasons. The level of detail in the ontology may become very high in this approach, which may increase the chance of inconsistencies and may make commonalities between related concepts less transparent.

In the Middle-Out approach, one starts with the main concepts in the middle, i.e. those which are neither very high-level nor at the maximum of specificity. Uschold and Gruninger ([71], 1996) hold that this approach strikes a balance in the level of detail of the resulting ontology. High-level and low-level concepts only follow naturally from these main concepts from which one departs.

An approach that Uschold and Gruninger ([71], 1996) do not mention is that of Mixture Ontology design. Here, one could start with both high-level concepts and concepts at the lowest level, which have most detail, thus mixing the Top-Down and Bottom-Up approaches. It is expected that this methodology would suffer from the drawbacks of both of the other methodologies. All in all, the Middle-Out Ontology Design strategy seems to be the most promising.

Many approaches that involve some cyclic, iterative way of constructing ontologies will, because of this, include the possibility of enriching existing ontologies. Approaches may also focus on the enrichment of existing ontologies as a goal in itself. Apart from enriching existing ontologies, there are also strategies that reuse existing ontologies to create totally new ontologies. Strategies that merge two or more existing ontologies into one ontology exist as well.


Chapter 3 Ontology Learning

Ontology Learning is the acquisition of knowledge for the (semi-)automatic creation of ontologies. Very often Ontology Learning is from text, but it can also be from other sources, like databases. Because of the interdisciplinary nature of the subject, very many Ontology Learning approaches exist, and many methods and techniques are used in this field. Buitelaar et al. ([11], 2003) argue that, in spite of this multidisciplinary nature, Ontology Learning is a new and challenging area in its own right.

In this chapter, some existing surveys will first be treated. Then some specific approaches will be described that are similar or otherwise related to the OntoSpider approach that is presented in chapter 5. Finally, some general aspects of various approaches will be mentioned, like commonalities in system designs, convergence or divergence of NLP approaches, choice of AI technologies, etcetera. This study will mainly focus on ontology learning from text. It is not an exhaustive survey. The reason for examining various approaches in addition to using existing surveys was to get a better grasp of the subject matter and to avoid reinventing the wheel.

Roughly, two types of related work can be distinguished: work that is very similar to the total approach, i.e. it involves both a focused crawler and (semi-)automatic creation of ontologies from text, and work that is only similar to part of it, i.e. work that only involves the use of focused crawlers, mainly for scientific data gathering purposes, or that only involves the (semi-)automatic creation of ontologies.

3.1 Ontology Learning Techniques

Because of the multidisciplinary nature of the field, many existing techniques from fields like Artificial Intelligence and Information Retrieval are used for Ontology Learning. This section presents some techniques that may be used by various Ontology Learning approaches. Many of these techniques are related to text processing and analysis. There is often a choice of algorithms that can be used for the implementation of the techniques described here. Certain techniques are very general in nature, and might as well have been presented in the chapter on Information Retrieval, chapter 4.

Web Mining is a research area in which information is extracted from the World Wide Web. For this extraction, among other things, Text Mining may be used. Here the information extraction is specifically from texts in natural languages like English and French. Further techniques that are used include Chunk Parsing, POS Tagging and Semantic Tagging.

Stopping or stopword removal is a standard technique in NLP. The most frequent words in corpora, the stopwords, occur in practically any document. Hence they are not significant for most IR purposes, and are removed at a very early stage. Like stopping, stemming is very standard in NLP and IR. Stemming reduces the various forms of a word that may be the result of morphological processes like inflection and derivation to a single stem or root. Some often used stemmers are the Porter stemmer, which has modules for various languages, and the Lovins stemmer. From a morphological point of view, the results of stemmers are often quite crude, but from a pragmatic point of view they are still very effective.

Part-Of-Speech tagging or POS-tagging is also a very common technique in NLP and IR. One of the best known taggers is the Brill POS-tagger, another is the Monty POS-tagger.
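Stopping and stemming as described above can be sketched as a small Python pipeline. Note the assumptions: the stopword list is a tiny hand-picked sample, and `crude_stem` is a naive suffix stripper written for this illustration only. It is not the Porter or the Lovins algorithm, both of which apply far more careful rules.

```python
import re

# Tiny illustrative stopword list; real lists contain hundreds of words.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are"}

# Suffixes tried in order, longest first (an assumption of this sketch).
SUFFIXES = ("ations", "ation", "ings", "ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stopwords(tokens):
    """Stopping: drop tokens that are on the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]


def crude_stem(token):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


text = "The inflected forms of the words are derived in morphological processes"
stems = [crude_stem(t) for t in remove_stopwords(tokenize(text))]
# stems == ['inflect', 'form', 'word', 'deriv', 'morphological', 'process']
```

The output shows the crudeness mentioned in the text: "deriv" is not a linguistically meaningful stem, yet it still maps "derived" and "derives" to the same index term, which is what matters for retrieval.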
Chunk Parsing is a shallow technique by which natural language sentences are parsed into chunks. Very roughly, these chunks correspond to syntactic phrases. Chunk parsing often takes place after a POS-tagging phase, and the technique is widely used in IR. Approaches that have a chunk parser include SMES and SymOntos; the latter uses the CHAOS chunk parser. A specific application of chunk parsing is cascaded chunk parsing. In this approach, the output of one round of chunk parsing can be input to the next round, in which new chunks can be parsed, so that multiple consecutive rounds of chunk parsing can take place.

Semantic Tagging, or Semantic Annotation, is the enrichment of natural language texts like corpora with semantic tags. A semantically tagged text is often a step closer to an ontology, as it could be input to concept extraction modules or otherwise be used within Ontology Learning approaches. As part of Semantic Tagging, various types of resolution may take place, like Synonymy, Hyponymy, Hyperonymy and Meronymy Resolution.

Dill et al. ([28], 2003) maintain that automated large-scale semantic tagging of ambiguous content can bootstrap and accelerate the creation of the Semantic Web. One half of their approach is SemTag, a Semantic Tagger that works through three stages: one for spotting, with tokenizing and label extraction, one for learning, and finally one for the actual semantic tagging itself. The other half of the approach, Seeker, will be described in another section. In practice, most of the semantic taggers that exist today only produce shallow results. If the ontologies produced by systems that use them should not be shallow, the shallow semantic taggers can be combined with other techniques.

SMES is presented by Maedche and Staab ([42], 2000; [43], 2000; [44], 2001) as part of the Text-To-Onto approach. In [19] a Concept Extractor was developed for the Ontolo approach.

Clustering is an IR technique in which documents are grouped together in so-called clusters. This technique can be used for classification purposes, or as a preparatory step for further analysis of the documents.

Various approaches use simple pattern matching. Perl or sed regular expressions, for example, can be very powerful. Pattern matching can be especially useful in addition to other techniques, or as part of other techniques like POS-tagging.

Maedche and Staab emphasize the difference between taxonomic and non-taxonomic relation extraction from text. Much of the work that precedes theirs consists of very shallow approaches, which only succeed in taxonomic relation extraction. What is necessary according to these researchers is non-taxonomic relation extraction from text.
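As an illustration of how far simple pattern matching can go, the following Python sketch extracts candidate taxonomic (is-a) pairs with a single regular expression. The "such as" pattern is a well-known lexico-syntactic pattern chosen here as an assumption; it is not taken from any of the approaches discussed in this chapter, and the function name is hypothetical. It only handles single-word terms, which is exactly the kind of shallowness that Maedche and Staab point out.

```python
import re

# Pattern: "<hypernym> such as <term>, <term> and <term>".
# Single-word terms only; a real system would work on chunks instead.
SUCH_AS = re.compile(
    r"(?P<hyper>\w+)\s+such as\s+"
    r"(?P<hypos>\w+(?:\s*,\s*(?!and\b|or\b)\w+)*(?:\s*,?\s+(?:and|or)\s+\w+)?)"
)

# Separators between enumerated terms: a comma (optionally followed by
# "and"/"or") or a bare "and"/"or".
SEPARATOR = re.compile(r"\s*,\s*(?:(?:and|or)\s+)?|\s+(?:and|or)\s+")


def extract_taxonomic_candidates(text):
    """Return (hyponym, hypernym) candidate pairs found in the text."""
    pairs = []
    for match in SUCH_AS.finditer(text):
        hypernym = match.group("hyper")
        for hyponym in SEPARATOR.split(match.group("hypos")):
            if hyponym:
                pairs.append((hyponym, hypernym))
    return pairs


if __name__ == "__main__":
    sentence = ("Morphological processes such as inflection, derivation "
                "and compounding change word forms.")
    for hyponym, hypernym in extract_taxonomic_candidates(sentence):
        print(f"{hyponym} is-a {hypernym}")
```

Note that the hypernym is simply the word immediately before "such as" ("processes" in the example), not the full phrase "morphological processes". Recovering the full phrase would require chunk parsing, and recovering non-taxonomic relations would require more than patterns of this kind.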
3.2 Ontology Editors and Engineering Tools

Clearly, if the creation of ontologies from text is not done fully automatically but semi-automatically, an ontology engineer will have to correct, refine or expand the ontologies. For this purpose, Ontology Editors and Engineering Tools can be used. They can be considered part of overall semi-automatic approaches.

Da Silva et al. ([60], 2004) present a survey on econstruction, Ontology Engineering/Design tools, and Ontology Exploitation software tools. The focus of the study is on software tools, and the following Ontology Design tools are described: LexiCon, OilED, Protégé-2000, OntoEdit, LinkFactory, e-cognos and e-coser, TERMINAE, Text-to-Onto and OntoLearn. In the conclusion on Ontology Design tools, the authors state that Protégé is the most recommended software tool for various reasons, including OWL-compliance, the fact that it is freeware and the fact that it has a good base of developers around the world who support it. The Ontology Exploitation that is evaluated in the survey is outside the scope of this study.

OntoEdit is an ontology engineering environment that is presented by Maedche and Staab ([42], 2000; [43], 2000; [44], 2001) as part of the Text-To-Onto approach. Only a limited version of OntoEdit is free of charge. The tool runs on Windows and Linux platforms.

Protégé is a very popular Ontology Engineering Tool. It is an Open Source Java tool that can be used for editing domain ontologies or knowledge bases in a user-friendly way with a GUI. The tool comes with a clear tutorial and good documentation, and is used by a large community. It is scalable, platform-independent and easy to extend with plug-ins. Furthermore, it supports data in various formats, like RDF and OWL. Many specialized ontologies for various domains have been developed with Protégé. Linguistics-related ontologies include GOLD, an ontology for descriptive linguistics, and GUM, a general task- and domain-independent, linguistically motivated ontology.

3.3 Ontology Learning Approaches

Surveys of Ontology Learning Approaches

Various surveys of ontology-learning approaches exist. Some of these are briefly presented here in chronological order.

Maedche and Staab ([44, p.76-78], 2001) include a brief survey of ontology-learning approaches in the presentation of their own ontology-learning framework, which includes Text-To-Onto, SMES and OntoEdit. The survey covers the following domains: free text, dictionary, knowledge base, semi-structured and relational schemata. The methods mentioned for free text, the subject matter of OntoSpider, are clustering, inductive logic programming, association rules, frequency-based, pattern-matching and classification methods.
No extensive evaluation is made of the various approaches; they are presented in a table with references to the corresponding literature.

Ying Ding and Schubert Foo ([30], 2002) present a review of ontology generation, in which Infosleuth, SKC, AIFB approaches like SMES, OntoEditor and Text-To-Onto, ECAI 2000 (SVETLAN, Mo'K, SYLEX, ASIUM), Inductive Logic Programming (WOLFIE), DELOS, OntoWeb, DODDLE, and some more approaches are described. Before presenting these approaches, the authors give some general notes on ontology creation. An important conclusion they draw is that the complexity of relation extraction is the main impediment to ontology learning and its application, and that learning ontologies from text is still largely a theoretical enterprise, which is not advanced enough yet for real applications.

A more extensive and recent survey of existing approaches can be found in Gomez et al. ([33], 2003). Many researchers have contributed to this survey, and it is very systematic. The following domains are covered: text, machine-readable dictionaries, knowledge bases, structured data, semi-structured data and unstructured data. Of ontology learning from text, both methods and tools are described. The methods are usually named after one of the authors of the papers. The tools that are described are Caméléon, CORPORUM-Ontobuilder, DOE, KEA, LTG Text Processing Workbench, Mo'K Workbench, the OntoLearn Tool, Prométhé, SOAT, SubWordNet Engineering Process Tool, SVETLAN, TFIDF-based term classification system, TERMINAE, Text-To-Onto, TextStorm and Clouds, Welkin and WOLFIE. The authors do not pretend to present a complete survey, but do claim that the main approaches have been covered. The systematic presentation of the methods and approaches gives a very clear overview. One thing that may strike the reader because of this is that in many cases certain aspects of approaches are not disclosed in papers at all, which is indicated in the text with "information not available in papers". Approaches with semi-automatic creation of ontologies incorporate various modules.

Descriptions of Ontology Learning Approaches

TERMINAE is presented in various papers, like Biebow et al. ([8], 1999), as a methodology and a tool for building ontologies from text or from scratch. Much attention is given to linguistics, and formality and traceability are requirements. Lexter is used for the extraction of terms from text.
The approach, which focuses on technical text, evolved over time; one version uses Syntex and Caméléon as NLP tools for the subsequent linguistic analyses. The knowledge engineer is expected to have expertise in the subject area of the ontology and to have a good idea of how the resulting ontology will be applied. Intuitive GUIs can be used to construct and adapt ontologies. The role of the knowledge expert is crucial in this approach. After normalization, the domain knowledge is formalized in a description logic, which has rather limited expressive power. Subsequent work by the authors included work on other systems, like Géditerm, which also implemented part of the tasks of their methodology.

Text-To-Onto is presented by Maedche and Staab ([42], 2000; [43], 2000; [44], 2001) as an architecture and a system for semi-automatic creation of ontologies from text. It was used in the On-To-Knowledge project. The authors stress that most of the approaches prior to the year 2000 only got to the taxonomic level, but not further than that, and that non-taxonomic conceptual relations are an important goal in ontology engineering. This view corresponds with the classification of ontologies by Lassila and McGuinness (2001). Maedche and Staab use a balanced cooperative modeling paradigm as proposed by Morik (1993), which includes the use of Text Mining. An NLP module, SMES, is used for shallow text processing, with some extensions for heuristic correlations in order to attain a high recall of relevant linguistic dependency relations. SMES has access to a lexical database with German words. Dependency relations form the main output of SMES. Concept and relation extraction are performed by the learning module, the algorithm of which is based on Ramakrishnan Srikant and Rakesh Agrawal, Mining Generalized Association Rules (1995). The ontology engineer is presented with pairs of concepts which can be included in the ontology as non-taxonomic relations. For this purpose, OntoEdit is used. Furthermore, the ontology engineer can prune the resulting ontology, and decide whether it is necessary to iterate the ontology learning cycle or not. The authors stress that this is just one of various possible strategies.

OntoLearn, presented in Missikoff et al. ([48], 2002), is a system that can automatically extract concepts from text to form semantic nets and specialized domain ontologies from corpora. It uses WordNet and large domain corpora. Projects that used OntoLearn include Harmonise, which produced a large ontology on tourism. Other applications involved ontologies in the fields of Economy and Computer Networks. OntoLearn incorporates three main algorithms: one for terminology extraction, one for semantic disambiguation, and one for semantic annotation and the creation of ontologies.
A special algorithm, SSI (Structural Semantic Interconnections), was designed for semantic interpretation, which is also based on the principle of compositionality of meaning. The relevance of concepts that are extracted from a corpus is determined by comparing their frequencies of occurrence with those in a generic corpus, which functions as a contrast corpus. For the purpose of evaluating resulting ontologies, glosses were added to the OntoLearn system. These will be described in a later section.

SymOntos is described by Missikoff et al. ([47], 2001). It is an approach that uses Web Mining for Ontology creation and enrichment. The ontological data that is created is not very rich: it is at about the taxonomic level, but it can be used for the creation or enrichment of ontologies.

Ontolo, as presented in Chetrit ([19], 2004), is a tool for facilitating Ontology Construction from texts, which is in fact literally the title of the thesis. The user manually inserts articles from the PubMed database. After POS-tagging, stemming and concept extraction, rudimentary ontologies are created by the Ontology Construction tool.
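The contrast-corpus comparison described above for OntoLearn can be made concrete with a small Python sketch. The scoring formula used here, a smoothed ratio of relative frequencies, is an assumption chosen for illustration and is not claimed to be the measure that OntoLearn actually uses; the function names and the toy token lists are hypothetical as well.

```python
from collections import Counter


def relative_frequencies(tokens):
    """Map each term to its share of all tokens in the corpus."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {term: n / total for term, n in counts.items()}


def domain_relevance(domain_tokens, contrast_tokens, smoothing=1e-6):
    """Score each domain term by how much more frequent it is in the
    domain corpus than in the generic contrast corpus.

    The smoothed ratio is an illustrative assumption, not OntoLearn's
    published measure.
    """
    domain = relative_frequencies(domain_tokens)
    contrast = relative_frequencies(contrast_tokens)
    return {
        term: freq / (contrast.get(term, 0.0) + smoothing)
        for term, freq in domain.items()
    }


def top_terms(domain_tokens, contrast_tokens, n=3):
    """Return the n terms with the highest relevance score."""
    scores = domain_relevance(domain_tokens, contrast_tokens)
    return sorted(scores, key=scores.get, reverse=True)[:n]


if __name__ == "__main__":
    domain = ["morpheme", "the", "affix", "morpheme", "the", "stem"]
    contrast = ["the", "house", "the", "car", "the", "tree"]
    print(top_terms(domain, contrast, n=1))  # prints ['morpheme']
```

A term like "the", which is frequent in both corpora, receives a low score, whereas a term that is frequent only in the domain corpus is ranked high.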

The system Asium is presented in Faure et al. ([32], 1998). It is a system that automatically acquires semantic knowledge and ontologies from text with Machine Learning techniques. Another system, Sylex, is used for syntactic parsing; after this parsing and post-processing, syntactic frames of clauses are produced. Along with these, an ontology of concepts can be formed. The clustering of words can be done in a hierarchical or in a pyramidal way. Pyramids of clusters are richer than simple hierarchies because multiple parents are possible. The relevance of concepts that can be derived from clusters is determined with a similarity measure that indicates how close clusters are to each other. The user interactively validates learned clusters. For this purpose, a GUI is part of the system.

GATE is presented in Cunningham et al. ([22], 2002) and Bontcheva et al. ([9], 2004) as a framework and a graphical development environment for Language Engineering. It was implemented in Java, is freely available, modular, Open Source and well documented in articles and with an extensive user guide. GATE can be seen as a general-purpose and flexible tool for NLP. Here, only GATE v2 will be described. The authors distinguish the following GATE resources: language resources (LRs), processing resources (PRs) and visual resources (VRs). The language resources with declarative data are strictly separated from the processing resources and the visual resources, which enables the users, e.g. linguists or programmers, to concentrate on their own field of expertise in their work on GATE. All resources together are called CREOLE, a Collection of REusable Objects for Language Engineering. The GATE resources can be accessed both via a GUI and via the GATE API. The API makes it easier to automate certain tasks.
GATE can deal with various data formats, which are converted into a GATE-specific XML format before they are further processed. Examples of processing resources that are available in GATE include tokenizers and POS-taggers. An important part of GATE v2 is JAPE, an engine for regular expressions that is based on finite state technology. Although the use of finite state technology does not guarantee efficient processing, most tasks that are performed with the JAPE engine are generally efficient. Other modules that are available as plug-ins to GATE include an implementation of the Google API and a web crawler. Various resources that can process ontologies are available for GATE v2. The OntoGazetteer is an interface that enables one to view ontologies. With the OntoGazetteer Editor, the class hierarchies of RDF or RDF(S) ontologies can be edited. Protégé has been integrated with GATE.

OntoLT is a very likely candidate for the Ontology Engineering component of OntoSpider. For this reason, it will be described in more detail here. OntoLT is a plug-in for Protégé that requires Sun's Java Runtime Environment (JRE). A first beta version of the plug-in was made available to the public in November. The current version is 2.0, and it works with version 3.x of Protégé. The following description is based on [10], [12], [13], [14], [15], [16] and [62], and on evaluations that were done with an earlier version on a machine running FreeBSD 5.x. Most screenshots were taken from the latest version.

Figure 3.1: Opening screen of Protege with the OntoLT plug-in marked in red

In Protégé, the OntoLT plug-in is represented by a tab. OntoLT takes an XML-annotated corpus as input. The format of this XML annotation is proprietary and is called MM. The MM format encodes morphological, syntactic and semantic information. A software package that can produce the necessary XML annotation automatically is SCHUG/WebSchug. SCHUG, which stands for Shallow and Chunk based Unification Grammar, was introduced by Declerck et al. ([24], 2002) and later in other work like Declerck et al. ([26], 2003). SCHUG maps XML with linguistic information onto feature structures on which unification can work, activating rules that can operate on the linguistic data. The technique of Cascaded Chunk Processing is used at this point to perform various kinds of linguistic processing. The output of SCHUG is again XML-encoded data, enriched with more linguistic annotations. SCHUG is able to process various natural languages, like German and Spanish, which is demonstrated in Declerck et al. ([24], 2002), as well as Central and Eastern European languages, as is demonstrated in Declerck et al. ([26], 2003). Another application outside OntoLT that uses SCHUG is the MUMIS project, which performs Information Extraction on multimedia resources in the field of soccer. MUMIS is described in various papers, like Declerck et al. ([25], 2002).

A corpus consists of one or more documents that are marked in XML with <document> tags. Every document is represented by a separate file on disk and can consist of one or more sentences, indicated by <sentence> tags. Sentences consist of clauses, phrases and text, indicated with tags of the same name. Text contains <token> tags, from which the original sentences can be reconstructed. A simplified abstract example of the XML structure follows; the dots are an informal representation of partial information:

    <?xml version="1.0" encoding="ISO..."?>
    <document name="./example.xml" date="...">
      <sentence id="1" stype="decl" corresp="...">
        <clauses> ... </clauses>
        <phrases> ... </phrases>
        <text>
          <token> ... </token>
        </text>
      </sentence>
      <sentence id="..."> ... </sentence>
    </document>

A sample XML-annotated English corpus that was supplied by SCHUG is included in the OntoLT package. Manual XML annotation is tedious, and the OntoLT plug-in is meant for semi-automatic use anyway, so the only realistic alternatives to using SCHUG are writing an alternative semi-automatic XML annotator or adapting an existing one to deal with this specific XML format. For the purpose of this study, WebSchug was chosen. Much is done by the module that produces the input XML for OntoLT: it takes care of POS-tagging, (other) morphological analysis, syntactic analysis and lexical semantic tagging, and provides XML markup for all this.

Figure 3.2: Tabs of OntoLT

When the OntoLT tab is clicked, tabs for Operators, Mappings, Conditions and Corpora become visible (Figure 3.2). In the Corpora tab, new corpora can be imported. For this purpose, multiple XML-annotated files can be selected and together be given a corpus name. Clicking on the binoculars in the Candidate View tab and selecting the corpus then extracts candidate classes, slots and instances. The name of the extraction is derived from the time at which it took place. Extracted candidates can be inspected by clicking on key icons. The user can choose the candidates with which the resulting ontology should be enriched. At the time of writing, OntoLT only allows for the extension of ontologies, not for creating smaller ontologies from existing ones.

The extraction takes place based on XPath expressions, which can be found under the XPaths tab. If an XPath expression matches, a mapping rule is activated, based on which candidates may be extracted. Both the XPath expressions and the mapping rules can be adjusted or added to by the user. For the XPath expressions, a precondition language is available, comprising the predicates containspath, HasValue, HasConcept, AND, OR, NOT and EQUAL, and the function ID. The beta version of OntoLT 1.0 includes two mapping rules which consist of large conjunctions of conditions (Figure 3.3).

Figure 3.3: Mapping Rule for Head Nouns and Modifiers

Figure 3.4: Above Rule in an older version of OntoLT
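To give an impression of how candidates can be selected from an XML-annotated corpus with XPath expressions, here is a minimal Python sketch that uses only the standard library. It does not use OntoLT or the actual MM format: the `pos` attribute on the tokens, its values and the function names are hypothetical, introduced only for this illustration, and the "mapping rule" is reduced to a single XPath test.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified annotation loosely following the
# document/sentence/text/token structure described above.
# The "pos" attribute and its values are assumptions, not the MM format.
SAMPLE = """<document name="./example.xml">
  <sentence id="1" stype="decl">
    <text>
      <token id="t1" pos="DET">The</token>
      <token id="t2" pos="ADJ">inflectional</token>
      <token id="t3" pos="N">morpheme</token>
      <token id="t4" pos="V">attaches</token>
    </text>
  </sentence>
</document>"""


def reconstruct_sentences(root):
    """Rebuild each original sentence from its <token> elements."""
    return [
        " ".join(token.text for token in sentence.findall("./text/token"))
        for sentence in root.findall("./sentence")
    ]


def candidate_classes(root):
    """Toy mapping rule: every token matched by the XPath expression
    below (a noun) is proposed as a candidate class."""
    return [token.text for token in root.findall(".//token[@pos='N']")]


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    print(reconstruct_sentences(root))  # prints ['The inflectional morpheme attaches']
    print(candidate_classes(root))      # prints ['morpheme']
```

The standard library supports only a subset of XPath, which suffices for this sketch. As in OntoLT, the proposed candidates would still have to be inspected and accepted or rejected by the ontology engineer.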


The Model-Driven Semantic Web Emerging Standards & Technologies The Model-Driven Semantic Web Emerging Standards & Technologies Elisa Kendall Sandpiper Software March 24, 2005 1 Model Driven Architecture (MDA ) Insulates business applications from technology evolution,

More information

Chapter 27 Introduction to Information Retrieval and Web Search

Chapter 27 Introduction to Information Retrieval and Web Search Chapter 27 Introduction to Information Retrieval and Web Search Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 27 Outline Information Retrieval (IR) Concepts Retrieval

More information

Ontology Research and Development Part 1 A Review of Ontology Generation

Ontology Research and Development Part 1 A Review of Ontology Generation Ontology Research and Development Part 1 A Review of Ontology Generation Ying Ding Division of Mathematics and Computer Science Vrije Universiteit, Amsterdam (ying@cs.vu.nl) Schubert Foo Division of Information

More information

Business Rules in the Semantic Web, are there any or are they different?

Business Rules in the Semantic Web, are there any or are they different? Business Rules in the Semantic Web, are there any or are they different? Silvie Spreeuwenberg, Rik Gerrits LibRT, Silodam 364, 1013 AW Amsterdam, Netherlands {silvie@librt.com, Rik@LibRT.com} http://www.librt.com

More information

A Lightweight Approach to Semantic Tagging

A Lightweight Approach to Semantic Tagging A Lightweight Approach to Semantic Tagging Nadzeya Kiyavitskaya, Nicola Zeni, Luisa Mich, John Mylopoulus Department of Information and Communication Technologies, University of Trento Via Sommarive 14,

More information

XML ALONE IS NOT SUFFICIENT FOR EFFECTIVE WEBEDI

XML ALONE IS NOT SUFFICIENT FOR EFFECTIVE WEBEDI Chapter 18 XML ALONE IS NOT SUFFICIENT FOR EFFECTIVE WEBEDI Fábio Ghignatti Beckenkamp and Wolfgang Pree Abstract: Key words: WebEDI relies on the Internet infrastructure for exchanging documents among

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 4, Jul-Aug 2015

International Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 4, Jul-Aug 2015 RESEARCH ARTICLE OPEN ACCESS Multi-Lingual Ontology Server (MOS) For Discovering Web Services Abdelrahman Abbas Ibrahim [1], Dr. Nael Salman [2] Department of Software Engineering [1] Sudan University

More information

Enabling Semantic Search in Large Open Source Communities

Enabling Semantic Search in Large Open Source Communities Enabling Semantic Search in Large Open Source Communities Gregor Leban, Lorand Dali, Inna Novalija Jožef Stefan Institute, Jamova cesta 39, 1000 Ljubljana {gregor.leban, lorand.dali, inna.koval}@ijs.si

More information

XML Support for Annotated Language Resources

XML Support for Annotated Language Resources XML Support for Annotated Language Resources Nancy Ide Department of Computer Science Vassar College Poughkeepsie, New York USA ide@cs.vassar.edu Laurent Romary Equipe Langue et Dialogue LORIA/CNRS Vandoeuvre-lès-Nancy,

More information

Information Retrieval

Information Retrieval Multimedia Computing: Algorithms, Systems, and Applications: Information Retrieval and Search Engine By Dr. Yu Cao Department of Computer Science The University of Massachusetts Lowell Lowell, MA 01854,

More information

Semantic Web Mining and its application in Human Resource Management

Semantic Web Mining and its application in Human Resource Management International Journal of Computer Science & Management Studies, Vol. 11, Issue 02, August 2011 60 Semantic Web Mining and its application in Human Resource Management Ridhika Malik 1, Kunjana Vasudev 2

More information

Annotation Science From Theory to Practice and Use Introduction A bit of history

Annotation Science From Theory to Practice and Use Introduction A bit of history Annotation Science From Theory to Practice and Use Nancy Ide Department of Computer Science Vassar College Poughkeepsie, New York 12604 USA ide@cs.vassar.edu Introduction Linguistically-annotated corpora

More information

Text Mining. Representation of Text Documents

Text Mining. Representation of Text Documents Data Mining is typically concerned with the detection of patterns in numeric data, but very often important (e.g., critical to business) information is stored in the form of text. Unlike numeric data,

More information

Semantics and Ontologies for Geospatial Information. Dr Kristin Stock

Semantics and Ontologies for Geospatial Information. Dr Kristin Stock Semantics and Ontologies for Geospatial Information Dr Kristin Stock Introduction The study of semantics addresses the issue of what data means, including: 1. The meaning and nature of basic geospatial

More information

Extracting knowledge from Ontology using Jena for Semantic Web

Extracting knowledge from Ontology using Jena for Semantic Web Extracting knowledge from Ontology using Jena for Semantic Web Ayesha Ameen I.T Department Deccan College of Engineering and Technology Hyderabad A.P, India ameenayesha@gmail.com Khaleel Ur Rahman Khan

More information

Enhanced retrieval using semantic technologies:

Enhanced retrieval using semantic technologies: Enhanced retrieval using semantic technologies: Ontology based retrieval as a new search paradigm? - Considerations based on new projects at the Bavarian State Library Dr. Berthold Gillitzer 28. Mai 2008

More information

An Evaluation of Geo-Ontology Representation Languages for Supporting Web Retrieval of Geographical Information

An Evaluation of Geo-Ontology Representation Languages for Supporting Web Retrieval of Geographical Information An Evaluation of Geo-Ontology Representation Languages for Supporting Web Retrieval of Geographical Information P. Smart, A.I. Abdelmoty and C.B. Jones School of Computer Science, Cardiff University, Cardiff,

More information

KNOWLEDGE MANAGEMENT VIA DEVELOPMENT IN ACCOUNTING: THE CASE OF THE PROFIT AND LOSS ACCOUNT

KNOWLEDGE MANAGEMENT VIA DEVELOPMENT IN ACCOUNTING: THE CASE OF THE PROFIT AND LOSS ACCOUNT KNOWLEDGE MANAGEMENT VIA DEVELOPMENT IN ACCOUNTING: THE CASE OF THE PROFIT AND LOSS ACCOUNT Tung-Hsiang Chou National Chengchi University, Taiwan John A. Vassar Louisiana State University in Shreveport

More information

Towards Ontology Mapping: DL View or Graph View?

Towards Ontology Mapping: DL View or Graph View? Towards Ontology Mapping: DL View or Graph View? Yongjian Huang, Nigel Shadbolt Intelligence, Agents and Multimedia Group School of Electronics and Computer Science University of Southampton November 27,

More information

CS473: Course Review CS-473. Luo Si Department of Computer Science Purdue University

CS473: Course Review CS-473. Luo Si Department of Computer Science Purdue University CS473: CS-473 Course Review Luo Si Department of Computer Science Purdue University Basic Concepts of IR: Outline Basic Concepts of Information Retrieval: Task definition of Ad-hoc IR Terminologies and

More information

Semantic Web. Tahani Aljehani

Semantic Web. Tahani Aljehani Semantic Web Tahani Aljehani Motivation: Example 1 You are interested in SOAP Web architecture Use your favorite search engine to find the articles about SOAP Keywords-based search You'll get lots of information,

More information

New Approach to Graph Databases

New Approach to Graph Databases Paper PP05 New Approach to Graph Databases Anna Berg, Capish, Malmö, Sweden Henrik Drews, Capish, Malmö, Sweden Catharina Dahlbo, Capish, Malmö, Sweden ABSTRACT Graph databases have, during the past few

More information

Proposal for Implementing Linked Open Data on Libraries Catalogue

Proposal for Implementing Linked Open Data on Libraries Catalogue Submitted on: 16.07.2018 Proposal for Implementing Linked Open Data on Libraries Catalogue Esraa Elsayed Abdelaziz Computer Science, Arab Academy for Science and Technology, Alexandria, Egypt. E-mail address:

More information

The Semantic Web Revisited. Nigel Shadbolt Tim Berners-Lee Wendy Hall

The Semantic Web Revisited. Nigel Shadbolt Tim Berners-Lee Wendy Hall The Semantic Web Revisited Nigel Shadbolt Tim Berners-Lee Wendy Hall Today sweb It is designed for human consumption Information retrieval is mainly supported by keyword-based search engines Some problems

More information

Pedigree Management and Assessment Framework (PMAF) Demonstration

Pedigree Management and Assessment Framework (PMAF) Demonstration Pedigree Management and Assessment Framework (PMAF) Demonstration Kenneth A. McVearry ATC-NY, Cornell Business & Technology Park, 33 Thornwood Drive, Suite 500, Ithaca, NY 14850 kmcvearry@atcorp.com Abstract.

More information

Manually vs semiautomatic domain specific ontology building

Manually vs semiautomatic domain specific ontology building Facoltà di Lettere e Filosofia Corso di Laurea Specialistica in Comunicazione d impresa e pubblica Tesi di Laurea in Informatica per il Commercio Elettronico Manually vs semiautomatic domain specific ontology

More information

Ontology Development. Qing He

Ontology Development. Qing He A tutorial report for SENG 609.22 Agent Based Software Engineering Course Instructor: Dr. Behrouz H. Far Ontology Development Qing He 1 Why develop an ontology? In recent years the development of ontologies

More information

The Semantic Web: A Vision or a Dream?

The Semantic Web: A Vision or a Dream? The Semantic Web: A Vision or a Dream? Ben Weber Department of Computer Science California Polytechnic State University May 15, 2005 Abstract The Semantic Web strives to be a machine readable version of

More information

Semantic Web Company. PoolParty - Server. PoolParty - Technical White Paper.

Semantic Web Company. PoolParty - Server. PoolParty - Technical White Paper. Semantic Web Company PoolParty - Server PoolParty - Technical White Paper http://www.poolparty.biz Table of Contents Introduction... 3 PoolParty Technical Overview... 3 PoolParty Components Overview...

More information

Knowledge Retrieval. Franz J. Kurfess. Computer Science Department California Polytechnic State University San Luis Obispo, CA, U.S.A.

Knowledge Retrieval. Franz J. Kurfess. Computer Science Department California Polytechnic State University San Luis Obispo, CA, U.S.A. Knowledge Retrieval Franz J. Kurfess Computer Science Department California Polytechnic State University San Luis Obispo, CA, U.S.A. 1 Acknowledgements This lecture series has been sponsored by the European

More information

Design and Implementation of an RDF Triple Store

Design and Implementation of an RDF Triple Store Design and Implementation of an RDF Triple Store Ching-Long Yeh and Ruei-Feng Lin Department of Computer Science and Engineering Tatung University 40 Chungshan N. Rd., Sec. 3 Taipei, 04 Taiwan E-mail:

More information

Information Extraction Techniques in Terrorism Surveillance

Information Extraction Techniques in Terrorism Surveillance Information Extraction Techniques in Terrorism Surveillance Roman Tekhov Abstract. The article gives a brief overview of what information extraction is and how it might be used for the purposes of counter-terrorism

More information

<is web> Information Systems & Semantic Web University of Koblenz Landau, Germany

<is web> Information Systems & Semantic Web University of Koblenz Landau, Germany Information Systems & University of Koblenz Landau, Germany Semantic Search examples: Swoogle and Watson Steffen Staad credit: Tim Finin (swoogle), Mathieu d Aquin (watson) and their groups 2009-07-17

More information

H1 Spring B. Programmers need to learn the SOAP schema so as to offer and use Web services.

H1 Spring B. Programmers need to learn the SOAP schema so as to offer and use Web services. 1. (24 points) Identify all of the following statements that are true about the basics of services. A. If you know that two parties implement SOAP, then you can safely conclude they will interoperate at

More information

Ontology-based Architecture Documentation Approach

Ontology-based Architecture Documentation Approach 4 Ontology-based Architecture Documentation Approach In this chapter we investigate how an ontology can be used for retrieving AK from SA documentation (RQ2). We first give background information on the

More information

WHY WE NEED AN XML STANDARD FOR REPRESENTING BUSINESS RULES. Introduction. Production rules. Christian de Sainte Marie ILOG

WHY WE NEED AN XML STANDARD FOR REPRESENTING BUSINESS RULES. Introduction. Production rules. Christian de Sainte Marie ILOG WHY WE NEED AN XML STANDARD FOR REPRESENTING BUSINESS RULES Christian de Sainte Marie ILOG Introduction We are interested in the topic of communicating policy decisions to other parties, and, more generally,

More information

THE TECHNIQUES FOR THE ONTOLOGY-BASED INFORMATION RETRIEVAL

THE TECHNIQUES FOR THE ONTOLOGY-BASED INFORMATION RETRIEVAL THE TECHNIQUES FOR THE ONTOLOGY-BASED INFORMATION RETRIEVAL Myunggwon Hwang 1, Hyunjang Kong 1, Sunkyoung Baek 1, Kwangsu Hwang 1, Pankoo Kim 2 1 Dept. of Computer Science Chosun University, Gwangju, Korea

More information

TERM BASED WEIGHT MEASURE FOR INFORMATION FILTERING IN SEARCH ENGINES

TERM BASED WEIGHT MEASURE FOR INFORMATION FILTERING IN SEARCH ENGINES TERM BASED WEIGHT MEASURE FOR INFORMATION FILTERING IN SEARCH ENGINES Mu. Annalakshmi Research Scholar, Department of Computer Science, Alagappa University, Karaikudi. annalakshmi_mu@yahoo.co.in Dr. A.

More information

Semantic Web Domain Knowledge Representation Using Software Engineering Modeling Technique

Semantic Web Domain Knowledge Representation Using Software Engineering Modeling Technique Semantic Web Domain Knowledge Representation Using Software Engineering Modeling Technique Minal Bhise DAIICT, Gandhinagar, Gujarat, India 382007 minal_bhise@daiict.ac.in Abstract. The semantic web offers

More information

DCMI Abstract Model - DRAFT Update

DCMI Abstract Model - DRAFT Update 1 of 7 9/19/2006 7:02 PM Architecture Working Group > AMDraftUpdate User UserPreferences Site Page Actions Search Title: Text: AttachFile DeletePage LikePages LocalSiteMap SpellCheck DCMI Abstract Model

More information

Oracle Enterprise Data Quality for Product Data

Oracle Enterprise Data Quality for Product Data Oracle Enterprise Data Quality for Product Data Glossary Release 5.6.2 E24157-01 July 2011 Oracle Enterprise Data Quality for Product Data Glossary, Release 5.6.2 E24157-01 Copyright 2001, 2011 Oracle

More information

THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE

THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE ONTOLOGY LEARNING FOR THE SEMANTIC WEB ONTOLOGY LEARNING FOR THE SEMANTIC WEB by Alexander Maedche University of Karlsruhe, Germany SPRINGER

More information

Knowledge Representation, Ontologies, and the Semantic Web

Knowledge Representation, Ontologies, and the Semantic Web Knowledge Representation, Ontologies, and the Semantic Web Evimaria Terzi 1, Athena Vakali 1, and Mohand-Saïd Hacid 2 1 Informatics Dpt., Aristotle University, 54006 Thessaloniki, Greece evimaria,avakali@csd.auth.gr

More information

Standard Business Rules Language: why and how? ICAI 06

Standard Business Rules Language: why and how? ICAI 06 Standard Business Rules Language: why and how? ICAI 06 M. Diouf K. Musumbu S. Maabout LaBRI (UMR 5800 du CNRS), 351, cours de la Libération, F-33.405 TALENCE Cedex e-mail: {diouf, musumbu, maabout}@labri.fr

More information

STS Infrastructural considerations. Christian Chiarcos

STS Infrastructural considerations. Christian Chiarcos STS Infrastructural considerations Christian Chiarcos chiarcos@uni-potsdam.de Infrastructure Requirements Candidates standoff-based architecture (Stede et al. 2006, 2010) UiMA (Ferrucci and Lally 2004)

More information

Vocabulary Harvesting Using MatchIT. By Andrew W Krause, Chief Technology Officer

Vocabulary Harvesting Using MatchIT. By Andrew W Krause, Chief Technology Officer July 31, 2006 Vocabulary Harvesting Using MatchIT By Andrew W Krause, Chief Technology Officer Abstract Enterprises and communities require common vocabularies that comprehensively and concisely label/encode,

More information

Semantic Web Technologies

Semantic Web Technologies 1/57 Introduction and RDF Jos de Bruijn debruijn@inf.unibz.it KRDB Research Group Free University of Bolzano, Italy 3 October 2007 2/57 Outline Organization Semantic Web Limitations of the Web Machine-processable

More information

Domain Specific Semantic Web Search Engine

Domain Specific Semantic Web Search Engine Domain Specific Semantic Web Search Engine KONIDENA KRUPA MANI BALA 1, MADDUKURI SUSMITHA 2, GARRE SOWMYA 3, GARIKIPATI SIRISHA 4, PUPPALA POTHU RAJU 5 1,2,3,4 B.Tech, Computer Science, Vasireddy Venkatadri

More information

The Semantic Web: Yet Another Hip?

The Semantic Web: Yet Another Hip? to appear in Data and Knowledge Engineering, 2002, 18.12.01 1 The Semantic Web: Yet Another Hip? Ying Ding, Dieter Fensel, Michel Klein, and Borys Omelayenko Division of Mathmatics & Computer Science,

More information

Implementing a Knowledge Database for Scientific Control Systems. Daniel Gresh Wheatland-Chili High School LLE Advisor: Richard Kidder Summer 2006

Implementing a Knowledge Database for Scientific Control Systems. Daniel Gresh Wheatland-Chili High School LLE Advisor: Richard Kidder Summer 2006 Implementing a Knowledge Database for Scientific Control Systems Abstract Daniel Gresh Wheatland-Chili High School LLE Advisor: Richard Kidder Summer 2006 A knowledge database for scientific control systems

More information

Introduction. October 5, Petr Křemen Introduction October 5, / 31

Introduction. October 5, Petr Křemen Introduction October 5, / 31 Introduction Petr Křemen petr.kremen@fel.cvut.cz October 5, 2017 Petr Křemen (petr.kremen@fel.cvut.cz) Introduction October 5, 2017 1 / 31 Outline 1 About Knowledge Management 2 Overview of Ontologies

More information

Data formats for exchanging classifications UNSD

Data formats for exchanging classifications UNSD ESA/STAT/AC.234/22 11 May 2011 UNITED NATIONS DEPARTMENT OF ECONOMIC AND SOCIAL AFFAIRS STATISTICS DIVISION Meeting of the Expert Group on International Economic and Social Classifications New York, 18-20

More information

Final Project Discussion. Adam Meyers Montclair State University

Final Project Discussion. Adam Meyers Montclair State University Final Project Discussion Adam Meyers Montclair State University Summary Project Timeline Project Format Details/Examples for Different Project Types Linguistic Resource Projects: Annotation, Lexicons,...

More information

Automation of Semantic Web based Digital Library using Unified Modeling Language Minal Bhise 1 1

Automation of Semantic Web based Digital Library using Unified Modeling Language Minal Bhise 1 1 Automation of Semantic Web based Digital Library using Unified Modeling Language Minal Bhise 1 1 Dhirubhai Ambani Institute for Information and Communication Technology, Gandhinagar, Gujarat, India Email:

More information

University of Huddersfield Repository

University of Huddersfield Repository University of Huddersfield Repository Olszewska, Joanna Isabelle, Simpson, Ron and McCluskey, T.L. Appendix A: epronto: OWL Based Ontology for Research Information Management Original Citation Olszewska,

More information

Mapping between Digital Identity Ontologies through SISM

Mapping between Digital Identity Ontologies through SISM Mapping between Digital Identity Ontologies through SISM Matthew Rowe The OAK Group, Department of Computer Science, University of Sheffield, Regent Court, 211 Portobello Street, Sheffield S1 4DP, UK m.rowe@dcs.shef.ac.uk

More information

CHAPTER 1 INTRODUCTION

CHAPTER 1 INTRODUCTION 1 CHAPTER 1 INTRODUCTION Most of today s Web content is intended for the use of humans rather than machines. While searching documents on the Web using computers, human interpretation is required before

More information