Next generation knowledge access

John Davies, Alistair Duke, Nick Kings, Dunja Mladenić, Kalina Bontcheva, Miha Grčar, Richard Benjamins, Jesus Contreras, Mercedes Blazquez Civico and Tim Glover

Abstract
Purpose: The paper shows how access to knowledge can be enhanced by using a set of innovative approaches and technologies based on the semantic web.
Design/methodology/approach: Emerging trends in knowledge access are considered, followed by a description of how ontologies and semantics can contribute. A set of tools based on semantic web technology is then presented. For each of these tools a detailed description of the approach is given, together with an analysis of related and future work as appropriate.
Findings: The tools presented are at the prototype stage but can already show how knowledge access can be improved by allowing users to express more precisely what they are looking for and by presenting knowledge to them in a form that is appropriate to their current context.
Research limitations/implications: The tools show promising results in improving access to knowledge, which will be further evaluated within a practical setting. The tools will be integrated and trialled as part of case studies within the SEKT project. This will allow their usability and practical applicability to be measured.
Practical implications: Ontologies as a form of knowledge representation are increasing in importance. Knowledge management, and in particular knowledge access, will benefit from their widespread acceptance. The use of open standards and compatible tools in this area will be important to support interoperability and widespread access to disparate knowledge repositories.
Originality/value: The paper presents research in an emerging but increasingly important field, i.e. semantic web-based knowledge technology.
It describes how this technology can satisfy the demand for improved knowledge access, including providing knowledge delivery to users at the right time and in the correct form.
Keywords: Knowledge management, Worldwide web, Semantics, Search engines
Paper type: Research paper

PAGE 64 | JOURNAL OF KNOWLEDGE MANAGEMENT | VOL. 9 NO , pp , © Emerald Group Publishing Limited, ISSN DOI /

1. Introduction

(Information about the authors can be found at the end of the article.)

This work was supported by the IST Programme of the European Community under SEKT, Semantically Enabled Knowledge Technologies (IST IP) and PASCAL Network of Excellence (IST ); and partially by the Slovenian Research Agency. This publication only reflects the authors' views.

Today, we can observe a number of emerging trends in technologies for intelligent knowledge access, including developments in search engines, categorisation tools and visualisation systems. This paper gives a brief overview of them, describes ongoing efforts to develop semantic web-based knowledge access tools, and discusses how a semantic web-based approach can provide a coherent framework to address many of these emerging requirements.

1.1 Trends in knowledge access

A number of trends can be discerned in the knowledge access marketplace as vendors and users alike start to think "beyond Google". It is instructive to review these briefly, since the work described in the remainder of the paper addresses all of these issues via a single coherent technology framework: the semantic web.

Desktop search: Google is moving to support searching the desktop. This is because Microsoft is moving into search in a big way, and Microsoft currently has the advantage of being well versed in processing desktop formats. So Microsoft is moving into Google's space, and vice versa.
Categorisation: as ranking quality increases, it is likely that the relative differences between search algorithms from different vendors will get smaller. A new differentiator therefore has to be found, and one possibility is organising the results of a search for the user by category (e.g. Verity, clusty.com).
Integrated search: future searches may not be initiated by visiting a webpage separate from your application but rather by, for example, highlighting a chunk of text in a Word document and right-clicking. This is an area Microsoft would hope to dominate by embedding its search capability into Office applications. The advantage is a reduction in the user overhead required to initiate searches.
Seamless search: this involves firing off implicit queries based on user activity (see blinx.com). Less overhead is then required to access information, because you don't have to stop what you're doing. Ideally, this will combine a search of the desktop and the web (and other areas to which the user has access, e.g. networked drives, public folders, etc.).
Personalised search: tweaking the search based on a user's prior searches or a personal profile of some kind. Microsoft has "Stuff I've Seen" (which could be used to derive a profile) and active folder research, which adds to a folder items, such as files and links, that are found (on the desktop or the web) to be relevant to the content of that folder (i.e. expanding it virtually).
Beyond search: some vendors are aiming to add intelligent, sub-document analysis of results (e.g. Corpora's Jump!). The aim is not just to give the user a long list of documents but also to help with the next step, the analysis of the returned information. This allows the user to browse digests of the information based on different topics/categories found in the results list.
A9.com (from Amazon) provides a meta-search over selectable sources where the results are split into categories such as books, movies, images, references, etc. The user's bookmarks can also be searched, forming another category.
Visualisation: visualisation of search results generally means using 2D or 3D representations of the search results and/or the topics that they have been classified against. This can be useful because it allows the user to grasp the results and the categories quickly. The most important topics might be represented by larger icons, drawing the user to them first. Examples of visualisation-based search tools include webbrain.com and kartoo.com.
Device independence: knowledge workers use an increasingly sophisticated and diverse range of devices and expect to be able to access information wherever and whenever they are. As well as PCs (desktops and laptops), mobile phones (including SMS messaging, WAP browsing and use of 3G multimedia capability) and various PDA device types are now commonly used.

1.2 Role of ontologies and semantics

All of the trends identified above can be further enabled or enhanced by the application of semantic technology. As discussed in more detail by Davies and Sure (2005), in this issue, the semantic web (Berners-Lee et al., 2001) provides enhanced information access based on the exploitation of machine-processable metadata. Central to the vision of the semantic web are ontologies. These are seen as facilitating knowledge sharing and re-use between agents, be they human or artificial (Fensel, 2001). They offer this capability by providing a consensual and formal conceptualisation of a given domain. As such, the use of ontologies and supporting tools offers an opportunity to significantly improve knowledge management capabilities on the intranets of organisations and on the wider web.
It is generally accepted that search engines based on conventional IR techniques alone (employing keyword and phrase matching between the query and index) tend to offer high recall and low precision. The user is faced with too many results, many of which are irrelevant. The main reason for this is the failure to handle polysemy (a word that has two or more similar meanings) and synonymy (a word that has the same meaning as another word). The use of ontologies and associated metadata can allow users to express their queries more precisely, thus avoiding the problems identified above. Users can choose ontological concepts to define their query, or select from a set of concepts returned by a search in order to refine their query. This can improve the accuracy of a search. Searching can also be extended, as we will see, by the use of a user profile or the context of a search, making searching personalised and aiding integrated and seamless searching.

Furthermore, the use of semantic technology offers the prospect of a more fundamental change to knowledge access. Current technology supports a process wherein the user attempts to frame an information need by specifying a query in the form of either a set of keywords or a piece of natural language text. It is interesting to note that, despite the benefits claimed by some vendors for natural language querying[1], the average search engine query length on the web in a recent survey was just 2.2 words[2]. Having submitted a query, the user is then presented with a ranked list of documents of relevance to the query. The techniques for ranking documents have been the subject of more than 30 years of research and are well understood, publicly available and differ relatively little in terms of performance, notwithstanding the claims of some search engine vendors. In short, today's search engines are much of a muchness. It is suggested here, therefore, that the future of search engines lies in supporting more of the information management process, as opposed to seeking incremental and modest improvements to relevance ranking of documents.
In this approach, software supports the process of actually reading and analysing relevant documents, rather than merely listing them and leaving the rest of the information analysis task to the user. Corporate knowledge workers need information defined by its meaning, not by text strings ("bags of words"). They also need information relevant to their interests and to their current context. They need to find not just documents, but sections and information entities within documents, and even digests of information created from multiple documents. As described below, the exploitation of metadata and ontological information can offer this information-centric approach, as opposed to the prevailing document-centric technology.

The generation of ontologies and the creation of metadata attributing information to them is obviously key to the success of these advanced knowledge access approaches (see Cunningham and Bontcheva, 2005, in this issue). Techniques allowing (semi-)automatic creation of these are under development. This paper considers automated profile construction, while the related wider problem of knowledge discovery is considered by Grobelnik and Mladenić (2005) in this issue.

1.3 Overview of rest of the paper

The remainder of this paper describes a number of ongoing efforts to develop semantic web-based knowledge access tools, mainly coming from the SEKT project[3]. Our vision is to develop and exploit the knowledge technologies which underlie next generation knowledge management. We envision knowledge workplaces where the boundaries between document management, content management, and knowledge management are broken down, and where knowledge management is an effortless part of day-to-day activities. Appropriate knowledge is automatically delivered to the right people at the right time at the right granularity via a range of user devices.

A number of systems have been developed to help realise this vision. The first of these, described in Section 2, considers how information delivery can be personalised based upon automatically constructed user profiles. The constructed profile is used to enable browsing of the user history in an interest-focused way. The user is able to see which parts of an ontology are related to their current browsing focus and also which recently viewed pages are relevant to a selected concept from the ontology.

The search and browse system in Section 3 provides search agents that use ontology-based queries incorporating named entities (e.g. search for a person named "Nick Kings" in an organisation named "BT"). The agents periodically crawl the web and update the users with new results found.

The next section (Section 4) is concerned with the visualisation of ontologies with a view to supporting browsing. The approach is unique in that it considers visualisation from the point of view of the user rather than focusing on the technical aspects.

The knowledge generation section (Section 5) describes an approach to generating natural language from ontological data. This can provide automated documentation of ontologies and knowledge bases. Unlike human-written texts, the automatic approach keeps the text constantly up to date, which is vitally important in the semantic web context, where knowledge is dynamic and is updated frequently. The natural language generation approach also allows generation in multiple languages without the need for human or automatic translation.

Finally, the paper considers device independence in Section 6, the aim of which is to provide an effective user interface to a web application for devices with widely varying capabilities, without having to write a separate site for each class of devices.

2. Personalisation

With more than 8 billion documents on the web and billions more in corporate and government intranets, personalised information delivery based on user and document profiling is an important step in providing relevant and timely knowledge to the right people, a key concern of knowledge management. User profiles, which aim to model the user's information requirements, are central to personalised information delivery.

2.1 Profile construction

A user profile, which is used as a basis for personalisation, can be constructed manually, semi-automatically or fully automatically. Manual approaches to constructing user profiles usually rely on the user or a domain expert. The profile is provided by a human in the form of rules, filters, scripts, etc. (e.g. filters for sorting incoming e-mails into the user's folders). Automatic and semi-automatic approaches, on the other hand, rely on a system that captures user characteristics by observing the user's behaviour, in some cases requiring feedback or guidance from the user. In this paper we address automatic approaches to user profiling.

User profiles can be automatically constructed from different data sources using a variety of techniques, including content-based user profiling and collaborative user profiling. Content-based user profiling is usually applied to problems involving text documents (i.e. the user is accessing and reading text documents), where content analysis of the document text is performed in order to construct a profile. For instance, content analysis is used to help the user in web browsing by highlighting hyperlinks to documents similar to those already requested (Mladenić, 2002). Collaborative user profiling is based on the assumption that similar users have similar preferences.
In other words, by finding users that are similar to the active user and by examining their preferences, the recommender system can predict the active user's preferences for certain items and provide a ranked list of items which the active user will most probably like. Collaborative user profiling generally ignores the form and the content of the items and can therefore also be applied to non-textual items. Furthermore, it can detect relationships between items that have no content similarities but are linked implicitly through the groups of users accessing them. These groups (communities) are formed around a specific user profile.
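
The collaborative idea described above can be illustrated with a minimal sketch of user-based collaborative filtering. It is not taken from any of the systems discussed in this paper: the cosine similarity measure, the neighbourhood size `k`, the function names and the toy ratings are all assumptions chosen for illustration.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse {item: rating} dicts."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(active, ratings, k=2):
    """Rank items unseen by the active user, using the k most similar users."""
    neighbours = sorted(
        ((cosine(ratings[active], r), u) for u, r in ratings.items() if u != active),
        reverse=True,
    )[:k]
    scores, weights = {}, {}
    for sim, user in neighbours:
        if sim <= 0:
            continue  # ignore users with nothing in common
        for item, value in ratings[user].items():
            if item in ratings[active]:
                continue  # only predict items the active user has not rated
            scores[item] = scores.get(item, 0.0) + sim * value
            weights[item] = weights.get(item, 0.0) + sim
    # Similarity-weighted average of the neighbours' ratings.
    predicted = {item: scores[item] / weights[item] for item in scores}
    return sorted(predicted, key=predicted.get, reverse=True)


# Hypothetical toy data: users rating documents on a 1-5 scale.
ratings = {
    "ann": {"doc1": 5, "doc2": 4},
    "bob": {"doc1": 5, "doc2": 5, "doc3": 4, "doc4": 1},
    "cat": {"doc1": 1, "doc3": 1, "doc4": 5},
}
ranking = recommend("ann", ratings)
```

Note that the item content is never inspected: the ranking depends only on who rated what, which is why the same scheme applies to non-textual items.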

User profiles can be represented in different ways, ranging from simple filters to statistical models (either numerical or symbolic) of considerable complexity. In the case of automatic profile construction, the profile representation mainly depends on the technique used to construct the profile. In this paper we concentrate on profiles represented by ontologies.

2.2 Representing profiles in ontologies

Several researchers have developed approaches to user profiling that represent profiles in some kind of ontology. A topic ontology in the form of a tree-like hierarchy of the user's interests was proposed by Kim and Chan (2003), with the root being the user's general interest (i.e. long-term interest) and the leaves representing domains the user is or was ever interested in (i.e. short-term interests). User interest hierarchies are built using a form of hierarchical clustering on a set of web pages visited by a user. A similar approach was used by Grčar et al. (2005) for enhancing usage of the user's browsing history, as described later in this section.

Another way of constructing a user profile is to analyse the user's browsing history and apply modified collaborative filtering techniques (Sugiyama et al., 2004). Here, the user profile is also a combination of the user's persistent preferences (long-term preferences) and the user's ephemeral preferences (short-term preferences, or "today's" preferences), and is represented as a vector of term weights. Modified collaborative filtering is then applied to a user-term matrix (rather than to a user-item matrix, as in the original collaborative filtering approach, hence the word "modified") to predict the missing term weights in each user profile. Clustering is used (in one of their approaches) to determine user communities. Cluster centroids are compared to the active user's term vector to find the user's neighbourhood (a threshold is used to discard less relevant communities).
This clustering-based approach, according to Sugiyama et al. (2004), achieves the best results.

In the Foxtrot recommender system (Middleton et al., 2003), an ontology based on the CORA digital library is used; new documents are classified into the taxonomy by using a variant of the nearest neighbour algorithm (Mitchell, 1997). A user profile holds a set of topics and their corresponding interest values. Each topic adds 50 per cent of its interest value to its super-class. The authors also used static knowledge ontologies to alleviate the cold-start problem. The visualisation of profiles is used to encourage immediate user feedback. For evaluation, collaborative filtering is performed on a user-topic matrix (they term this technique "collaborative and content-based recommendations").

Recently, Grčar et al. (2005) have proposed user profiling for an interest-focused browsing history. The system provides a dynamic user profile in the form of a topic ontology. After a page is viewed by the user, the textual content is extracted and stored as a text file. A collection of such text files is maintained in two folders. The first folder holds a relatively small number (e.g. five) of the most recently viewed pages (the short-term interest folder). The second folder contains a larger number (e.g. 300) of the last viewed pages (the long-term interest folder). When a page is first visited, it is placed into both folders. Eventually it gets pushed out by other pages that are viewed afterwards. A page stays in the long-term interest folder much longer than in the short-term interest folder (hence the terms long- and short-term), because a much higher number of new pages need to be viewed for the page to be pushed out of the long-term interest folder. The long-term interest pages are treated slightly differently from the short-term interest pages.
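
The two-folder scheme just described behaves like a pair of bounded first-in, first-out queues. The following is a minimal sketch under that assumption; the class name, the in-memory storage (rather than text files on disk) and the small sizes used in the example are illustrative choices, and the sketch does not model how revisited pages are handled.

```python
from collections import deque


class InterestFolders:
    """Two bounded FIFO stores standing in for the short- and long-term folders."""

    def __init__(self, short_size=5, long_size=300):
        # deque(maxlen=n) silently drops the oldest entry once n is exceeded.
        self.short_term = deque(maxlen=short_size)
        self.long_term = deque(maxlen=long_size)

    def view(self, page_text):
        """Record a newly viewed page in both folders."""
        self.short_term.append(page_text)
        self.long_term.append(page_text)


# Tiny sizes so that the push-out behaviour is visible.
folders = InterestFolders(short_size=2, long_size=4)
for page in ["p1", "p2", "p3", "p4", "p5"]:
    folders.view(page)
```

After five page views with these sizes, the short-term folder holds only the last two pages while the long-term folder still holds the last four, which is the asymmetry that gives the folders their names.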
To construct the user profile in the form of a topic ontology, a variant of hierarchical clustering is performed on the long-term interest folder. The root of the topic ontology holds the user's general interest, while the leaves represent his/her specific interests. General interest stands for all the topics the user is or ever was interested in, while the term "specific interest" usually describes one more-or-less isolated topic that is or ever was of interest to the user.

The constructed profile is used to enable browsing of the user history in an interest-focused way, as follows. The recently visited pages (representing the user's short-term interest) are mapped to the user topic ontology. The mapping reveals the extent to which an ontology node from the user profile (i.e. a set of pages) is related to the user's short-term interest. By highlighting nodes with an intensity proportional to the similarity score, we can clearly expose the topic ontology segments that are of current interest to the user. Due to the highlighting, the user can clearly see which parts of the topic ontology are relevant to his/her current interest. He/she can also access previously visited pages by selecting a node in the ontology, which is visualized in the application window. This can be regarded as the user's interest-focused web browsing history, the interest being defined by the selected node.

Grčar et al. (2005) have developed a system using the described approach, where the user profile is visualized on an Internet Explorer toolbar. In addition to having a visual presentation of his/her long-term interests with the parts of current interest highlighted, the user can select a node in the user profile ontology to get a list of the specific keywords and the associated web pages.

To summarise, methods for automatic creation of user profiles and their representation in ontologies are becoming more mature and ready to be applied in a number of personalised applications, such as ontology-based search and browse, which is discussed next.

3. Search and browse

We believe that it is important to develop tools to exploit the dynamic profile information discussed in the previous section, in order to capture a user's information needs. In turn, the tools will relate his or her information needs to the wider community's ontology.
Having derived a profile, the issue is to present relevant information intelligently and to route information between two or more members of the community. By classifying information against an ontology, our goal is to provide facilities that augment each community member's personal memory and enhance recall of information at later points in time.

Ontologies have the potential to underpin and enable efficient searching and large-scale knowledge sharing by:
the identification of communities of interest within wider communities. It is feasible to identify sets of people with common sets of interests, but it is crucial to understand how these groups form and can be maintained; and
using the underlying ontology to identify implicit user needs, and being able to fetch information in advance of a user's explicit query.

One of the proposed toolsets is a platform to support searching and browsing, using semantic agents. The problems of outdated indexing and poor search coverage on the WWW are well known (Lawrence and Giles, 1999). Searching for information is also problematic, as conventional search engines tend to have high recall and low precision[4]. This often results in the user being presented with far too many results in response to their query, many of which are not relevant to the user's information need. There are a number of reasons for this, the foremost being the failure of search engines to cope with the fact that words may have two or more similar meanings and that several terms are used to describe the same concept. By searching documents that have been classified against a domain ontology, the search engine is able to disambiguate the terms of the query and locate information in a more precise manner.

Our search and browse prototype is currently based around a centralised server, running an instantiation of the KIM platform (Popov et al., 2004).
The KIM platform gives a user access to a series of documents that have been annotated against the KIMO ontology (Popov et al., 2004) (see Figure 1). KIM can be regarded as a set of application services that support the automatic semantic annotation, indexing, and retrieval of unstructured and semi-structured content. In essence, the system operates as a typical web crawler, with the added stage of automatically extracting metadata and annotating local copies of retrieved web pages, in a similar fashion to Armadillo (Ciravegna et al., 2004) and h-TechSight (Aldea et al., 2005).

Figure 1 Outline architecture

In addition to the semantic repository being constructed, each user also has a number of "agents": these agents regularly search the local semantic index to notify the users that new pieces of information have been found. The agents perform queries against named entities rather than the simple string matching used by, for example, Google.

As each document is stored within KIM, named entities are identified and extracted. New statements are added to the semantic repository as a result of the information extraction process. KIM is able to store explicit and implicit statements. Explicit statements concern both recognised entities and simple relations, such as a position held within an organisation or an organisation located in a location. Additionally, implicit statements are inferred according to the inherent transitivity of some properties within KIMO (if an entity is of type X, and X is a subclass of Y, then it is also of type Y). For example, a web page may contain text about "Nick Kings", which is identified as a named entity of type "Man"; KIM also infers, through KIMO, that "Nick Kings" is also of type "Person". Furthermore, the KIMO ontology contains a number of custom axioms used to yield more implicit statements, for properties such as subregionof. For example, KIMO can store "Munich is in Germany" but also infer that "Munich is in Europe", because Germany is in, or more formally is a subregionof, Europe. KIM does not itself build direct associations between web pages, but will allow subsequent queries of the form "find all documents about Nick Kings, where Nick Kings is a Person".
Inference is carried out as pages are stored, rather than at the time when queries are carried out, in order to improve performance for end users.

3.1 User agents

Each user is able to create a number of semantic queries, each termed here an "agent", and these queries are carried out on a regular basis. Thus, each agent searches for documents that contain entities that match the user's long-term interests. The user is currently able to express searches to find documents that contain information about the following:
a named person holding a particular position within a certain organisation;
a named organisation located at a particular location;
a particular person;
a named location; and
a named company, active in a particular industry sector.
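
The combination of store-time inference and regularly repeated entity queries can be sketched as follows. This is only an illustration of the idea, not KIM's API: the class and function names, the two-entry ontology fragments and the document names are hypothetical, and real repositories apply far richer rule sets.

```python
# Hypothetical ontology fragments, reduced to single-parent links.
SUBCLASS_OF = {"Man": "Person"}
SUBREGION_OF = {"Munich": "Germany", "Germany": "Europe"}


def closure(start, parent_of):
    """Return start plus everything reachable by following parent links."""
    seen = []
    node = start
    while node is not None and node not in seen:
        seen.append(node)
        node = parent_of.get(node)
    return seen


class SemanticIndex:
    """Stores (document, entity, type) statements, inferring types on storage."""

    def __init__(self):
        self.statements = set()

    def store(self, doc, entity, entity_type):
        # Implicit statements are materialised here, not at query time.
        for inferred in closure(entity_type, SUBCLASS_OF):
            self.statements.add((doc, entity, inferred))

    def query(self, entity, entity_type):
        return sorted(
            d for d, e, t in self.statements if e == entity and t == entity_type
        )


class Agent:
    """A stored semantic query that reports only documents not yet reported."""

    def __init__(self, entity, entity_type):
        self.entity, self.entity_type = entity, entity_type
        self.reported = set()

    def run(self, index):
        new = [d for d in index.query(self.entity, self.entity_type)
               if d not in self.reported]
        self.reported.update(new)
        return new


index = SemanticIndex()
index.store("page1.html", "Nick Kings", "Man")
agent = Agent("Nick Kings", "Person")
first_run = agent.run(index)    # page1.html matches via the inferred type Person
second_run = agent.run(index)   # nothing new since the last run
index.store("page2.html", "Nick Kings", "Man")
third_run = agent.run(index)    # only the newly stored page is reported
regions = closure("Munich", SUBREGION_OF)
```

Because the inferred statements are written when a page is stored, each agent run reduces to a simple lookup, which reflects the performance argument made above.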

The documents matched by an agent must, of course, have been previously fetched and annotated by the web crawling part of the system.

3.2 Future developments

The focus of the development is to provide a user with regular content delivery, based on the carrying out of semantic queries, rather than to develop improved web crawling techniques. As this is the first-stage prototype, the agents have been quite naive. It is envisaged that the next stages of development will increase the level of sophistication in the following ways.

More sophisticated feedback and learning for the agents. Agent systems, such as ProSearch, have already been used to incrementally build complex queries to represent a user's long-term interests (see Davies et al., 1998 and Davies, 2000 for more details). For the next stage of the SEKTagent, usage feedback will be collected as a user reads documents located by the agent. From this usage information, complex semantic queries would be created to represent searches such as "find information about Biotech Ltd, in Germany, but not containing information about the CEO".

Ability to update and detect changes to content that has already been added to the semantic repository. Keeping usage information is central to another strand of development. The information contained on the web is not static, with page contents changing on a daily basis. As the agent's role is to represent a user's interests, the agent should be able to identify when, and if, a page has changed and whether the user should be notified about the page contents. Even if a user has already seen a document, the agent must be able to decide whether the page has changed sufficiently to notify the user again.

Incorporate background knowledge ontology and user profiles. We are developing an ontology, called PROTON[5], to represent the classes and relationships required to model a knowledge sharing community. Currently, the SEKTagent stores the queries within a simple database.
The enhanced SEKTagent will represent the user and the user's interests through the Profile class within PROTON. It is envisaged that by representing interests in this fashion, further inferences can be made about documents that a user will find useful. For example, a user would be able to state interests in topics, such as metallurgy, rather than having to express interests in terms of complex search strings.

4. Visualisation

In addition to the ontology-based knowledge access to unstructured content discussed above, new methods are needed to provide user-friendly visualisation of the ontology itself. This needs to be intuitive, personalisable, and abstracted away from the formal representation of the knowledge as concepts, properties, and axioms. A number of relevant approaches are presented next.

4.1 Existing approaches for visualisation

The survey of semantic web visualisation tools by Sevilla et al. (2004) lists current tools that allow the direct translation of ontologies into browsable formats. Many of those tools exactly reproduce the ontology structure in a visual formalism, without taking into account external constraints such as usability issues (many ontology concepts have not been designed for visualisation purposes) and browsing issues (some concepts, especially those representing relations, introduce tedious browsing paths), or the possibility of applying user-defined rules for personalisation or even interaction (for instance, this kind of application cannot visualise the differences between instances of the same concept or detect separate instance groups). Some of the applications of this kind are: Spectacle developed by Aduna (Harmelen et al., 2001), Jambalaya[6], IsaViz[7], Ontorama (Eklund et al., 2002) and RDFSVisualizer[8].

The following are the requirements for the visualisation tool development that we have identified as important:

the existence of user profiles for the visualisation of the ontology. The existing approaches visualise the ontology knowledge as it is stored, without filtering it or transforming it for the different types of users (sometimes the ontology is modelled in ways that should not be presented to non-advanced users, and only the instances of a few classes or the aggregation of instances of several classes have to be presented);
the specificities of visualising knowledge and its hierarchical structure, the explicit relationships between them, etc.; and
easy configuration of how to present different types of instances.

4.2 Description of the approach

The experience obtained from developing several ontology-based applications showed us that the knowledge base as modelled by domain experts and knowledge engineers is not always a good candidate for visualisation as is. Since many relations in the domain were modelled as explicit concepts, navigation became tedious and unfriendly. The main purpose of building ontologies is to provide semantic content for intelligent systems. The knowledge models are designed to offer the appropriate information to be exploited by the software. No visualisation criteria are used to build an ontology, and often the information is not suitable to be published as it is, for example:
concepts may have too many attributes;
when relations are represented as independent concepts (first-class objects) the navigation becomes tedious; and
concepts to be shown do not always correspond to modelled ones.

Therefore we felt a need for explicit visualisation rules that allow the creation of views on the domain ontology, in order to visualize only the relevant information in a user-friendly way and to filter according to user profiles. We introduced the concept of a visualisation ontology, which makes all visualisation rules explicit and allows easy interface management.
This ontology will contain concepts and instances (publication entities) as seen on the interface by the end user, and it will retrieve the attribute values from the domain ontology using a query. It does not duplicate the content of the original ontology, but links the content to publication entities using an ontology query language such as RDQL[9] or SeRQL[10]. In this way, one ontology that represents a particular domain can be visualised through different views, and contains the necessary information about how to interact with those contents. Figure 2 shows how this approach decouples the publication of the ontology in any kind of visualisation model (be it a set of HTML pages or a 3D model) from the knowledge contained in the domain ontology.

The visualisation ontology has the following predefined concepts:
PublicationEntity: a concept that encapsulates objects as they will be published in the final application. Any concept defined in the visualisation ontology will inherit from it.
PublicationSlot: each attribute that is going to appear in the final application should inherit from this concept.
PublicationInfo: this concept allows the mappings between the visualisation ontology and the domain ontology to be defined, specifying how each component of the visualisation ontology will be visualised, what its behaviour will be, etc., and also which user profile the information is intended for.

The future of search engines lies in supporting more of the information management process.
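As a rough illustration only, the three predefined concepts can be mirrored as plain data classes. The Python sketch below is not the implementation described in this paper: the class fields, the `run_query` callable, the profile names, the placeholder query strings, and the `fake_store` data are all assumptions made for the sake of the example. It shows the central idea that a publication entity holds no domain data of its own and instead pulls each slot value from the domain ontology through the query stored in the slot's PublicationInfo.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PublicationInfo:
    """Mapping and presentation details (field names are illustrative)."""
    query: str                 # query text sent to the domain ontology
    context: str = "GRAPH"     # presentation context
    profiles: tuple = ("expert", "non-expert")  # hypothetical user profiles


@dataclass
class PublicationSlot:
    """One attribute that appears in the final application."""
    name: str
    info: PublicationInfo


@dataclass
class PublicationEntity:
    """An object as published to the end user; stores no domain data itself."""
    name: str
    slots: List[PublicationSlot] = field(default_factory=list)

    def render(self, resource: str, profile: str,
               run_query: Callable[[str, str], list]) -> Dict[str, list]:
        """Fetch the value of every slot visible to `profile` via its query."""
        view = {}
        for slot in self.slots:
            if profile in slot.info.profiles:
                view[slot.name] = run_query(slot.info.query, resource)
        return view


def fake_store(query: str, resource: str) -> list:
    """Stand-in for a real query engine over the domain ontology."""
    data = {
        ("name-query", "author1"): ["A. Author"],
        ("works-query", "author1"): ["Book One", "Book Two"],
    }
    return data.get((query, resource), [])


author = PublicationEntity("Author", [
    PublicationSlot("Name", PublicationInfo("name-query")),
    PublicationSlot("Publications",
                    PublicationInfo("works-query", profiles=("expert",))),
])

expert_view = author.render("author1", "expert", fake_store)
novice_view = author.render("author1", "non-expert", fake_store)
print(expert_view)
print(novice_view)
```

Because the view is computed from queries at render time, the same domain ontology can back several views, and restricting a slot to certain profiles filters what each type of user sees.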

Figure 2 Decoupled publishing of domain ontologies using visualisation ontologies

With this visualisation ontology we must specify the navigation philosophy, the visualisation aspect (2D, 3D, shapes, etc.), and the interaction with the visualised objects. All these features will depend on the user profile, on the amount of knowledge actually stored in the domain ontology, on the type of knowledge being visualised, on the use by other applications, etc. With respect to the different aspects that can be used according to the type of knowledge presented to the user, Figure 3 shows two of the possible graphical presentations: graphs and art galleries.

An example of this approach is the visualisation of the information related to an Author ("Persona"), their publications, or their work at different periods of time. All of this information is distributed in the domain ontology across different concepts that represent the publications and organisations, and across other concepts that express binary relations between concepts. In order to visualise all information related to an Author, we can define a PublicationEntity (see Figure 4). Figure 4 also shows all the PublicationSlots (attributes of Author) that we want to visualise in the final application.

Figure 3 Different visualisation contexts according to the type of knowledge presented

Figure 4 Example of PublicationEntity

Figure 5 shows some examples of PublicationSlots: the author's name, his/her photograph ("Foto"), and where he/she studied ("Estudio En"). Those attributes can come from several concepts that represent binary relations with specific properties, so it is also necessary to define mappings (written in a specific query language) to the domain ontology; an example of this is the PublicationSlot "Publications" ("Obras Publicadas"). Therefore, each PublicationEntity and PublicationSlot has an associated PublicationInfo, where the user profile, the information about behaviour and geometry, and the mapping to the domain ontology are defined. An example of the mapping for the Author's publications ("Obras Publicadas") is:

    SELECT DISTINCT resource, obra, r_obra
    FROM {r} <rdf:type> {<K:Relacion Creacion>};
         [<K:agente_responsable> {resource}];
         [<K:creacion_relacionada> {r_obra} <K:referencia> {obra}]
    WHERE resource like #?resource#
    USING NAMESPACE K = <!...>

Figure 6 shows some examples of the results in the final application: an author and the author's publications in a 3D graph, taking into account the information provided in the visualisation ontology. Finally, several PublicationInfo instances can be associated with a PublicationEntity or PublicationSlot, each defined for a different Context (Contexts are classes that define the scene and contain specific interaction and behaviours). For instance, we can associate one PublicationInfo with the GRAPH context and another with the HALL-LIBRARY context (as shown in Figure 7 for the same information shown in the previous figures). This is useful for customising the visualisation for different types of users with different skills (expert, non-expert, etc.).

Figure 5 Examples of PublicationSlot

Figure 6 Visualisation of an author (left) and of the author and its publications (right) using the GRAPH context and the cube geometry

Figure 7 Example of HALL-LIBRARY context, showing the same publications as those in Figure 6, visualised as books in a library

5. Knowledge generation

Natural language generation (NLG) takes structured data in a knowledge base as input and produces natural language text, tailored to the presentational context and the target reader (Reiter and Dale, 2000). NLG techniques use and build models of the context and the user and use them to select appropriate presentation strategies, e.g. to deliver short summaries to the user's WAP phone or a longer multimodal text if the user is using their desktop.

In the context of the semantic web or knowledge management, NLG is required to provide automated documentation of ontologies and knowledge bases. Unlike human-written texts, an automatic approach will constantly keep the text up-to-date, which is vitally important in the semantic web context, where knowledge is dynamic and is updated frequently. The NLG approach also allows generation in multiple languages without the need for human or automatic translation (see Aguado et al., 1998).

Generation of natural language text from ontologies is an important problem, firstly because textual documentation is more readable than the corresponding formal notations and thus helps users who are not knowledge engineers to understand and use ontologies. Secondly, a number of applications have now started using ontologies to encode knowledge and reason with it internally, but this formal knowledge also needs to be expressed in natural language in order to produce reports, letters, etc. In other words, NLG can be used to present structured information in a user-friendly way.

There are several advantages to using NLG rather than fixed templates where the query results are filled in:
NLG can use different sentence structures depending on the number of query results, e.g. conjunction versus itemised list;
depending on the user's profile of their interests, NLG can include different types of information: affiliations, addresses, publication lists, indications on collaborations (derived from project information); and
given this variety of what information from the ontology can be included and how it can be presented, depending on its type and amount, writing templates will be unfeasible because there will be too many combinations to be covered.
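The first of these points can be made concrete with a small sketch. The Python function below is purely illustrative: the function name, the threshold of three items, the wording, and the sample data are assumptions and are not taken from any system described here. It chooses between a single conjoined sentence and an itemised list depending on how many query results there are.

```python
def verbalise_results(subject: str, relation: str, values: list,
                      max_inline: int = 3) -> str:
    """Pick a sentence structure based on the number of query results."""
    if not values:
        return f"{subject} {relation} nothing on record."
    if len(values) == 1:
        return f"{subject} {relation} {values[0]}."
    if len(values) <= max_inline:
        # Short result sets read best as one conjoined sentence.
        joined = ", ".join(values[:-1]) + " and " + values[-1]
        return f"{subject} {relation} {joined}."
    # Longer result sets are presented as an itemised list.
    items = "\n".join(f"  * {v}" for v in values)
    return f"{subject} {relation}:\n{items}"


short = verbalise_results("John Smith", "works on", ["SEKT", "PASCAL"])
long = verbalise_results("John Smith", "has written",
                         ["Paper A", "Paper B", "Paper C", "Paper D"])
print(short)
print(long)
```

A fixed template would need a separate variant for each of these cases, and the number of variants multiplies once the choice of content also depends on the user profile.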
This variation comes from the fact that each user of the system has a profile comprising user-supplied (or system-derived) personal information (name, contact details, experience, projects worked on), plus information derived semi-automatically from the user's interaction with other applications. Therefore, there will be a need to tailor the generated presentations according to the user's profile.

NLG systems that are specifically targeted towards semantic web ontologies have started to emerge only recently. For example, there are some general-purpose ontology verbalisers for RDF and DAML+OIL (Wilcock and Jokinen, 2003) and OWL (Wilcock, 2003). They are based on templates and follow the ontology constructs closely, e.g. "This is a description of John Smith identified by His given name is John..." (Wilcock, 2003). The advantage of Wilcock's approach (Wilcock and Jokinen, 2003; Wilcock, 2003) is that it is fully automatic and does not require a lexicon. A more recent system which generates reports from RDF and DAML ontologies is MIAKT (Bontcheva and Wilks, 2004). In contrast to Wilcock's approach, MIAKT requires some manual input (lexicons and domain schemas), but on the other hand it generates more fluent reports, oriented towards end-users rather than ontology builders. It also uses reasoning and the property hierarchy to avoid repetitions, enable more generic text schemas, and perform aggregation.

Our work extends the MIAKT approach towards making it less domain dependent and easier to configure by non-NLG experts. A novel dimension is the focus on tailoring the summary formatting and length according to a device profile (e.g. mobile phone, web browser). Another innovative idea is the use of ontology mapping for summary generation from different ontologies.

Summary generation in our system (called ONTOSUM) starts off by being given a set of statements (i.e. triples) in the form of RDF/OWL.
Since there is some repetition, these triples are first pre-processed to remove facts that have already been stated. In addition to triples that have the same property and arguments, the system also removes triples involving inverse properties with the same arguments as those of an already verbalised one. The information about inverse properties is provided by the ontology (if supported by the representation formalism). An example summary is shown in Figure 8.

The lexicalisations of concepts and properties in the ontology can be specified by the ontology engineer, be taken to be the same as the concept names themselves, or be added manually as part of the customisation process. For instance, the AKT ontology[11] provides label statements for some of its concepts and instances, which are found and imported into the lexicon automatically. ONTOSUM is parameterised at run time by specifying which properties are to be used for building the lexicon.

Summary structuring is done using discourse/text schemas (Reiter and Dale, 2000), which are script-like structures that represent discourse patterns. They can be applied recursively to generate coherent multisentential text. In more concrete terms, when given a set of statements about a given concept/instance, discourse schemas are used to impose an order on them, such that the resulting summary is coherent. For the purposes of our system, a coherent summary is a summary where similar statements are grouped together.

The schemas are independent of the concrete domain and rely only on a core set of four basic properties: active-action, passive-action, attribute, and part-whole. When a new ontology is connected to ONTOSUM, properties can be defined as sub-properties of one of these four generic ones, and ONTOSUM will then be able to verbalise them without any modifications to the discourse schemas. However, if more specialised treatment of some properties is required, it is possible to enhance the schema library with new patterns that apply only to a specific property.
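The pre-processing and ordering steps described above can be sketched as follows. This is a minimal Python illustration rather than ONTOSUM's actual code: the triple representation, the `INVERSE_OF` and `GENERIC_PARENT` tables, the property names, and the fixed `SCHEMA_ORDER` are all hypothetical. The first function drops exact repeats and statements that merely restate an earlier one through an inverse property; the second groups what remains by the generic property each statement specialises.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, property, object)

# Hypothetical ontology information, invented for this example.
INVERSE_OF: Dict[str, str] = {"has-author": "author-of",
                              "author-of": "has-author"}
GENERIC_PARENT: Dict[str, str] = {
    "author-of": "active-action",
    "has-author": "passive-action",
    "has-name": "attribute",
    "member-of": "part-whole",
}
SCHEMA_ORDER: List[str] = ["attribute", "part-whole",
                           "active-action", "passive-action"]


def remove_repeats(triples: List[Triple]) -> List[Triple]:
    """Drop exact duplicates and inverse restatements of earlier triples."""
    seen = set()
    kept = []
    for subj, prop, obj in triples:
        inverse = (obj, INVERSE_OF.get(prop), subj)
        if (subj, prop, obj) in seen or inverse in seen:
            continue
        seen.add((subj, prop, obj))
        kept.append((subj, prop, obj))
    return kept


def order_by_schema(triples: List[Triple]) -> List[Triple]:
    """Group similar statements by generic property (stable sort)."""
    def rank(t: Triple) -> int:
        return SCHEMA_ORDER.index(GENERIC_PARENT[t[1]])
    return sorted(triples, key=rank)


facts = [
    ("smith", "author-of", "paper1"),
    ("paper1", "has-author", "smith"),   # inverse of the first statement
    ("smith", "has-name", "John Smith"),
    ("smith", "author-of", "paper1"),    # exact repeat
    ("smith", "member-of", "group1"),
]
plan = order_by_schema(remove_repeats(facts))
print(plan)
```

Mapping each ontology property to one of the four generic ones is what keeps the ordering logic independent of the domain: connecting a new ontology only means extending the lookup table, not the schema code.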
Next, ONTOSUM performs semantic aggregation, i.e. it joins RDF statements with the same property name and domain as one conceptual graph. Without this aggregation step, there will be three separate sentences instead of one bullet list (see Figure 8), resulting in a less coherent text.

Finally, ONTOSUM verbalises the statements using the HYLITE+ surface realiser. The output is a textual summary. The overall system architecture is shown in Figure 9 and further details can be found in Bontcheva (2005).

An innovative aspect of ONTOSUM, in comparison to previous NLG systems for the semantic web, is that it implements tailoring/personalisation based on information from the user's device profile. More specifically, we developed methods for generating summaries within a given length restriction (e.g. 160 characters for mobile phones) and in different formats: HTML for browsers and plain text for e-mails and mobile phones (Bontcheva, 2005). The following section discusses a complementary approach to device-independent knowledge access, and future work will focus on combining the two.

Another novel feature of ONTOSUM is its use of ontology mapping rules (de Bruijn et al., 2004) to enable users to run the system on new ontologies without any customisation effort.

Figure 8 Example of a generated summary

Figure 9 Knowledge generation architecture

6. Device independence

An increasingly important and frequent requirement for knowledge access tools is that they are accessible via any web-enabled device. This includes PCs, PDAs, mobile phones, and speech processing devices. In order to meet this objective we have developed a device independent web application framework (DIWAF), written as a Java servlet, which has been designed to support the creation of device-independent web sites.

The aim of device independence is to provide an effective user interface to a web application for devices with widely varying capabilities, without having to write a separate site for each class of device. The problem can be broken down into the following steps:
identify the capabilities of the current device, taking user preferences into account;
select suitable content; and
adapt the content to the target device.

In the DIWAF prototype, device characteristics are handled using the CC/PP standard, and content is adapted to its target device by user-defined templates. Content selection may be carried out either by the application or by placing conditions on templates. These processes are described in more detail in the following sections.

6.1 Identifying device capabilities

There is no universally accepted and supported method of communicating device requirements to the server. However, the most promising standard is Composite Capability/Preference Profiles (CC/PP)[12], a W3C recommendation. In this standard, the sending device extends each HTTP request with a reference to a default device profile and, optionally, a set of over-rides, or profile-diffs.
The default profile reference takes the form of the URL of an RDF document describing the device. In the current implementation of the DIWAF, CC/PP profile information is handled by a standard open source Java implementation produced by the Java Community Process, led by Sun Microsystems. Aspects of the DELI system (Butler, 2002) are used to handle default behaviour if the request contains no CC/PP header. The profile information is made available to the servlet as a collection of attributes, such as screen size, browser name, etc. These attributes can be used to inform the subsequent selection and adaptation of content.

However, CC/PP by itself is not sufficient to meet all the requirements. For example, one requirement is to push information to users when new documents in their domain of interest become available, using e-mail, SMS or WAP push technology. Since these messages are not a response to an HTTP request, CC/PP cannot be used, and the server must maintain a user profile describing the devices held by each registered user.

We envision knowledge workplaces where the boundaries between document management, content management and knowledge management are broken down.

6.2 Selecting suitable content

Because different devices have different capabilities, it is often necessary to select different content. For example, images should not be sent to a device that cannot display them; short or abbreviated descriptions may be best for small-screen devices, with a fuller description available for larger screens; and so on. In many cases the DIWAF software can do this selection automatically or semi-automatically, as described below. In other cases, more sophisticated techniques might be appropriate. For example, the output of a natural language processing engine might be adapted according to the required text length. In these cases, the DIWAF engine must pass profile information on to the client software generating the data.

It is worth noting that device profiles sometimes need to be interpreted to provide an adequate description. For example, the number of characters that can fit on one line depends on both the physical screen size and the font size chosen by the user. It is useful to allow author-defined extensions to the profile based on combinations of standard attributes, known as capability classes.
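A capability class of this kind can be sketched in a few lines. The Python below is an illustration built on assumptions, not DIWAF code: the attribute names (`screen_width_px`, `font_width_px`, `images`), the 40-character threshold, and the content variants are all invented. It merges a default device profile with per-request over-rides, derives the number of characters per line from two standard attributes, and uses the resulting classes to select content.

```python
def effective_profile(default: dict, overrides: dict) -> dict:
    """Apply per-request over-rides on top of the default device profile."""
    merged = dict(default)
    merged.update(overrides)
    return merged


def chars_per_line(profile: dict) -> int:
    """Derived attribute: depends on screen width and chosen font size."""
    return profile["screen_width_px"] // profile["font_width_px"]


# Author-defined capability classes built from standard attributes.
CAPABILITY_CLASSES = {
    "narrow": lambda p: chars_per_line(p) < 40,
    "graphical": lambda p: p.get("images", False),
}


def classes_for(profile: dict) -> set:
    return {name for name, test in CAPABILITY_CLASSES.items() if test(profile)}


def select_content(item: dict, profile: dict) -> dict:
    """Choose the description length and drop images the device cannot show."""
    classes = classes_for(profile)
    chosen = {"text": item["short"] if "narrow" in classes else item["full"]}
    if "graphical" in classes and "image" in item:
        chosen["image"] = item["image"]
    return chosen


phone_default = {"screen_width_px": 128, "font_width_px": 6, "images": False}
pc_default = {"screen_width_px": 1024, "font_width_px": 8, "images": True}
item = {"short": "New report on ontologies",
        "full": "A new report on ontology-based knowledge access is available.",
        "image": "cover.png"}

phone = select_content(item, effective_profile(phone_default, {}))
pc = select_content(item, effective_profile(pc_default, {}))
# A user who enlarges the font on a PC can fall into the "narrow" class.
big_font = select_content(item, effective_profile(pc_default,
                                                   {"font_width_px": 32}))
print(phone, pc, big_font)
```

Expressing the selection rule against a named class rather than against raw attributes means the rule keeps working when a user preference, such as a larger font, changes what the device can effectively display.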
6.3 Adapting the content

The hardest problem in achieving device independence is adapting the selected content to the current device. The output must be in a suitable language, and must specify the required geometrical layout and style. In principle, artificial intelligence techniques could be used to understand the information to be presented and construct a suitable representation of it on the fly, possibly under the guidance of style rules. However, this is still a matter for research, and simpler techniques are used for the current prototype.

One thrust of research is to add metadata to the content describing its structure. For example, the metadata may be used to label headings and subheadings, input controls, and blocks of text. Different software drivers can then use this information to create an effective layout. HTML is a good example of this approach, and it has proved successful for many years. Technology such as CSS Media Queries (which allows selection of different elements based on CC/PP characteristics) and new small-screen rendering technology, together with extensions to HTML such as XForms, promise to push the boundaries further. The main objection to this approach is that, in order to make use of the full capabilities of each device, different devices need to be given different metadata. A page that has been carefully constructed to look effective on a PC will not necessarily be a good starting point for speech generation.
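The metadata-based approach can be illustrated with a toy example. In the Python sketch below, content is stored once as a list of labelled blocks, and two hypothetical drivers turn it into HTML or into wrapped plain text for a small screen. The labels, the driver names, and the 30-character wrap width are assumptions made for illustration; the sketch does not describe the DIWAF template mechanism.

```python
import html
import textwrap

# Content described once, with structural labels as metadata.
CONTENT = [
    ("heading", "Knowledge access"),
    ("text", "Ontologies help users say precisely what they are looking for."),
]


def html_driver(blocks) -> str:
    """Render labelled blocks for a desktop browser."""
    tags = {"heading": "h1", "text": "p"}
    return "".join(f"<{tags[kind]}>{html.escape(body)}</{tags[kind]}>"
                   for kind, body in blocks)


def plain_text_driver(blocks, width: int = 30) -> str:
    """Render the same blocks as wrapped plain text for a small screen."""
    lines = []
    for kind, body in blocks:
        if kind == "heading":
            lines.append(body.upper())
        else:
            lines.extend(textwrap.wrap(body, width=width))
    return "\n".join(lines)


DRIVERS = {"browser": html_driver, "phone": plain_text_driver}


def adapt(blocks, device: str) -> str:
    """Pick the driver that matches the target device."""
    return DRIVERS[device](blocks)


page = adapt(CONTENT, "browser")
message = adapt(CONTENT, "phone")
print(page)
print(message)
```

The sketch also hints at the objection raised above: both drivers share one set of labels, so a device with very different capabilities, such as a speech interface, may need richer or different metadata than these two labels provide.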


More information

XML ALONE IS NOT SUFFICIENT FOR EFFECTIVE WEBEDI

XML ALONE IS NOT SUFFICIENT FOR EFFECTIVE WEBEDI Chapter 18 XML ALONE IS NOT SUFFICIENT FOR EFFECTIVE WEBEDI Fábio Ghignatti Beckenkamp and Wolfgang Pree Abstract: Key words: WebEDI relies on the Internet infrastructure for exchanging documents among

More information

Comp 336/436 - Markup Languages. Fall Semester Week 2. Dr Nick Hayward

Comp 336/436 - Markup Languages. Fall Semester Week 2. Dr Nick Hayward Comp 336/436 - Markup Languages Fall Semester 2017 - Week 2 Dr Nick Hayward Digitisation - textual considerations comparable concerns with music in textual digitisation density of data is still a concern

More information

A tutorial report for SENG Agent Based Software Engineering. Course Instructor: Dr. Behrouz H. Far. XML Tutorial.

A tutorial report for SENG Agent Based Software Engineering. Course Instructor: Dr. Behrouz H. Far. XML Tutorial. A tutorial report for SENG 609.22 Agent Based Software Engineering Course Instructor: Dr. Behrouz H. Far XML Tutorial Yanan Zhang Department of Electrical and Computer Engineering University of Calgary

More information

Extracting knowledge from Ontology using Jena for Semantic Web

Extracting knowledge from Ontology using Jena for Semantic Web Extracting knowledge from Ontology using Jena for Semantic Web Ayesha Ameen I.T Department Deccan College of Engineering and Technology Hyderabad A.P, India ameenayesha@gmail.com Khaleel Ur Rahman Khan

More information

C1 CMS User Guide Orckestra, Europe Nygårdsvej 16 DK-2100 Copenhagen Phone

C1 CMS User Guide Orckestra, Europe Nygårdsvej 16 DK-2100 Copenhagen Phone 2017-02-13 Orckestra, Europe Nygårdsvej 16 DK-2100 Copenhagen Phone +45 3915 7600 www.orckestra.com Content 1 INTRODUCTION... 4 1.1 Page-based systems versus item-based systems 4 1.2 Browser support 5

More information

Introduction to Semantic Web

Introduction to Semantic Web ه عا ی Semantic Web Introduction to Semantic Web Morteza Amini Sharif University of Technology Fall 95-96 Outline Thinking and Intelligent Applications The World Wide Web History The Problem with the Web

More information

An Approach To Web Content Mining

An Approach To Web Content Mining An Approach To Web Content Mining Nita Patil, Chhaya Das, Shreya Patanakar, Kshitija Pol Department of Computer Engg. Datta Meghe College of Engineering, Airoli, Navi Mumbai Abstract-With the research

More information

Information Discovery, Extraction and Integration for the Hidden Web

Information Discovery, Extraction and Integration for the Hidden Web Information Discovery, Extraction and Integration for the Hidden Web Jiying Wang Department of Computer Science University of Science and Technology Clear Water Bay, Kowloon Hong Kong cswangjy@cs.ust.hk

More information

Purpose, features and functionality

Purpose, features and functionality Topic 6 Purpose, features and functionality In this topic you will look at the purpose, features, functionality and range of users that use information systems. You will learn the importance of being able

More information

Next-Generation Standards Management with IHS Engineering Workbench

Next-Generation Standards Management with IHS Engineering Workbench ENGINEERING & PRODUCT DESIGN Next-Generation Standards Management with IHS Engineering Workbench The addition of standards management capabilities in IHS Engineering Workbench provides IHS Standards Expert

More information

Generalized Document Data Model for Integrating Autonomous Applications

Generalized Document Data Model for Integrating Autonomous Applications 6 th International Conference on Applied Informatics Eger, Hungary, January 27 31, 2004. Generalized Document Data Model for Integrating Autonomous Applications Zsolt Hernáth, Zoltán Vincellér Abstract

More information

Support notes (Issue 1) September 2018

Support notes (Issue 1) September 2018 Support notes (Issue 1) September 2018 Pearson Edexcel Level 2 Certificate/Diploma in Digital Applications (DA202) Unit 2: Creative Multimedia ONCE UPON A TIME Key points for this Summative Project Brief

More information

Just-In-Time Hypermedia

Just-In-Time Hypermedia A Journal of Software Engineering and Applications, 2013, 6, 32-36 doi:10.4236/jsea.2013.65b007 Published Online May 2013 (http://www.scirp.org/journal/jsea) Zong Chen 1, Li Zhang 2 1 School of Computer

More information

Parmenides. Semi-automatic. Ontology. construction and maintenance. Ontology. Document convertor/basic processing. Linguistic. Background knowledge

Parmenides. Semi-automatic. Ontology. construction and maintenance. Ontology. Document convertor/basic processing. Linguistic. Background knowledge Discover hidden information from your texts! Information overload is a well known issue in the knowledge industry. At the same time most of this information becomes available in natural language which

More information

4D CONSTRUCTION SEQUENCE PLANNING NEW PROCESS AND DATA MODEL

4D CONSTRUCTION SEQUENCE PLANNING NEW PROCESS AND DATA MODEL 4D CONSTRUCTION SEQUENCE PLANNING NEW PROCESS AND DATA MODEL Jan Tulke 1, Jochen Hanff 2 1 Bauhaus-University Weimar, Dept. Informatics in Construction, Germany 2 HOCHTIEF ViCon GmbH,Essen, Germany ABSTRACT:

More information

Development of an Ontology-Based Portal for Digital Archive Services

Development of an Ontology-Based Portal for Digital Archive Services Development of an Ontology-Based Portal for Digital Archive Services Ching-Long Yeh Department of Computer Science and Engineering Tatung University 40 Chungshan N. Rd. 3rd Sec. Taipei, 104, Taiwan chingyeh@cse.ttu.edu.tw

More information

Lecture Notes on Programming Languages

Lecture Notes on Programming Languages Lecture Notes on Programming Languages 85 Lecture 09: Support for Object-Oriented Programming This lecture discusses how programming languages support object-oriented programming. Topics to be covered

More information

NOTES ON OBJECT-ORIENTED MODELING AND DESIGN

NOTES ON OBJECT-ORIENTED MODELING AND DESIGN NOTES ON OBJECT-ORIENTED MODELING AND DESIGN Stephen W. Clyde Brigham Young University Provo, UT 86402 Abstract: A review of the Object Modeling Technique (OMT) is presented. OMT is an object-oriented

More information

Mymory: Enhancing a Semantic Wiki with Context Annotations

Mymory: Enhancing a Semantic Wiki with Context Annotations Mymory: Enhancing a Semantic Wiki with Context Annotations Malte Kiesel, Sven Schwarz, Ludger van Elst, and Georg Buscher Knowledge Management Department German Research Center for Artificial Intelligence

More information

Semantic Annotation, Search and Analysis

Semantic Annotation, Search and Analysis Semantic Annotation, Search and Analysis Borislav Popov, Ontotext Ontology A machine readable conceptual model a common vocabulary for sharing information machine-interpretable definitions of concepts in

More information

Chapter 7 Semantically Enhanced Search and Browse

Chapter 7 Semantically Enhanced Search and Browse Chapter 7 Semantically Enhanced Search and Browse Alistair Duke and Jörg Heizmann Abstract Squirrel, a search and browse tool that provides access to semantically annotated data is described. The tool

More information

Definition of Information Systems

Definition of Information Systems Information Systems Modeling To provide a foundation for the discussions throughout this book, this chapter begins by defining what is actually meant by the term information system. The focus is on model-driven

More information

Device Independent Principles for Adapted Content Delivery

Device Independent Principles for Adapted Content Delivery Device Independent Principles for Adapted Content Delivery Tayeb Lemlouma 1 and Nabil Layaïda 2 OPERA Project Zirst 655 Avenue de l Europe - 38330 Montbonnot, Saint Martin, France Tel: +33 4 7661 5281

More information

Lecture Telecooperation. D. Fensel Leopold-Franzens- Universität Innsbruck

Lecture Telecooperation. D. Fensel Leopold-Franzens- Universität Innsbruck Lecture Telecooperation D. Fensel Leopold-Franzens- Universität Innsbruck First Lecture: Introduction: Semantic Web & Ontology Introduction Semantic Web and Ontology Part I Introduction into the subject

More information

CTI Higher Certificate in Information Systems (Internet Development)

CTI Higher Certificate in Information Systems (Internet Development) CTI Higher Certificate in Information Systems (Internet Development) Module Descriptions 2015 1 Higher Certificate in Information Systems (Internet Development) (1 year full-time, 2½ years part-time) Computer

More information

Semantic Web Lecture Part 1. Prof. Do van Thanh

Semantic Web Lecture Part 1. Prof. Do van Thanh Semantic Web Lecture Part 1 Prof. Do van Thanh Overview of the lecture Part 1 Why Semantic Web? Part 2 Semantic Web components: XML - XML Schema Part 3 - Semantic Web components: RDF RDF Schema Part 4

More information

CONTEXT-SENSITIVE VISUAL RESOURCE BROWSER

CONTEXT-SENSITIVE VISUAL RESOURCE BROWSER CONTEXT-SENSITIVE VISUAL RESOURCE BROWSER Oleksiy Khriyenko Industrial Ontologies Group, Agora Center, University of Jyväskylä P.O. Box 35(Agora), FIN-40014 Jyväskylä, Finland ABSTRACT Now, when human

More information

Semantic Clickstream Mining

Semantic Clickstream Mining Semantic Clickstream Mining Mehrdad Jalali 1, and Norwati Mustapha 2 1 Department of Software Engineering, Mashhad Branch, Islamic Azad University, Mashhad, Iran 2 Department of Computer Science, Universiti

More information

D WSMO Data Grounding Component

D WSMO Data Grounding Component Project Number: 215219 Project Acronym: SOA4All Project Title: Instrument: Thematic Priority: Service Oriented Architectures for All Integrated Project Information and Communication Technologies Activity

More information

Integration of distributed data sources for mobile services

Integration of distributed data sources for mobile services Integration of distributed data sources for mobile services Gianpietro Ammendola, Alessandro Andreadis, Giuliano Benelli, Giovanni Giambene Dipartimento di Ingegneria dell Informazione, Università di Siena

More information

Chapter 6: Information Retrieval and Web Search. An introduction

Chapter 6: Information Retrieval and Web Search. An introduction Chapter 6: Information Retrieval and Web Search An introduction Introduction n Text mining refers to data mining using text documents as data. n Most text mining tasks use Information Retrieval (IR) methods

More information

6.871 Expert System: WDS Web Design Assistant System

6.871 Expert System: WDS Web Design Assistant System 6.871 Expert System: WDS Web Design Assistant System Timur Tokmouline May 11, 2005 1 Introduction Today, despite the emergence of WYSIWYG software, web design is a difficult and a necessary component of

More information

INTELLIGENT SYSTEMS OVER THE INTERNET

INTELLIGENT SYSTEMS OVER THE INTERNET INTELLIGENT SYSTEMS OVER THE INTERNET Web-Based Intelligent Systems Intelligent systems use a Web-based architecture and friendly user interface Web-based intelligent systems: Use the Web as a platform

More information

All Adobe Digital Design Vocabulary Absolute Div Tag Allows you to place any page element exactly where you want it Absolute Link Includes the

All Adobe Digital Design Vocabulary Absolute Div Tag Allows you to place any page element exactly where you want it Absolute Link Includes the All Adobe Digital Design Vocabulary Absolute Div Tag Allows you to place any page element exactly where you want it Absolute Link Includes the complete URL of the linked document, including the domain

More information

Describing Computer Languages

Describing Computer Languages Markus Scheidgen Describing Computer Languages Meta-languages to describe languages, and meta-tools to automatically create language tools Doctoral Thesis August 10, 2008 Humboldt-Universität zu Berlin

More information

IRS-III: A Platform and Infrastructure for Creating WSMO-based Semantic Web Services

IRS-III: A Platform and Infrastructure for Creating WSMO-based Semantic Web Services IRS-III: A Platform and Infrastructure for Creating WSMO-based Semantic Web Services John Domingue, Liliana Cabral, Farshad Hakimpour, Denilson Sell, and Enrico Motta Knowledge Media Institute, The Open

More information

WEB PAGE RE-RANKING TECHNIQUE IN SEARCH ENGINE

WEB PAGE RE-RANKING TECHNIQUE IN SEARCH ENGINE WEB PAGE RE-RANKING TECHNIQUE IN SEARCH ENGINE Ms.S.Muthukakshmi 1, R. Surya 2, M. Umira Taj 3 Assistant Professor, Department of Information Technology, Sri Krishna College of Technology, Kovaipudur,

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 4, Jul-Aug 2015

International Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 4, Jul-Aug 2015 RESEARCH ARTICLE OPEN ACCESS Multi-Lingual Ontology Server (MOS) For Discovering Web Services Abdelrahman Abbas Ibrahim [1], Dr. Nael Salman [2] Department of Software Engineering [1] Sudan University

More information

Understanding the workplace of the future. Artificial Intelligence series

Understanding the workplace of the future. Artificial Intelligence series Understanding the workplace of the future Artificial Intelligence series Konica Minolta Inc. 02 Cognitive Hub and the Semantic Platform Within today s digital workplace, there is a growing need for different

More information

SOME TYPES AND USES OF DATA MODELS

SOME TYPES AND USES OF DATA MODELS 3 SOME TYPES AND USES OF DATA MODELS CHAPTER OUTLINE 3.1 Different Types of Data Models 23 3.1.1 Physical Data Model 24 3.1.2 Logical Data Model 24 3.1.3 Conceptual Data Model 25 3.1.4 Canonical Data Model

More information

CTI Short Learning Programme in Internet Development Specialist

CTI Short Learning Programme in Internet Development Specialist CTI Short Learning Programme in Internet Development Specialist Module Descriptions 2015 1 Short Learning Programme in Internet Development Specialist (10 months full-time, 25 months part-time) Computer

More information

Information Retrieval and Knowledge Organisation

Information Retrieval and Knowledge Organisation Information Retrieval and Knowledge Organisation Knut Hinkelmann Content Information Retrieval Indexing (string search and computer-linguistic aproach) Classical Information Retrieval: Boolean, vector

More information

SEMANTIC SUPPORT FOR MEDICAL IMAGE SEARCH AND RETRIEVAL

SEMANTIC SUPPORT FOR MEDICAL IMAGE SEARCH AND RETRIEVAL SEMANTIC SUPPORT FOR MEDICAL IMAGE SEARCH AND RETRIEVAL Wang Wei, Payam M. Barnaghi School of Computer Science and Information Technology The University of Nottingham Malaysia Campus {Kcy3ww, payam.barnaghi}@nottingham.edu.my

More information

WPS Workbench. user guide. "To help guide you through using the WPS user interface (Workbench) to create, edit and run programs"

WPS Workbench. user guide. To help guide you through using the WPS user interface (Workbench) to create, edit and run programs WPS Workbench user guide "To help guide you through using the WPS user interface (Workbench) to create, edit and run programs" Version: 3.1.7 Copyright 2002-2018 World Programming Limited www.worldprogramming.com

More information

AN SEO GUIDE FOR SALONS

AN SEO GUIDE FOR SALONS AN SEO GUIDE FOR SALONS AN SEO GUIDE FOR SALONS Set Up Time 2/5 The basics of SEO are quick and easy to implement. Management Time 3/5 You ll need a continued commitment to make SEO work for you. WHAT

More information

Information mining and information retrieval : methods and applications

Information mining and information retrieval : methods and applications Information mining and information retrieval : methods and applications J. Mothe, C. Chrisment Institut de Recherche en Informatique de Toulouse Université Paul Sabatier, 118 Route de Narbonne, 31062 Toulouse

More information

Ontology based Model and Procedure Creation for Topic Analysis in Chinese Language

Ontology based Model and Procedure Creation for Topic Analysis in Chinese Language Ontology based Model and Procedure Creation for Topic Analysis in Chinese Language Dong Han and Kilian Stoffel Information Management Institute, University of Neuchâtel Pierre-à-Mazel 7, CH-2000 Neuchâtel,

More information

CSCU9T4: Managing Information

CSCU9T4: Managing Information CSCU9T4: Managing Information CSCU9T4 Spring 2016 1 The Module Module co-ordinator: Dr Gabriela Ochoa Lectures by: Prof Leslie Smith (l.s.smith@cs.stir.ac.uk) and Dr Nadarajen Veerapen (nve@cs.stir.ac.uk)

More information

Mining Web Data. Lijun Zhang

Mining Web Data. Lijun Zhang Mining Web Data Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Web Crawling and Resource Discovery Search Engine Indexing and Query Processing Ranking Algorithms Recommender Systems

More information

These are all examples of relatively simple databases. All of the information is textual or referential.

These are all examples of relatively simple databases. All of the information is textual or referential. 1.1. Introduction Databases are pervasive in modern society. So many of our actions and attributes are logged and stored in organised information repositories, or Databases. 1.1.01. Databases Where do

More information

Mendeley Help Guide. What is Mendeley? Mendeley is freemium software which is available

Mendeley Help Guide. What is Mendeley? Mendeley is freemium software which is available Mendeley Help Guide What is Mendeley? Mendeley is freemium software which is available Getting Started across a number of different platforms. You can run The first thing you ll need to do is to Mendeley

More information

Overview of Web Mining Techniques and its Application towards Web

Overview of Web Mining Techniques and its Application towards Web Overview of Web Mining Techniques and its Application towards Web *Prof.Pooja Mehta Abstract The World Wide Web (WWW) acts as an interactive and popular way to transfer information. Due to the enormous

More information

Joint Entity Resolution

Joint Entity Resolution Joint Entity Resolution Steven Euijong Whang, Hector Garcia-Molina Computer Science Department, Stanford University 353 Serra Mall, Stanford, CA 94305, USA {swhang, hector}@cs.stanford.edu No Institute

More information

JISC PALS2 PROJECT: ONIX FOR LICENSING TERMS PHASE 2 (OLT2)

JISC PALS2 PROJECT: ONIX FOR LICENSING TERMS PHASE 2 (OLT2) JISC PALS2 PROJECT: ONIX FOR LICENSING TERMS PHASE 2 (OLT2) Functional requirements and design specification for an ONIX-PL license expression drafting system 1. Introduction This document specifies a

More information

Chapter 4. Fundamental Concepts and Models

Chapter 4. Fundamental Concepts and Models Chapter 4. Fundamental Concepts and Models 4.1 Roles and Boundaries 4.2 Cloud Characteristics 4.3 Cloud Delivery Models 4.4 Cloud Deployment Models The upcoming sections cover introductory topic areas

More information