Multilingual Extensions to DIENST

Sarantos Kapidakis*, Iakovos Mavroidis†, Hariklia Tsalapata‡

April 19, 1999

Abstract

Digital libraries enable on-line access to information and provide advanced methods for searching, retrieving, and presenting material. To support collections of documents written in several languages, and to increase the applicability of digital libraries in non-English-speaking countries, a multilingual digital library design is needed that supports the native languages of its users. Issues that must be taken into account in a multilingual design include limitations on the concurrent use of more than one character set and the availability (or lack) of metadata in languages other than English. Furthermore, the desired display language of each piece of information depends on the languages that each individual user can understand, the languages in which the documents and their metadata are available, and the locally available resources (fonts). DIENST¹ is a digital library search tool developed at Cornell University. This report describes our work on extending DIENST to support multilingual search and presentation of documents.

1 Introduction

The importance of digital libraries becomes more obvious every day. Digital libraries enable on-line access to information and provide advanced methods for searching, retrieving, and presenting material. They support collections of digital data distributed all over the world. Users can access the data they are interested in from their own computers and study digital copies of documents without ever having to visit the library building itself. A digital library may contain objects in several formats, including text, images, sound, and video.

There are many reasons why digital libraries should support several languages. Perhaps the most obvious is the desire to store a document in the author's native language, which in most cases is not English.
Another reason is that multilinguality permits non-English-speaking users to locate and retrieve documents in a language that they can understand.

This report describes our work on multilinguality in digital libraries. We focused on DIENST, a distributed digital library search tool developed at Cornell University, which we extended with multilinguality features. Our work covered the design of the multilingual data storage and management and of the multilingual client interface. This multilingual version of DIENST supports queries and repository browsing in several languages. Moreover, this version can be easily extended to accommodate additional languages. The multilingual design presented is reusable: most of its components do not depend on DIENST and can be applied to other similar environments. The technique indicates how such interfaces can be built while avoiding extensive changes to the digital library server code.

The organization of this report is as follows. Section 2 briefly describes DIENST functionality by presenting the steps followed to process a query. Section 3 presents the limitations related to multilinguality posed by the current HTML standards and some solutions that have been recently developed. Section 4 describes the multilinguality features we added to DIENST. Section 5 gives implementation details and discusses how the DIENST code was modified to support the new features. Section 6 provides configuration information for the support of additional languages in the multilingual version of DIENST. Section 7 concludes the report and provides future directions.

* Institute of Computer Science, Foundation for Research and Technology - Hellas, P.O. Box 1385, Heraklion 71110, Greece. sarantos@csi.forth.gr (contact person)
† Institute of Computer Science, Foundation for Research and Technology - Hellas, Heraklion 71110, Greece. jacob@csi.forth.gr
‡ Institute of Computer Science, Foundation for Research and Technology - Hellas, Heraklion 71110, Greece. htsalapa@csi.forth.gr
¹ Multilingual and other versions maintained in the Institute of Computer Science, Greece, can be found at ~ dienst

2 A brief description of DIENST functionality

DIENST is a client-server application. Users query the DIENST server for information through a web browser. The web browser is also used for the presentation of query results.

Each object of the digital library must be "registered" with the DIENST server. Through registration the DIENST server is made aware of the object's existence and can access it on demand. Each registered object may be available in several digital formats (e.g. ps, pdl, ascii) and has metadata associated with it. Metadata stands for "data about data". In the DIENST environment it includes information about the document, such as its title, author, and abstract.

A user can browse through a sorted list of descriptions of all objects registered with a server. In addition, the user can query DIENST on the metadata of the stored objects. DIENST locates the relevant objects and provides the user with a list of references to them.
An object reference includes metadata information about the object and links through which the object can be accessed in the available digital formats.

3 Limitations and Solutions to Multilinguality in Web interfaces

The 8-bit encoding used for the internal representation of characters allows only 256 characters to be represented at a time. Since the number of characters in the various languages is far greater than 256, characters have been split into 8-bit "character sets". A character set includes the characters of one or more languages, symbols, and control characters. More recently, Unicode and its UTF8 encoding have been introduced. Unicode assigns a number to the characters of almost every language in a unified character set, and UTF8 represents each such number as a sequence of one or more bytes.

Furthermore, to display an HTML page in a specific language through a web browser, the page must be encoded with the corresponding character set and the language encoding of the web browser must be set appropriately. Character encodings therefore play an important role in the support of multilingual HTML interfaces.

3.1 ISO restrictions

ISO-8859 [2] is a standardized series of 8-bit character sets. US-ASCII is a 7-bit character set, which maps the numbers from 0 to 127 to the characters of the English alphabet, symbols, and control characters. ISO-8859 is an extension of US-ASCII. In ISO-8859 the mapping of numbers 0 to 127 to characters is identical to the one used in US-ASCII. Numbers 128 to 159 are mapped to control characters. ISO-8859 is partitioned into 10 different character set standards, namely ISO-8859-1 to ISO-8859-10, which together cover many languages. Each ISO-8859 standard defines a different mapping for numbers greater than 159 and includes mappings for the special characters (not included in US-ASCII) of one or more languages. For example, ISO-8859-1 covers most western European languages, such as Albanian, Catalan, Danish, Dutch, English, Faeroese, Finnish, French, and German, since it includes mappings for the special characters of these languages.

Unfortunately, with current HTML specifications and browser technologies, characters from different ISO character sets cannot be displayed in the same HTML page. This is a major problem for multilingual interfaces that are based on ISO-8859. The ISO-8859 character sets support only a limited number of languages each, and an HTML page can only display languages that are supported by the same character set.

Another problem with ISO-8859 is that some languages, such as Chinese, which have more than 256 characters, cannot be displayed in an HTML page. Of course, most languages do not fall in this category.

One more problem arises with user input to an HTML form: when the form is encoded in ISO-8859, the user can only enter input in the specific character set. The user query cannot include characters that are not supported by the same ISO standard.

3.2 UTF8 as a recent solution

A recent solution to the problem of using different character sets in the same HTML page is the development of unicode [4], whose goal is to unify the ISO character sets into a single encoding. Unicode includes mappings for characters of all languages. HTML browsers that support unicode (e.g. Netscape's version 4) are able to display text in any language. Many Java interfaces use unicode to solve the problem of multilinguality in HTML pages.
However, Java introduces unnecessary complexity to the majority of simple interfaces.

A distributed digital library tool must support the display of text written in several languages in a single HTML page. Thus, part of our work was the development of a unicode-based digital library interface, without the use of Java, which we applied to DIENST. Specifically, the new version of DIENST supports the UTF8 character encoding, which is one of the encodings defined for the unicode standard. In UTF8 a character is represented by a sequence of one or more bytes rather than by a single byte, so a single HTML page can contain far more than the 256 different characters supported by an ISO-8859 character set.

The UTF8 representation of a character is derived from a "unicode" number, which corresponds to the character in question. Thus, one must know the "unicode" number of a character to find its UTF8 representation. For example, the "unicode" number of the Greek capital letter alpha is the hexadecimal number 0x391 (usually written as U+0391). This number is used to find the 2-byte UTF8 representation of the letter. The "unicode" number of each character used in the interface of DIENST is specified in the DIENST source code (see Section 5.7).

Since many browsers do not support the UTF8 standard, we maintained support for the ISO standard. The user can select the standard (ISO or UTF8) in which HTML pages will be displayed.

4 The multilingual version of DIENST

In the version of DIENST we developed, multilinguality is employed in each step of the process of retrieving [3] a document. The user interface was also modified to accommodate the new features. The flexibility introduced by multilinguality raises some design issues, which are discussed below.

Figure 1: UTF8 page encoding permits the coexistence of many different languages

4.1 Multilingual additions to DIENST

Three steps are involved in processing a user query: the user enters the query by specifying a set of keywords through an appropriately designed HTML query form, the metadata are searched for matches to the specified keywords, and finally references to matched objects are presented to the user. In the multilingual version of DIENST each of these steps may be associated with several languages, instead of just English. For example, the user query may be entered in French, the relevant object metadata may be available in French and German, and the user may select German as the language of display of the results. In an effort to be less restrictive, we did not require that all these steps be associated with the same language. Instead, the language used in a certain step is decoupled from the language used in the previous step.

The following features have been added to DIENST to enable multilingual queries:

• FORM-QUERY LANGUAGE: In the multilingual version of DIENST the user is able to select the language in which the HTML query form will be displayed. In most cases this will be the user's native language rather than English, which is the only language supported by previous versions of DIENST. The same language will be used for the keywords entered into the form.

• METADATA LANGUAGE: The metadata consists of several fields. The values of these fields may be available in several languages. It is not practical to require that the values of all fields be available in the same set of languages, and we do not pose such a restriction. Thus, the values of metadata fields may be available in different sets of languages. For example, the Description field of the metadata used to describe a non-English technical report may also be available in English, while the Title field may be available only in the author's native language.

• RESULT LANGUAGE: The user is able to specify the language to be used for the display of results. This language may be different from the language used in the HTML query form. For example, a user may want to see the DIENST query pages and enter a query in his native language but have the results of his query displayed in English, since the terminology used in the specific area of his search may be more clearly expressed in English (as may be the case with Computer Science terminology).

DIENST retrieves the metadata of query hits in all available languages. This ensures that users receive all the information needed to switch from one result language to another, and that the switch can be performed efficiently without having to contact the DIENST server again.

An important aspect of this new version of DIENST is its interoperability with older non-multilingual versions. As several DIENST servers may cooperate in a distributed environment, compatibility of the new design with older versions of the software is desirable; in large-scale installations we cannot assume that all nodes use an identical version of the software. The multilingual version of DIENST overcomes backwards compatibility issues in an elegant way (see Section 5.2).
Finally, the code is easily extensible to accommodate additional languages (currently, the languages supported by the multilingual version of DIENST are English, Greek, and Italian). The addition of new character sets is discussed later in this report.

4.2 The new user interface

The web user interface must be augmented to support the desired languages. One approach would be to introduce a separate interface for each language, supporting the same functionality but different character sets. This approach would not allow the user to switch to a different language after beginning the query process, for example to review query results in a different language when a document is not available in a specific language. In multilingual DIENST the user is able to change the display language and page encoding at any time.

Initially, we added two selection menus to every HTML page of the DIENST interface so that the user could select the FORM-QUERY and RESULT languages at any time. The selected language is used for every subsequent HTML page, until a new language is selected. Since most users did not make use of the possibility to change the RESULT language, this feature was removed from the default interface in order to simplify it. In this simpler interface the RESULT language is considered to be the same as the FORM-QUERY language. Thus, only one selection menu exists in the HTML pages (see Figure 1).

Another selection menu was added to the starting HTML page of the DIENST interface for selecting the page encoding (ISO or UTF8) in which every subsequent HTML page will be displayed. Figure 2 displays the starting HTML page of DIENST with the two new selection menus.

The multilingual interface is used for both searching and browsing. In older versions of DIENST the user browses by author name by selecting a range of letters from the English alphabet. The server responds with all documents for which the author name starts with a letter from the selected range. In the multilingual version, all alphabets are displayed in the starting HTML page, and the user may select a range from any alphabet.

The browser must support the character encoding (UTF8 or an ISO character set) that the user has selected. However, the user does not have to change the page encoding of the browser "manually" when he changes the FORM-QUERY language. This is done automatically by the interface.

Figure 2: In the multilingual interface, the user can select the display language and the page encoding (ISO or UTF8)

4.3 A problem

As mentioned above, the multilingual version of DIENST allows the translation of metadata fields to several languages. For flexibility reasons, however, it does not require that all fields be available in the same set of languages. This raises the following issue: an object may match the search criteria while only a subset of the corresponding metadata to be displayed to the user is available in the RESULT language. In this case, the first available translation is selected for the display of the fields that are not available in the RESULT language.

Another issue arises from the fact that HTML forms encoded in ISO support only one character set: a character set contains characters from a limited number of languages, which prevents the concurrent display of metadata fields in languages not supported by the specific set. The arbitrary choice of the first available translation as the display language for metadata fields that are not available in the RESULT language may result in a language that is not supported by the ISO character set selected by the user. This issue has not been addressed.

5 Implementation details

In this section we discuss some of the technical details concerning the implementation of multilinguality in DIENST.

5.1 Added code and variables

In order to support multilinguality, new variables and source code have been added to the original version of DIENST. Most of the added code is in the new file Kernel/language.pl, which includes routines and definitions for the support of multilingual search, browsing, and presentation of results. It is written in PERL, like the rest of the DIENST code. Variable %Engl2Lang, defined in this file, holds the phrases used in the HTML interface of DIENST, along with their translations to the supported languages. Each entry of this hash variable consists of an English phrase and a comma-separated list of its translations.

The following new variables are defined in file Config/const.config:

• %language: List of the languages supported by DIENST. The order in which they appear in this variable determines the order of languages in every multilingual string (see Section 5.2).
• %charset: The character sets corresponding to the supported languages.
• %setrange: List of the sets of identical characters for each supported language (see Section 5.3).
• %capitals: The capital letters of each supported language.
• %futf8 ISO g: The "unicode" numbers corresponding to the characters of the ISO character set.

5.2 Multilingual strings

In the new version of DIENST the metadata and the HTML forms may be translated and stored in several languages.
Different storage structures may be used for this purpose, each with its own difficulties, problems, and restrictions. A significant factor in our design is that we are not building a new system but rather modifying a working, single-language system. For this reason, we prefer approaches that require the fewest possible code modifications. Two approaches have been considered for the storage of multilingual fields:

• Introduce new variables for each metadata field, one for each language, to hold the multilingual information. This is a clean solution, as all data and their translations are clearly distinct. It is appropriate for the design of a new system but would require many modifications to the existing version of DIENST.

• Use a single variable to hold all translations of a metadata field. This implies encoding all translations in a single string that is easy to split. We call such a string a multilingual string.

We selected the second option because of its many advantages, especially for string manipulation and interoperability with older versions. We first give a more formal description of the multilingual string and then discuss its advantages.

A multilingual string consists of all translations of a string separated by a delimiter, which is a character (or string) that does not appear in any translation. The default delimiter is the '#' character, as denoted by the $lang_RS variable defined in file Config/const.config. The order of translations in a multilingual string must be the one defined in variable %language of the same file. As an example, if the order is English, German, and Italian, then the multilingual string of the word title is title#titel#titolo. Most phrases appearing in the DIENST web interface and metadata are held in multilingual strings. The FORM-QUERY and RESULT languages determine which translation of a multilingual string is used for the display of query forms and query results.

The multilingual string definition is flexible: a multilingual string need not include translations to all of the supported languages. Missing translations are denoted by an empty substring. For example, if the order of translations in a multilingual string is English, German, and Italian and the German translation is missing, then the multilingual string of the word author is author##autore. If the translation to the currently active language is missing, then the first translation present in the multilingual string is used for the display of the field.

It must be noted that the different translations of a multilingual string are indexed according to different "rules" when the database is built, because they are encoded by different character sets.
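The splitting and fallback rules described above can be sketched in a few lines. This is only an illustration in Python, not the PERL code of DIENST: the names `split_multilingual` and `pick_translation` are hypothetical, and the language order is the English, German, Italian order used in the examples of this section.

```python
# Illustrative sketch only; function and constant names are hypothetical.
DELIMITER = "#"                                # default delimiter named in the text
LANGUAGES = ["English", "German", "Italian"]   # order used in the examples above


def split_multilingual(value, languages=LANGUAGES, delimiter=DELIMITER):
    """Map each language to its translation; a missing translation is ''."""
    parts = value.split(delimiter)
    # Pad on the right so that omitted trailing translations count as missing.
    parts += [""] * (len(languages) - len(parts))
    return dict(zip(languages, parts))


def pick_translation(value, active_language, languages=LANGUAGES,
                     delimiter=DELIMITER):
    """Return the translation for the active language, or the first one present."""
    translations = split_multilingual(value, languages, delimiter)
    chosen = translations.get(active_language, "")
    if chosen:
        return chosen
    for language in languages:                 # fall back in the configured order
        if translations[language]:
            return translations[language]
    return ""


print(pick_translation("title#titel#titolo", "German"))
print(pick_translation("author##autore", "German"))
```

With the two example strings from the text, the first call selects the German translation and the second falls back to the English one, because the German substring is empty.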
Because translations are identified by their position, the order of translations in a multilingual string is important both for the display of the string and for indexing purposes.

By exploiting the fact that many languages are encoded through the same character set, we can make the multilingual string more flexible. Any string can be considered a multilingual string whose text is available in a single language and for which delimiters are not provided. This does not cause problems in indexing or querying the database. The only available translation is used independently of the language selected by the user at any given time.

The above definition and use of the multilingual string has many benefits:

• It is simple to isolate the desired translation (substring).

• If the desired translation is missing, a translation to a different language can easily be used, based on preconfigured priorities.

• The entire multilingual string can be sent to the client and split into the individual translations locally at the client site. Thus, the user can switch to a different RESULT language without having to contact the DIENST server again.

• Splitting the multilingual string into its individual translations and using all translations when constructing the indices to the metadata fields produces an all-language common index. Through this index the user can query the metadata fields without explicitly specifying the language of the keywords.

• It is easy to configure a running system for the support of new languages (see Section 6).

• Interoperability of multilingual DIENST with older versions is ensured. Two issues must be considered regarding interoperability with older versions:

  - How does the original user interface behave with data managed by multilingual servers? The multilingual servers send multilingual strings in place of simple strings of information. As the original DIENST interface is not aware of multilingual strings, the whole string is displayed. Thus, users simply see a more verbose listing containing all available translations (although some of them will be displayed using the wrong character set and font), separated by the delimiter specified in $lang_RS.

  - How does the multilingual user interface behave with data from the original, single-language servers? Data from the original, single-language servers are treated as data for which only the translation to the first language is available (trailing delimiters are omitted, which is acceptable). The only available translation is used for the display of query forms and results in all supported languages.

Multilingual servers are compatible even if they do not support the same set of languages, as long as the common languages are defined in the same order. Thus, the original and the multilingual versions of DIENST can easily cooperate, in a way that is mostly transparent to the users.

5.3 Definition of DIENST identical characters

The multilingual version of DIENST partitions character sets into sets of identical characters. Identical characters are those that produce the same results when used in a query. For example, the characters 'a' and 'A' are considered to be identical; the query "find documents of author Antoniadis" is equivalent to the query "find documents of author antoniadis". This partitioning of characters is used by DIENST when creating the indices of the database. The sets of identical characters for the ISO-8859 character sets currently in use are defined near the end of file Config/const.config. Definitions of additional identical character sets can be specified in variable %setrange.
5.4 Modifications to the database structure

5.4.1 Bib files

The DIENST metadata are stored in specially formatted files whose names have a ".bib" ending. In the following we refer to those files as "bib" files. The format of a bib file is simple: it includes metadata fields that describe the object (e.g. DATE, TITLE, AUTHOR, ABSTRACT) and their corresponding values.

In the multilingual version of DIENST, bib files are converted to multilingual bib files, where metadata fields are multilingual strings. Since our implementation concentrates on digital libraries for technical reports, we use the following metadata fields: AUTHOR, DATE, ABSTRACT, and TITLE. All of these fields except DATE are stored as multilingual strings. DATE is stored as a regular string in the bib file; however, it is treated as a multilingual string when it is displayed in an HTML page. Multilinguality for the DATE field is achieved by maintaining translations of the month names in file Kernel/language.pl.

5.4.2 Building of inverted indices

The identical string of a string is generated by replacing each character in the string with the first character in its set of identical characters. For example, the identical string of author name "FaHndRich" is "fahndrich".
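As a rough illustration of this definition, the sketch below builds identical strings in Python. The sets of identical characters are an assumption made for the example (each lowercase English letter paired with its capital, the lowercase letter listed first); the actual sets in %setrange are not reproduced here, and the function names are hypothetical.

```python
import string

# Assumed sets of identical characters: one set per English letter, with the
# lowercase letter first so that it acts as the representative of the set.
IDENTICAL_SETS = [
    lower + upper
    for lower, upper in zip(string.ascii_lowercase, string.ascii_uppercase)
]


def build_representatives(identical_sets):
    """Map every character to the first character of its set."""
    table = {}
    for chars in identical_sets:
        for ch in chars:
            table[ch] = chars[0]
    return table


REPRESENTATIVE = build_representatives(IDENTICAL_SETS)


def identical_string(text, table=REPRESENTATIVE):
    """Replace each character with the representative of its set."""
    # Characters that belong to no set are kept unchanged.
    return "".join(table.get(ch, ch) for ch in text)


print(identical_string("FaHndRich"))
```

Under these assumed sets, "Antoniadis" and "antoniadis" map to the same identical string, which is what makes the two queries of Section 5.3 equivalent.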

DIENST uses prebuilt inverted indices to efficiently search the distributed database. The creation of inverted indices is largely the same as in older versions, but a few modifications to the indexing process were required. Specifically, certain preprocessing steps have been introduced: the multilingual string is split into substrings corresponding to the individual translations, each translation is transformed into its identical string, and every transformed translation is indexed.

5.5 Modifications to the basic DIENST operations

5.5.1 Search

The fields of a query (or the query itself in "simple search") are considered to be in the current FORM-QUERY language. This language is used to create the identical string of the query, which is then used to search the database using the prebuilt indices in a manner similar to that of older versions.

5.5.2 Browse

DIENST supports database browsing by author and by year. Browsing by author can be performed according to the letters of any of the supported languages. All translations of the multilingual string AUTHOR are tested in each bib file. This implies that browsing is not affected by the order of translations in the string. Furthermore, browsing is not affected by the FORM-QUERY language that has been selected for the HTML page. Browsing by year is performed as in older versions.

5.6 Modifications to the web pages

The following strings of an HTML page are considered to be multilingual: standard text, results of a query, dates, names of collections, and the end-of-page signature. Where the individual translations are stored depends on the usage of the string. Specifically, the translations of standard text and dates are defined in variable %Engl2Lang in file Kernel/language.pl. The names of the collections and the end-of-page signature are in multilingual string format. The translations of the names of collections can also be specified in the %Engl2Lang variable.
Finally, metadata of matched documents and their translations are stored in the bib files.

The original DIENST web pages have been adapted for the support of multilinguality. The new pages use JavaScript to implement selection menus, buttons, and form submission.

5.7 UTF8 encoding support

The multilingual version of DIENST supports UTF8 encoding of HTML pages. To support UTF8 encoding, DIENST must be aware of the "unicode" numbers of the characters to be displayed. These numbers are used to find the UTF8 representation of the characters in question. The characters of every ISO-8859 character set also correspond to "unicode" numbers. The "unicode" numbers of the characters of a supported ISO-8859 character set are declared in a %futf8 iso g variable in file Config/const.config. The "unicode" numbers of the characters of ISO-8859-1 are the same as the ISO representation of these characters. For this reason they are not declared in file Config/const.config. To add the "unicode" numbers of an additional ISO character set to the Config/const.config file, one must add a variable %futf8 iso g with the corresponding "unicode" numbers.
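The conversion from an ISO-encoded byte to its UTF8 representation can be sketched as follows. This is a Python illustration under stated assumptions, not the DIENST code: the table `ISO_TO_UNICODE` is a hypothetical three-entry excerpt for the Greek set ISO-8859-7, the function names are invented for the example, and only the one-byte and two-byte UTF8 forms are handled.

```python
# Hypothetical excerpt of a byte -> "unicode" number table, assuming the
# Greek character set ISO-8859-7 (0xC1 is capital alpha, U+0391).
ISO_TO_UNICODE = {0xC1: 0x0391, 0xC2: 0x0392, 0xE1: 0x03B1}


def utf8_bytes(code_point):
    """Encode a "unicode" number below 0x800 as one or two UTF8 bytes."""
    if code_point < 0x80:
        return bytes([code_point])             # same single byte as US-ASCII
    if code_point < 0x800:
        return bytes([0xC0 | (code_point >> 6),      # lead byte: top 5 bits
                      0x80 | (code_point & 0x3F)])   # continuation: low 6 bits
    raise ValueError("this sketch only covers numbers below 0x800")


def iso_to_utf8(data, table=ISO_TO_UNICODE):
    """Re-encode ISO-encoded bytes as UTF8 using the lookup table."""
    out = bytearray()
    for byte in data:
        # Bytes missing from the table keep their own value as "unicode"
        # number, which is the rule that holds for ISO-8859-1.
        out += utf8_bytes(table.get(byte, byte))
    return bytes(out)


print(utf8_bytes(0x391).hex())
```

For the capital alpha example of Section 3.2, the number 0x391 yields the two bytes 0xCE 0x91. A complete table would need one entry for every character above 127 in the character set.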

11 We have added this new feature only in the interface of DIENST. UTF8 encoding could also be applied to the metadata of the DIENST library. However, this was not implemented because of interoperability issues with older non-multilingual versions. Older versions of DIENST would not be aware that the metadata are UTF8 encoded. Thus they would be unable to query a library supported by the new multilingual version and retrieve a useful and readable result. Also, the new version of DIENST would not understand the metadata of older libraries encoded in ISO standard, since it would assume that they are encoded in UTF8. 6 Adding a new language Multilingual DIENST has been designed to be easily extensible to support a new language. As an example, the Greek and Italian languages have been added to this version. Use the following steps to add a language to the multiligual version of DIENST: Edit the Cong/const.cong le and append to the %charset, and %capitals, the new language, its charset, and its capitals respectively. If you want the end-of-page-signature and the publishers to be translated, modify the variable $end of page signature in le Cong/const.cong and the names of the publishers in Cong/install.cong by appending the translations to the already existing strings. The added translations should be delimited by the $lang RS language separator. Append each eld of variable %Engl2Lang of Kernel/language.pl with the translation of the corresponding sentence or phrase. To query the local database DIENST uses the indices created on the bib le elds. Thus, to enable queries in a new language you should modify the bib les before rebuilding the indices in the last step. The TITLE, AUTHOR, and ABSTRACT elds of each bib le should be appended with the translations to the new language delimited by the $lang RS language separator. The initial and help HTML pages of multilingual DIENST must be translated to the new language. 
The initial HTML pages translated into the languages supported by DIENST are the files htdocs/index*.html. Create a new HTML file htdocs/index<new_number>.html containing the translation into the new language of the text of the initial (English) HTML page, htdocs/index0.html. Here, new_number is the first number not used by the existing translations of the initial page.

Modify the existing HTML files htdocs/index*.html by adding an entry for the new language to the 'OPTION' fields near the start of these files.

You may also create the file htdocs/dienst_runtime/search_help/<new_number>.html as a translation of htdocs/dienst_runtime/search_help0.html, to be used as the help file for the new language.

Run Config/auto_config.pl to reconfigure DIENST with the new modifications.

7 Conclusions - Future aspects

The need for a multilingual design for digital libraries becomes apparent from the fact that objects stored in a digital library may be available in a variety of languages. In addition, digital library users may need to interact with a digital library in a variety of languages. This report described our work on extending DIENST with multilingual features. Specifically, it described how multilingual search and document presentation were implemented and provided instructions on configuring the multilingual version of DIENST. The multilingual version of DIENST has been tested under the Netscape 3.0 browser. It does not work with browsers that do not support JavaScript.

Future work in this area includes the implementation of a thesaurus for translating terms among languages. Users will be able to query the system on the thesaurus terms and be presented with a list of documents whose metadata contain either the specified query keywords or their translation into one of the supported languages.
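As an illustration of the separator convention used in the steps of Section 6, the following is a minimal sketch in Python of how translations might be appended to a field and later selected by language. It is written in Python purely for illustration and is not taken from the DIENST sources: the separator value LANG_RS (standing in for $lang_RS), the language order in LANGUAGES, and both helper functions are assumptions. Falling back to the first (English) part when a translation is missing is also our own choice, not documented DIENST behaviour.

```python
# Hypothetical stand-in for $lang_RS; the real value is set in the DIENST configuration.
LANG_RS = "\x1e"

# Hypothetical language order: index 0 is English, later entries follow
# the order in which translations were appended to each field.
LANGUAGES = ["english", "greek", "italian"]


def append_translation(field: str, translation: str) -> str:
    """Append one translation to a field that may already hold earlier ones."""
    return field + LANG_RS + translation


def select_translation(field: str, language: str) -> str:
    """Return the part of the field for the given language.

    Falls back to the first (English) part when the field has no
    non-empty part at that language's position.
    """
    parts = field.split(LANG_RS)
    index = LANGUAGES.index(language)
    if index < len(parts) and parts[index]:
        return parts[index]
    return parts[0]


# A TITLE-like field with an English value and an appended Greek translation.
title = append_translation("Title", "Τίτλος")

print(select_translation(title, "greek"))
print(select_translation(title, "italian"))  # no Italian part yet, so English is used
```

Keeping every translation in the same field, in a fixed language order, means that adding a language only appends data and never renumbers the existing translations, which matches the "append" wording of the steps above.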


Experiments on string matching in memory structures Experiments on string matching in memory structures Thierry Lecroq LIR (Laboratoire d'informatique de Rouen) and ABISS (Atelier de Biologie Informatique Statistique et Socio-Linguistique), Universite de

More information

Comparing Open Source Digital Library Software

Comparing Open Source Digital Library Software Comparing Open Source Digital Library Software George Pyrounakis University of Athens, Greece Mara Nikolaidou Harokopio University of Athens, Greece Topic: Digital Libraries: Design and Development, Open

More information

\Classical" RSVP and IP over ATM. Steven Berson. April 10, Abstract

\Classical RSVP and IP over ATM. Steven Berson. April 10, Abstract \Classical" RSVP and IP over ATM Steven Berson USC Information Sciences Institute April 10, 1996 Abstract Integrated Services in the Internet is rapidly becoming a reality. Meanwhile, ATM technology is

More information

Tutorial to QuotationFinder_0.4.4

Tutorial to QuotationFinder_0.4.4 Tutorial to QuotationFinder_0.4.4 What is Quotation Finder and for which purposes can it be used? Quotation Finder is a tool for the automatic comparison of fully digitized texts. It can detect quotations,

More information

Identifying Updated Metadata and Images from a Content Provider

Identifying Updated Metadata and Images from a Content Provider University of Iowa Libraries Staff Publications 4-8-2010 Identifying Updated Metadata and Images from a Content Provider Wendy Robertson University of Iowa 2010 Wendy C Robertson Comments Includes presenter's

More information

XPath. Contents. Tobias Schlitt Jakob Westho November 17, 2008

XPath. Contents. Tobias Schlitt Jakob Westho November 17, 2008 XPath Tobias Schlitt , Jakob Westho November 17, 2008 Contents 1 Introduction 2 1.1 The XML tree model......................... 2 1.1.1 Clarication of terms.....................

More information

PORTAL RESOURCES INFORMATION SYSTEM: THE DESIGN AND DEVELOPMENT OF AN ONLINE DATABASE FOR TRACKING WEB RESOURCES.

PORTAL RESOURCES INFORMATION SYSTEM: THE DESIGN AND DEVELOPMENT OF AN ONLINE DATABASE FOR TRACKING WEB RESOURCES. PORTAL RESOURCES INFORMATION SYSTEM: THE DESIGN AND DEVELOPMENT OF AN ONLINE DATABASE FOR TRACKING WEB RESOURCES by Richard Spinks A Master s paper submitted to the faculty of the School of Information

More information

Tutorial to QuotationFinder_0.4.3

Tutorial to QuotationFinder_0.4.3 Tutorial to QuotationFinder_0.4.3 What is Quotation Finder and for which purposes can it be used? Quotation Finder is a tool for the automatic comparison of fully digitized texts. It can either detect

More information

Character Encodings. Fabian M. Suchanek

Character Encodings. Fabian M. Suchanek Character Encodings Fabian M. Suchanek 22 Semantic IE Reasoning Fact Extraction You are here Instance Extraction singer Entity Disambiguation singer Elvis Entity Recognition Source Selection and Preparation

More information

Variables and Data Representation

Variables and Data Representation You will recall that a computer program is a set of instructions that tell a computer how to transform a given set of input into a specific output. Any program, procedural, event driven or object oriented

More information

DHCP for IPv6. Palo Alto, CA Digital Equipment Company. Nashua, NH mentions a few directions about the future of DHCPv6.

DHCP for IPv6. Palo Alto, CA Digital Equipment Company. Nashua, NH mentions a few directions about the future of DHCPv6. DHCP for IPv6 Charles E. Perkins and Jim Bound Sun Microsystems, Inc. Palo Alto, CA 94303 Digital Equipment Company Nashua, NH 03062 Abstract The Dynamic Host Conguration Protocol (DHCPv6) provides a framework

More information

Functional Programming in Haskell Prof. Madhavan Mukund and S. P. Suresh Chennai Mathematical Institute

Functional Programming in Haskell Prof. Madhavan Mukund and S. P. Suresh Chennai Mathematical Institute Functional Programming in Haskell Prof. Madhavan Mukund and S. P. Suresh Chennai Mathematical Institute Module # 02 Lecture - 03 Characters and Strings So, let us turn our attention to a data type we have

More information

Computer Technology Institute. Patras, Greece. In this paper we present a user{friendly framework and a

Computer Technology Institute. Patras, Greece. In this paper we present a user{friendly framework and a MEASURING SOFTWARE COMPLEXITY USING SOFTWARE METRICS 1 2 Xenos M., Tsalidis C., Christodoulakis D. Computer Technology Institute Patras, Greece In this paper we present a user{friendly framework and a

More information

FAO TERM PORTAL User s Guide

FAO TERM PORTAL User s Guide FAO TERM PORTAL User s Guide February 2016 Table of contents 1. Introduction to the FAO Term Portal... 3 2. Home Page description... 4 3. How to search for a term... 6 4. Results view... 8 5. Entry Details

More information

Stackable Layers: An Object-Oriented Approach to. Distributed File System Architecture. Department of Computer Science

Stackable Layers: An Object-Oriented Approach to. Distributed File System Architecture. Department of Computer Science Stackable Layers: An Object-Oriented Approach to Distributed File System Architecture Thomas W. Page Jr., Gerald J. Popek y, Richard G. Guy Department of Computer Science University of California Los Angeles

More information

under Timing Constraints David Filo David Ku Claudionor N. Coelho, Jr. Giovanni De Micheli

under Timing Constraints David Filo David Ku Claudionor N. Coelho, Jr. Giovanni De Micheli Interface Optimization for Concurrent Systems under Timing Constraints David Filo David Ku Claudionor N. Coelho, Jr. Giovanni De Micheli Abstract The scope of most high-level synthesis eorts to date has

More information

II-7Numeric and String Variables

II-7Numeric and String Variables Chapter II-7 II-7Numeric and String Variables Overview... 94 Creating Global Variables... 94 Uses For Global Variables... 94 Variable Names... 94 System Variables... 95 User Variables... 95 Special User

More information