Building a Faceted Browser in CouchDB Using Views on Views and Erlang Metaprogramming

1 Building a Faceted Browser in CouchDB Using Views on Views and Erlang Metaprogramming
WFLP-2011, Odense, July
The NaLiDa Project: Nachhaltigkeit Linguistischer Daten (Sustainability of Linguistic Data)

2 infrastructure (in Linguistics); views on views

3 State of affairs in the Humanities (and elsewhere)
- no systematic management of the underlying research data
- increasing pressure from funding agencies to document and publish research data
- eScience infrastructure is needed to
  - support reproduction of results over identical data sets
  - increase scientific quality and fight fraud in science
  - help avoid unintended duplication of research work
NaLiDa Project
- contributes to infrastructure for language resources (corpora, lexica, ...) and software tools (part-of-speech taggers, parsers, ...)
- supports the scientific community with infrastructure building, metadata management, and storage
- assists institutions in systematically describing and exposing their research with metadata, in terms of XML-based documents
- increases access to and visibility of resources

4 Data Aggregation and Exposure
[Diagram: metadata from several providers (XML A, XML B, XML C) is collected by OAI-PMH harvesting into the document storage]
- harvesting runs at regular intervals
- new providers may join

5 Metadata Descriptions in Linguistics
- can be very detailed, with large variety in the usage of metadata field descriptors and their structural organisation
- most of the information is of little use for most users
- some pieces of information matter for most users
Increasing Popularity of Faceted Browsing
- well suited for naive users to explore large data sets with a small but informative set of facets
- customers can identify products along many dimensions
- facets, their value ranges, and the number of corresponding items show the structure and content of the search space
- many users learn the main criteria for navigation

6 Facet Selection
- governed by the search for a common denominator across collections
- will yield a rather small set of (semantically similar) metadata fields
- main facets: organisation, language, resource type, modality
- conditional facets, such as lifecycle status and tool type if the resource type is tool
Facetification
- facets F_1, ..., F_n with value ranges {f_11, ..., f_1n}, ..., {f_n1, ..., f_nm}
- a document must be indexed by at least one facet-value pair
- a document can be described by more than one value f_ij for F_i, e.g., metadata for a multimodal corpus with F_i = modality and f_ij among gesture, sign language, and spoken language

7 Computations
[Figure: facet ring for Languages, with segments German, English, French, Dutch Sign Language, British Sign Language, Swedish Sign Language, German Sign Language, Georgian, Hungarian, Dutch, Italian, Latin, Russian]
- Once the facet-value pair f_ik is selected, its document set must be intersected with each of the subsets of every other facet F_j with 1 ≤ j ≤ n, j ≠ i: the document set of ring segment f_ik must be intersected with the document sets of all segments of all rings other than F_i.
- When users select facet F_i with value f_ik and facet F_j with value f_jl:
  - first build the intersection of the two corresponding document collections
  - then intersect the (non-empty) result with all ring segments of all rings other than F_i and F_j
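
The intersection steps above can be sketched in JavaScript as follows. This is a minimal illustration, not the deck's implementation: the `index` object, its document ids, and the function names `intersect` and `facetCounts` are assumptions made for the example, and a non-empty selection is assumed.

```javascript
// Hypothetical index: facet -> value -> array of document ids.
const index = {
  language: { German: ["d1", "d2", "d3"], Dutch: ["d4"] },
  country: { Germany: ["d1", "d2"], Netherlands: ["d3", "d4"] },
  modality: { speech: ["d1", "d4"], gesture: ["d2"] },
};

// Intersection of two arrays of document ids.
function intersect(a, b) {
  const inB = new Set(b);
  return a.filter((id) => inB.has(id));
}

// For a non-empty selection (a list of [facet, value] pairs), compute the
// remaining document set and the count for every segment of every other ring.
function facetCounts(index, selection) {
  let docs = null;
  for (const [facet, value] of selection) {
    const set = (index[facet] && index[facet][value]) || [];
    docs = docs === null ? set.slice() : intersect(docs, set);
  }
  const selected = new Set(selection.map(([facet]) => facet));
  const counts = {};
  for (const facet of Object.keys(index)) {
    if (selected.has(facet)) continue; // rings already fixed by the selection
    counts[facet] = {};
    for (const value of Object.keys(index[facet])) {
      const n = intersect(docs, index[facet][value]).length;
      if (n > 0) counts[facet][value] = n; // empty segments are omitted
    }
  }
  return { docs, counts };
}

const one = facetCounts(index, [["language", "German"]]);
const two = facetCounts(index, [["language", "German"], ["country", "Germany"]]);
```

With one selected pair, every other ring is intersected with that pair's document set; with two, the two sets are intersected first and only the remaining rings are visited.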

8 Requirements
- cope with metadata heterogeneity, given that documents adhere to different schemas, each defining its own structured set of descriptors and values
- preserve the original format of all metadata descriptions, and consider storing primary data in addition to the metadata describing it
- handle regular additions to the document storage with only incremental updates for document access
- provide effective and user-friendly access to all documents
- use a REST-based approach to make the data storage web-accessible for reading and writing

9 CouchDB
- schema-less database design permits the inclusion of arbitrarily structured documents into the database
- original metadata format can be preserved, and primary data can also be associated with the metadata describing it
- map-reduce framework promises incrementality and scalability
- features a REST-based interface for document uploading, downloading, and querying
- also hosts the GUI, and provides a Lucene port
Views
- correspond to hardwired DB queries; also stored in CouchDB
- once a query is executed, its result is also stored
- defined in terms of map and reduce functions
- written in Erlang, JavaScript, and other languages

10 Map-Reduce: Motivation
- process lots of data to produce other data
- using many CPUs
- supporting automatic parallelization and distribution, fault tolerance, I/O scheduling, status and monitoring
Programming Model: Map
- processes input documents (key-value pairs)
- produces a set/table of intermediate pairs: map(in_key, in_value) -> list(out_key, intermed_value)
- must be referentially transparent: given a document, the function will always emit the same key-value pairs
- document indexing process is incremental and can run in parallel
- can be written in JavaScript and Erlang (and other ports)

11 Programming Model: Reduce
- combines all values for a particular key
- produces a set of merged output values (usually just one): reduce(out_key, list(intermed_value)) -> list(out_value)
- a map function can be complemented by a reduce function
- takes as input the table of emitted values with identical keys, as generated by the map function, and aggregates them, e.g., by summing up the values associated with the same key:
    function(key, values) { return sum(values); }
- must be referentially transparent, commutative, and associative
- must be callable with the output of the map process, but also with intermediate values computed by a prior reduce (rereduce)
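
The rereduce requirement can be made concrete with a small simulation. The sketch below is a toy driver, not CouchDB itself: `mapOrganisation`, `reduceCount`, `runView`, the `organisation` field, and the sample organisation names are assumptions for illustration. The reduce function follows the three-argument form (keys, values, rereduce) that CouchDB passes to JavaScript reduce functions; the `keys` argument is ignored here.

```javascript
// Map: emit one (organisation, 1) pair per document naming an organisation.
function mapOrganisation(doc, emit) {
  if (doc.organisation) emit(doc.organisation, 1);
}

// Reduce: summing is commutative and associative, so the same code handles
// both freshly mapped values and partial sums from an earlier reduce.
function reduceCount(keys, values, rereduce) {
  return values.reduce((total, v) => total + v, 0);
}

// Toy driver: group emitted values by key, reduce them in chunks of two,
// then rereduce the partial results into one value per key.
function runView(docs) {
  const groups = new Map();
  for (const doc of docs) {
    mapOrganisation(doc, (key, value) => {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    });
  }
  const result = {};
  for (const [key, values] of groups) {
    const partials = [];
    for (let i = 0; i < values.length; i += 2) {
      partials.push(reduceCount(null, values.slice(i, i + 2), false));
    }
    result[key] = reduceCount(null, partials, true);
  }
  return result;
}

const orgCounts = runView([
  { organisation: "MPI" }, { organisation: "MPI" }, { organisation: "MPI" },
  { organisation: "IDS" }, { title: "document without an organisation" },
]);
```

Because the partial sums are themselves valid inputs to the same function, the chunk size does not change the final counts.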

12 Map-Reduce Framework
[Diagram: documents are passed to map processes, which emit intermediate values for key-1, key-2, and key-3; the intermediate values are aggregated by key; one reduce per key produces the final values for key-1, key-2, and key-3]

13 Stages
1. ingestion: OAI-PMH-harvested documents are validated against their schema, converted from XML to JSON, supplied with unique id, timestamp, source, and schema information, and added to the DB with the original XML as attachment
2. indexing: to attack data heterogeneity at schema level
3. curation: to address variability in facet values
4. faceted search indexing: to precompute all possible queries
5. presentation: to give users navigation access to the datasets

14 Document Indexing with map-reduce
- document indexing tackles data heterogeneity, given that documents may adhere to different schemas
Map Example (template)
    function(doc) {
      switch (doc.schema) {
        case "<reference_to_schema_a>":
          if (<tree_has_node>) {
            emit(<path_to_node_val>, 1);
          }
          break; // leave the switch even when the node is missing
        case "<reference_to_schema_b>":
          [...]
        [...]
      }
    }

15 Map to index organisations (fragment)
    function(doc) {
      switch (doc.schema) {
        case "...": // schema reference truncated in the source
          // Guard every step of the path before reading the leaf value.
          if (doc.cmd &&
              doc.cmd.components &&
              doc.cmd.components.textcorpusprofile &&
              doc.cmd.components.textcorpusprofile.generalinfo &&
              doc.cmd.components.textcorpusprofile.generalinfo.legalowner &&
              doc.cmd.components.textcorpusprofile.generalinfo.legalowner.$t) {
            emit(doc.cmd.components.textcorpusprofile.generalinfo.legalowner.$t, 1);
          }
          break;
        case "...": // schema reference truncated in the source
          if (doc.lexicalresource &&
              doc.lexicalresource.organization &&
              doc.lexicalresource.organization.$t) {
            emit(doc.lexicalresource.organization.$t, 1);
          }
          break;
        ...
      }
    }

16 Map Result (organisations)

17 Reduce Result (organisations)

18 Reduce Result (organisations)
Note: need for data curation

19 Document Indexing with map-reduce
- initially, views were manually coded and adapted after each schema change, but this is tedious and prone to error
- now, views are generated automatically from a declarative facet specification, using JavaScript (string concatenation)
Facet specification
    { "facet" : "modality",
      "pathinfos" : [
        { "schema" : "...",
          "path" : "doc.cmd.components.textcorpusprofile..." },
        { "schema" : "...",
          "path" : "doc.cmd.components.lexicalresourceprofile..." },
        ...
      ] }
    { "facet" : "language",
      "pathinfos" : [ ... ] }
    [...]
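
How such a specification might be turned into view source text by string concatenation can be sketched as follows. This is an illustrative guess at the technique, not the project's generator: the schema names, the paths, and the helpers `guardFor`, `generateMapSource`, and `runGenerated` are all hypothetical.

```javascript
// Hypothetical facet specification; schema names and paths are placeholders.
const spec = {
  facet: "organisation",
  pathinfos: [
    { schema: "schema-a", path: "doc.cmd.components.profile.owner" },
    { schema: "schema-b", path: "doc.lexicalresource.organization" },
  ],
};

// Turn "doc.a.b" into the guard expression "doc.a && doc.a.b".
function guardFor(path) {
  const parts = path.split(".");
  const prefixes = [];
  for (let i = 2; i <= parts.length; i++) {
    prefixes.push(parts.slice(0, i).join("."));
  }
  return prefixes.join(" && ");
}

// Concatenate the source text of a map function, one case per schema.
function generateMapSource(spec) {
  let src = "function (doc) {\n  switch (doc.schema) {\n";
  for (const info of spec.pathinfos) {
    src += "    case " + JSON.stringify(info.schema) + ":\n";
    src += "      if (" + guardFor(info.path) + ") {\n";
    src += "        emit(" + info.path + ", 1);\n";
    src += "      }\n      break;\n";
  }
  src += "  }\n}";
  return src;
}

const mapSource = generateMapSource(spec);

// Outside CouchDB, the generated text can be checked by compiling it with a
// stand-in emit that records its arguments.
function runGenerated(source, doc) {
  const emitted = [];
  const emit = (key, value) => emitted.push([key, value]);
  const mapFn = new Function("emit", "return " + source + ";")(emit);
  mapFn(doc);
  return emitted;
}
```

The generated string is what would be stored as the `map` entry of a view in a design document; compiling it locally is only a convenience for testing.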

20 Data Curation
- each map function gives a view of the document space in terms of the facet it represents
- analysis shows large variability for many facet values, e.g., organisations appearing under different names
- devised curation tables that map given names to preferred names
- data curation is performed on the indices (for faceted search) rather than on the original documents
Conversion of Views to Documents
- faceted search is to be defined in terms of the document indexing established in the first map-reduce cycle
- but CouchDB's map-reduce framework is defined in terms of documents
- thus, it is not possible to define views on views, at least not directly

21 Views on Views
- re-use the result of document indexing by converting the resulting views into documents
- the conversion takes care of data curation
- the conversion is written in JavaScript and implements a hash table of hash tables
  - the outer hash table gives access to the facets, e.g., language
  - the inner hash table gives access to all the values a chosen facet can take, e.g., associating the key German with all documents carrying this piece of information
- the new index (of type docindex) is stored in an extra DB, which also holds all views that implement faceted search
- one index file for each document collection
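
A minimal sketch of such a conversion, with curation applied on the way, might look like this. The `doctype` value `docindex` comes from the slides; everything else (the `collection` field, the row shape `{id, key}`, the curation entries, and the function name `buildDocIndex`) is an assumption for illustration.

```javascript
// Hypothetical curation table: given name -> preferred name, per facet.
const curation = {
  organisation: {
    "MPI Nijmegen": "Max Planck Institute for Psycholinguistics",
  },
};

// rowsByFacet: facet -> rows of a non-reduced indexing view, each {id, key}.
// Returns one index document: outer table by facet, inner table by value.
function buildDocIndex(collection, rowsByFacet, curation) {
  const index = { doctype: "docindex", collection: collection };
  for (const facet of Object.keys(rowsByFacet)) {
    const table = curation[facet] || {};
    const inner = Object.create(null);
    for (const row of rowsByFacet[facet]) {
      // Replace a given name by its preferred name when the table has one.
      const curated = Object.prototype.hasOwnProperty.call(table, row.key);
      const value = curated ? table[row.key] : row.key;
      if (!inner[value]) inner[value] = [];
      if (!inner[value].includes(row.id)) inner[value].push(row.id);
    }
    // Keep each document list sorted so that it can be treated as a set.
    for (const value of Object.keys(inner)) inner[value].sort();
    index[facet] = inner;
  }
  return index;
}

const docIndex = buildDocIndex("collection-1", {
  organisation: [
    { id: "oai:b", key: "MPI Nijmegen" },
    { id: "oai:a", key: "Max Planck Institute for Psycholinguistics" },
    { id: "oai:c", key: "IDS" },
  ],
  language: [
    { id: "oai:a", key: "German" },
    { id: "oai:a", key: "German" },
  ],
}, curation);
```

Sorting and de-duplicating the id lists keeps them usable as ordered sets, which is what the intersection step in the faceted search views relies on.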

22 Document index for one collection

23 Map View for Country
    fun ({Doc}) ->
        case proplists:get_value(<<"doctype">>, Doc) of
            <<"docindex">> ->
                {CountryHash} = proplists:get_value(<<"country">>, Doc, {[]}),
                {LanguageHash} = proplists:get_value(<<"language">>, Doc, {[]}),
                <other hashes>
                lists:foreach(fun (CountryItem) ->
                    DocSet = proplists:get_value(CountryItem, CountryHash),
                    DocSetSize = ordsets:size(DocSet),
                    if
                        DocSetSize > 0 ->
                            %% Emit the full document set for this country.
                            Emit(CountryItem, {[{<<"facet">>, <<"_total_">>},
                                                {<<"value">>, <<"_total_">>},
                                                {<<"docs">>, DocSet}]}),
                            %% Emit the size of each non-empty intersection with a language.
                            lists:foreach(fun (LanguageItem) ->
                                Intersection = ordsets:intersection(
                                    proplists:get_value(LanguageItem, LanguageHash),
                                    proplists:get_value(CountryItem, CountryHash)),
                                case Intersection == [] of
                                    false ->
                                        Emit(CountryItem, {[{<<"facet">>, <<"language">>},
                                                            {<<"value">>, LanguageItem},
                                                            {<<"docs">>, ordsets:size(Intersection)}]});
                                    _ -> ok
                                end
                            end, proplists:get_keys(LanguageHash)),
                            <other intersections for other facets [...]>;
                        true -> ok
                    end
                end, proplists:get_keys(CountryHash));
            _ -> ok
        end
    end.

24 Result for Country View (fragment)

25 Reduce Function (common to FB views)
    fun (Key, Values) ->
        %% Fold one emitted entry into the dictionary, keyed by {Facet, Value}.
        AddToDict = fun (CurrentEntry, Dict) ->
            {[{<<"facet">>, Facet}, {<<"value">>, Value}, {<<"docs">>, Documents}]} = CurrentEntry,
            DictKey = {Facet, Value},
            case Facet of
                %% Totals carry document lists, which are appended.
                <<"_total_">> -> dict:append_list(DictKey, Documents, Dict);
                %% All other entries carry counts, which are summed.
                _ -> dict:update(DictKey, fun (Old) -> Old + Documents end, Documents, Dict)
            end
        end,
        DictToList = fun (Dict) ->
            lists:map(fun (Entry) ->
                {{Facet, Value}, Docs} = Entry,
                {struct, [{<<"facet">>, Facet}, {<<"value">>, Value}, {<<"docs">>, Docs}]}
            end, dict:to_list(Dict))
        end,
        DictToList(lists:foldl(fun (Value, Dict) -> AddToDict(Value, Dict) end, dict:new(), Values))
    end.
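
For readers less familiar with Erlang, the merge logic of this reduce step can be restated in JavaScript. This is a sketch of the same idea rather than a port that would run inside CouchDB; the function name `mergeFacetEntries` and the sample entries are assumptions.

```javascript
// Merge emitted entries that share the same (facet, value) pair:
// "_total_" entries carry document lists, which are concatenated;
// all other entries carry counts, which are summed.
function mergeFacetEntries(values) {
  const merged = new Map();
  for (const { facet, value, docs } of values) {
    const key = JSON.stringify([facet, value]);
    const old = merged.get(key);
    if (facet === "_total_") {
      merged.set(key, { facet, value, docs: (old ? old.docs : []).concat(docs) });
    } else {
      merged.set(key, { facet, value, docs: (old ? old.docs : 0) + docs });
    }
  }
  return Array.from(merged.values());
}

const mergedEntries = mergeFacetEntries([
  { facet: "_total_", value: "_total_", docs: ["a", "b"] },
  { facet: "language", value: "German", docs: 2 },
  { facet: "_total_", value: "_total_", docs: ["c"] },
  { facet: "language", value: "German", docs: 3 },
  { facet: "language", value: "Dutch", docs: 1 },
]);
```

The two entry kinds are kept apart because the totals are needed as document lists for display, whereas the per-facet entries only need to be counted.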

26 Coding of Views
- initially, views were coded manually in JavaScript
- but poor performance in view computation on large index files led to the use of Erlang instead, which resulted in a significant performance boost
- writing views by hand is tedious and prone to error
- Erlang code was written that generates the definitions of the Erlang views automatically
- the Erlang meta-code is based on the concatenation of Erlang code strings
Facet specification
    -define(FACETS, ["country", "language", "modality", "organisation", "resourceclass"]).
    -define(COND_FACETS, [
        {"resourceclass", "corpus", ["genre"]},
        {"resourceclass", "Tool", ["tooltype", "applicationtype", "inputtype",
                                   "outputtype", "lifecyclestatus"]}
    ]).

27 Coding of Views (cont'd)
- the specification leads to the generation of 121 views, each comprising at least 5000 bytes of Erlang code
- not all possible combinations of set intersections are necessary: the document sets resulting from first selecting facet F_1 and then facet F_2 are identical to those obtained when F_2 is selected first and then F_1
- the computation of all necessary intersections is realized using Erlang combinators
Use of Erlang Combinators
    comb_4(L) ->
        case length(L) < 4 of
            true -> "supply lists with length >= 4";
            _ ->
                %% Ordered 4-element selections from L, plus the remaining elements Z.
                [ {A, B, C, D, Z} ||
                    A <- L,
                    B <- L -- [A], A < B,
                    C <- L -- [A, B], B < C,
                    D <- L -- [A, B, C], C < D,
                    Z <- [L -- [A, B, C, D]] ]
        end.
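
Because selection order does not matter, the views needed for k selected facets correspond to the k-element subsets of the facet list. The sketch below counts them for the five unconditional facets of the specification; the helper name `combinations` is an assumption, and conditional facets are left out.

```javascript
// All k-element subsets of items, preserving the order of the input list.
function combinations(items, k) {
  if (k === 0) return [[]];
  if (items.length < k) return [];
  const [first, ...rest] = items;
  const withFirst = combinations(rest, k - 1).map((c) => [first, ...c]);
  return withFirst.concat(combinations(rest, k));
}

const facets = ["country", "language", "modality", "organisation", "resourceclass"];

// Number of facet combinations per selection depth (1 to 5 facets).
const viewsPerLevel = [1, 2, 3, 4, 5].map((k) => combinations(facets, k).length);
const totalViews = viewsPerLevel.reduce((a, b) => a + b, 0);
```

For five facets this gives 5, 10, 10, 5, and 1 combinations, 31 in total, which is consistent with the 31 map-reduce pairs and the per-level view counts reported in the performance slides.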

28 GUI

29 Queries = Views
View request
    /mpi_mgt/_design/country/_view/country?key="Germany"&reduce=true
View result
    {"rows":[
      {"key":"Germany","value":[
        {"facet":"modality","value":"unspecified","docs":140},
        {"facet":"modality","value":"speech/gestures","docs":230},
        {"facet":"language","value":"German Sign Language","docs":433},
        {"facet":"genre","value":"secondary document","docs":3},
        {"facet":"genre","value":"movie","docs":458},
        {"facet":"_total_","value":"_total_",
         "docs":["oai:...", "oai:...", [...] ]}]}]}
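
A client could build such a request and read the facet counts from the result roughly as follows. The database, design document, and view names are taken from the request above; the host `http://localhost:5984` and the helpers `viewUrl` and `countsFor` are assumptions. The key is JSON-encoded and then percent-encoded, since CouchDB expects view keys as JSON values.

```javascript
// Build the URL of a view request for one key.
function viewUrl(base, db, design, view, key, reduce) {
  const query =
    "key=" + encodeURIComponent(JSON.stringify(key)) +
    "&reduce=" + (reduce ? "true" : "false");
  return base + "/" + encodeURIComponent(db) +
    "/_design/" + encodeURIComponent(design) +
    "/_view/" + encodeURIComponent(view) + "?" + query;
}

// Collect value -> count for one facet from a reduced view result.
function countsFor(result, facet) {
  const counts = {};
  for (const row of result.rows) {
    for (const entry of row.value) {
      if (entry.facet === facet) counts[entry.value] = entry.docs;
    }
  }
  return counts;
}

const url = viewUrl("http://localhost:5984", "mpi_mgt", "country", "country", "Germany", true);

// Abbreviated copy of the result shown above, without the _total_ entry.
const sampleResult = {
  rows: [{
    key: "Germany",
    value: [
      { facet: "modality", value: "unspecified", docs: 140 },
      { facet: "modality", value: "speech/gestures", docs: 230 },
      { facet: "genre", value: "movie", docs: 458 },
    ],
  }],
};
const modalityCounts = countsFor(sampleResult, "modality");
```

Since every combination of selections is precomputed as a view, the GUI only needs one such request per navigation step.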

30 Performance for Document Indexing
- views for document indexing are generated automatically from the facet specification using JavaScript
- the resulting map and reduce functions are in JavaScript too, CouchDB's default view language
- computation of the view organisation takes approximately 25 minutes on 86k documents
  - one-time payoff
  - no effort has been made yet to increase the speed of view computation
  - small changes in the document database will have only a small impact on view recomputation at the document indexing level

31 Performance for Faceted Search
- computation of faceted search views is computationally expensive
- JavaScript is too slow
- Erlang is much faster (better in memory and processor usage)
Setting
- each Erlang view is stored in a separate design document
- map-reduce computation executed on a 24-core, 96 GB machine
- harvested and ingested metadata documents on language resources
- five unconditional facets, with their number of values: language (371), country (67), organisation (39), modality (32), and genre (50)
- many documents for individual facet values: modality = speech (59463); language = Dutch (18345); country = Germany (16178); organisation = Max Planck Institute for Psycholinguistics (16568); genre = Discourse (33676)
- 31 different map-reduce pairs

32 Computation
- generation of the five views language, country, organisation, modality, and genre takes less than one minute altogether (using 5 CPUs)
- the ten 2-level views (users selected two facets, e.g., country : genre, country : language, ...) were computed in less than 1 minute (using 10 CPUs)
- computation of the ten 3-level views (users selected three facets): less than 7.5 minutes
- computation of the five 4-level views: more than 2 hours

33 Future work for optimisation
- currently, there is one indexing document for each of the metadata providers
  - an update from one data provider only requires a limited view recomputation
  - but some data providers supply large numbers of documents
- optimise index documents for faceted search
  - reflect additions by a new index document, so that incremental updates are indeed limited to document additions
  - reflect modifications and deletions by introducing MODIFY and DELETE lists that a revised map-reduce combination would need to consider

34 Related Work: Flamenco
- toolkit with a web-based interface that gives faceted access to large data collections
- given import format:
  - the file facets.tsv, listing all facets
  - the file attrs.tsv, listing all attributes of a given item
  - the file items.tsv, listing each collection item (following the definition in attrs.tsv) with a unique id
  - for each facet entry in the facets.tsv file:
    - facet_term lists all terms for the given facet with unique facet term ids
    - facet_map associates unique facet term ids with item ids
- data files are ingested into the Flamenco relational database (MySQL)
- Flamenco generates the faceted browser's default/customizable GUI
- the user's selection of facet terms is translated into corresponding MySQL queries to compute all necessary set intersections
- the results of executing MySQL queries are cached to avoid re-computation

35 Flamenco used in VLO
- faceted search access to language resources using Flamenco, with the same dataset
- used Perl to translate XML-based metadata files into Flamenco's indexing data format (incl. curation)
- ingested the data into the Flamenco database and adapted the GUI
- script to generate all queries to warm up the cache
Comparison
- the data preparation required for Flamenco roughly corresponds to our CouchDB-based document indexing phase (simple views)
- data curation only happens when the views of the indexing phase are converted into the indexing documents
- the MySQL queries fired by Flamenco correspond to the views computed in terms of the indexing documents

36 Advantages of CouchDB
- CouchDB also stores the original metadata documents (with varying schemata), and thus also serves as permanent storage
- conditional facets contribute to usability by guiding users' navigation; they need only be computed in subsets whose documents are indexed against the terms the conditional facet depends on
- index generation accommodates incremental updates on the metadata sets, supporting regular harvesting without recomputing all indices/views; in Flamenco, any change in the data set requires overwriting all contexts/caches
- the facet specification offers a more declarative view
  - index generation is taken to a higher level; it is easy to experiment with different facet configurations
  - but once the facet specification is changed, index generation starts from scratch

37
- CouchDB, with its native language Erlang, is well suited for the development of industrial-strength applications
- CouchDB's REST-based interface offers a lean alternative to established software (Java-based Apache Tomcat web server)
- Erlang's main limitation is the lack of a full macro package allowing users to write programs that write other programs
  - a Common Lisp-like defmacro would have made life easier
  - currently, there is no strong support for a Lisp (or Haskell) port to index and query documents in CouchDB
- CouchDB's main limitation when used with Erlang is the lack of available documentation and example code

38
- general approach to aggregate heterogeneously structured documents and to make them accessible via faceted (and full-text) search
- works as long as the documents' relevant content can be given in JSON (CouchDB's native format)
- for the given context, facet specification was straightforward
- desirable to detect good facet candidates automatically
- the Castanet algorithm
  - requires the definition of target terms that best reflect the topics present in a given collection
  - combines target terms with hypernymy (IS-A) information from WordNet, both to build facet hierarchies and to assign documents to the facets

39 Questions


More information

Introduction

Introduction Introduction EuropeanaConnect All-Staff Meeting Berlin, May 10 12, 2010 Welcome to the All-Staff Meeting! Introduction This is a quite big meeting. This is the end of successful project year Project established

More information
