Semantic Exploitation of Engineering Models: An Application to Oilfield Models

Laura Silveira Mastella 1, Yamine Aït-Ameur 2, Stéphane Jean 2, Michel Perrin 1, and Jean-François Rainaud 3

1 Ecole des Mines de Paris, Paris, France
2 LISI/ENSMA and University of Poitiers, Futuroscope, France
3 Institut Français du Pétrole, Rueil-Malmaison, France
{laura.mastella,michel.perrin}@ensmp.fr
{yamine,jean}@ensma.fr
j-francois.rainaud@ifp.fr

Abstract. Engineering development activities rely on computer-based models, which contain technical data issued from different sources. In this heterogeneous context, retrieving, re-using and merging information is a challenge. We propose to annotate engineering models with concepts of domain ontologies, which provide data with explicit semantics. The semantic annotation makes it possible to formulate queries using the semantic concepts that are significant to the engineers' domain. This work is inspired by a petroleum engineering case study, and we validate our approach by presenting an implementation of this case study.

Keywords: Ontologies, Ontology-based databases, Semantic annotation, Oilfield engineering models.

1 Introduction

Engineering development activities produce a huge quantity of technical data that can be expressed in various types of models: database tables, programming modules, mathematical expressions, and so on. Retrieving and re-using information created in such heterogeneous models is a challenge. The engineering area studied in this work is petroleum exploration and, in particular, the activity of oil & gas reservoir modelling. In a typical reservoir modelling workflow, geoscientists rely on three-dimensional representations of the earth's subsurface (called reservoir models or oilfield models) to take important decisions about oil-reservoir operations. This work proposes an approach based on semantic annotation of engineering models.
We envisage the use of semantic annotation for (i) making the expert knowledge explicit in the model and (ii) querying raw data using semantic concepts. To carry out this approach, we use an Ontology-Based Database (OBDB), which stores data and ontologies in a common, shared database. An implementation for the oil & gas reservoir modelling activity and some initial results are presented to illustrate how this approach brings out the semantics of the concepts manipulated by engineering models.

A.P. Sexton (Ed.): BNCOD 2009, LNCS 5588, pp. 203–207, 2009. © Springer-Verlag Berlin Heidelberg 2009
2 Background

The last decade has seen the emergence of ontologies as a means of providing explicit and formal semantics to specific domains [1]. Several tools support the creation of ontology-based annotations over resources (web pages, textual documents, multimedia files). A comparative analysis of semantic annotation projects, available in [2], shows that most of these tools still rely on knowledge stored in HTML pages, XML documents or other textual resources. None of the annotation tools proposed so far enables the annotation of engineering models (or, more generally, of computer-based models). In fact, no technique is available for complementing computer-based models with formal comments or explanations, or for attaching more semantics to the technical data produced by modelling tools. It is true that a large part of a company's knowledge can be found in text repositories, such as project documentation and reports. Nevertheless, engineering models also hold strategic knowledge that must not be lost. The next section proposes an approach for addressing these issues.

3 Proposed Approach

In order to make experts' knowledge explicit in engineering models, we propose to annotate these models with concepts and/or instances of domain ontologies. The annotation process for engineering models must consider the following elements: (i) ontologies and their instances; (ii) engineering models and their data; and (iii) annotations of the engineering models, which establish links between (i) and (ii).

(i) Knowledge related to the specialized fields under consideration has been designed and formalized as domain ontologies and stored in an ontology-based database.

(ii) We are interested in persisting engineering data in the same database where ontologies are stored.
However, it is not desirable to represent the engineering meta-data using ontology constructs, since we do not expect engineering models to have the same features as those currently proposed for ontologies (e.g., subsumption between concepts). Constructs of engineering meta-data should be different from those used to define ontologies (such as owl:Class in the OWL language), because the two entities have different purposes. For these reasons, an Engineering Meta-model is defined. This Engineering Meta-model encodes the minimum set of features needed for a uniform description of engineering models (file name, identifier, main composite objects, etc.). These constructs make it possible to represent the structure in which data are organized. The main constructs for building engineering meta-data are #DataElement and #DataAttribute (part (2) of Fig. 1).

(iii) Finally, we provide resources for linking engineering meta-models to the concepts of ontologies. In this context, each end-user may have a different interpretation of the model instances. For the same dataset, several annotations are therefore likely to exist, each expressing one user's opinion. They must be uniquely identified.
Fig. 1. Extension of OntoDB (2) and implementation of the case study (3 and 4)

One user should be able to annotate several data elements with one ontology concept, and vice versa. As a consequence, an N-to-N relationship for the annotation elements is required. In this approach, annotation becomes a top-level entity, separated from the ontological concept and from the entity being annotated. The annotation entity also has its own attributes, such as creation date, author name and version information. Therefore, a Meta-model for Annotation is also required. The #Annotation construct creates a link between the construct of ontology concepts and the #DataElement construct through the relations #annotates and #isannotatedby. The added meta-models are illustrated in part (2) of the UML diagrams of Fig. 1.

4 Case Study: Annotating Oilfield Models

In order to implement the case study, we store all the data and knowledge manipulated by engineers in a persistent infrastructure. For this purpose, we use ontology-based databases.

4.1 Ontology-Based Databases (OBDBs)

Ontology-Based Databases (OBDBs) address the persistence of ontologies while taking advantage of the characteristics of databases (scalability, safety, capability to manage a huge amount of data, etc.) [3]. The OntoDB system [4] makes a clear separation of modelling layers. This approach enables the extension of the core-model with constructors of other ontology models (e.g., RDF, OWL) and also the separation of the instances from their data structure and from their meta-model. The architecture of OntoDB is composed of four parts (see Fig. 1): system tables (1), meta-schema constructs (2), ontologies (3) and instances (4).
In order to exploit the OntoDB system, the OntoQL language has been proposed in [5]. The OntoQL language has a syntax similar to SQL, and provides operations at the three layers of OntoDB, from the logical level to the meta-schema level. Consequently, it is possible to extend the core-model of OntoDB using OntoQL Definition Language operators, which alter the meta-schema level. Support for the evolution of the OBDB core-model is an important characteristic, since we need to extend this architecture to represent data containers other than the ontology meta-model (i.e., an annotation meta-model). As a consequence, we have chosen the OntoDB system for the persistence of data and ontologies in our approach.

4.2 Implementation

The first implementation step consists in extending OntoDB's core-model (which already contains the ontology constructs #Class and #Property) to include the constructs of the Engineering Meta-Model and of the Annotation Meta-Model proposed in the previous section (see part (2) of Fig. 1). Once the meta-model was set up, the oilfield meta-data were defined using the new constructs for Engineering Meta-Models. For the case study, we chose a format known as XYZ Format, which represents raw data as 3D points. The OntoQL statement Q1 exemplifies the creation, by means of the added construct #DataElement, of an XYZFile element with the attributes filename and surfacename, both of type String and multiplicity 1 (part (3) of Fig. 1). Fig. 1, part (4), shows an instance of such a data element representing the file reflect3d 0047.xyz.

Q1: CREATE #DataElement XYZFile (PROPERTIES (filename String 1 1, surfacename String 1 1))

The advantage of representing the technical data as instances within OntoDB is the capability to store both data and ontologies in the same repository, which makes it possible to create the link between the two.
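As a rough illustration of what Q1 declares, the following Python sketch models a data element and its attributes and checks an instance against the declared types and multiplicities. The classes DataElement and DataAttribute, the validate method and the surface name value are hypothetical stand-ins chosen for illustration; they are not part of OntoDB or OntoQL.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class DataAttribute:
    """Hypothetical stand-in for the #DataAttribute construct."""
    name: str
    datatype: type
    min_card: int = 1  # lower multiplicity bound
    max_card: int = 1  # upper multiplicity bound


@dataclass
class DataElement:
    """Hypothetical stand-in for the #DataElement construct."""
    name: str
    attributes: List[DataAttribute] = field(default_factory=list)

    def validate(self, instance: dict) -> List[str]:
        """Return the list of violations; an empty list means the instance conforms."""
        errors = []
        declared = {attr.name for attr in self.attributes}
        for key in instance:
            if key not in declared:
                errors.append(f"unknown attribute: {key}")
        for attr in self.attributes:
            values = instance.get(attr.name, [])
            if not isinstance(values, list):
                values = [values]  # treat a single value as a one-element list
            if not attr.min_card <= len(values) <= attr.max_card:
                errors.append(
                    f"{attr.name}: expected {attr.min_card}..{attr.max_card} "
                    f"values, got {len(values)}"
                )
            for value in values:
                if not isinstance(value, attr.datatype):
                    errors.append(
                        f"{attr.name}: {value!r} is not a {attr.datatype.__name__}"
                    )
        return errors


# Mirrors Q1: an XYZFile element with two String attributes of multiplicity 1..1.
xyz_file = DataElement(
    "XYZFile",
    [DataAttribute("filename", str), DataAttribute("surfacename", str)],
)

# An instance in the spirit of part (4) of Fig. 1; the surface name is made up.
instance = {"filename": "reflect3d 0047.xyz", "surfacename": "hypothetical surface"}
violations = xyz_file.validate(instance)
```

Keeping the element description separate from its instances loosely mirrors the separation between meta-data (part (3) of Fig. 1) and instances (part (4)).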
Next, with the help of the end-user, the annotations that represent the experts' interpretation of field data are created. In the present case study, a well-known annotation rule set up by experts is used: data contained in an XYZ file are interpreted by geologists as corresponding to some Seismic Reflector. Seismic Reflector is a term from the GeoSeismics domain, and it is represented as the ontology concept Reflector. Therefore, by means of the added construct #Annotation, the OntoQL statement Q2 creates an annotation type that links elements of type XYZFile to concepts of type Reflector (see part (3) of Fig. 1).

Q2: CREATE #Annotation ReflectorAnnotation (XYZFileURI REF(XYZFile), ReflectorURI REF(Reflector))

Part (4) of Fig. 1 shows an instance of the typed annotation ReflectorAnnotation, which refers to an instance of the meta-data XYZFile and an instance of the ontology concept Reflector.
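The N-to-N annotation link can be approximated in a plain relational setting. The sketch below uses Python's sqlite3 module with ordinary tables as stand-ins for the XYZFile, Reflector and ReflectorAnnotation entities; it is an assumption-laden analogy and not OntoDB or OntoQL. The column names, the second file, the reflector r2, the author names and the helper files_for_reflector are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE XYZFile (
        oid INTEGER PRIMARY KEY,
        filename TEXT NOT NULL,
        surfacename TEXT NOT NULL
    );
    CREATE TABLE Reflector (
        oid INTEGER PRIMARY KEY,
        uri TEXT UNIQUE NOT NULL
    );
    -- The annotation is an entity of its own, with its own attributes.
    CREATE TABLE ReflectorAnnotation (
        oid INTEGER PRIMARY KEY,
        annotates INTEGER NOT NULL REFERENCES XYZFile(oid),
        is_annotated_by INTEGER NOT NULL REFERENCES Reflector(oid),
        author TEXT
    );
    """
)

conn.executemany(
    "INSERT INTO XYZFile (oid, filename, surfacename) VALUES (?, ?, ?)",
    [
        (1, "reflect3d 0047.xyz", "hypothetical surface A"),
        (2, "hypothetical_0001.xyz", "hypothetical surface B"),
    ],
)
conn.executemany(
    "INSERT INTO Reflector (oid, uri) VALUES (?, ?)",
    [(10, "r1"), (20, "r2")],
)
# Two users interpret file 1 differently; r2 is also linked to two files (N-to-N).
conn.executemany(
    "INSERT INTO ReflectorAnnotation (annotates, is_annotated_by, author) "
    "VALUES (?, ?, ?)",
    [(1, 10, "geologist_a"), (1, 20, "geologist_b"), (2, 20, "geologist_a")],
)


def files_for_reflector(connection, uri):
    """Return the names of the files annotated with the reflector of the given URI."""
    rows = connection.execute(
        """
        SELECT f.filename
        FROM XYZFile AS f
        JOIN ReflectorAnnotation AS a ON f.oid = a.annotates
        JOIN Reflector AS r ON r.oid = a.is_annotated_by
        WHERE r.uri = ?
        ORDER BY f.filename
        """,
        (uri,),
    ).fetchall()
    return [row[0] for row in rows]
```

Because each annotation row carries its own identifier and author, two users can attach different concepts to the same file without overwriting each other, which is the reason the text gives for treating annotation as a top-level entity.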
4.3 Exploitation of the Extended OntoDB Architecture

At this point of the work, it is possible to query field data using concepts from the domain ontologies. To illustrate this querying capability, query Q3 retrieves the filename reflect3d 0047.xyz, which is interpreted as the Seismic Reflector identified by the URI r1:

Q3: SELECT filename FROM XYZFile JOIN ReflectorAnnotation
    ON XYZFile.oid = ReflectorAnnotation.annotates.oid
    WHERE ReflectorAnnotation.isAnnotatedBy.oid =
    (SELECT Reflector.oid FROM Reflector WHERE Reflector.URI = 'r1')

Thanks to the newly proposed constructs, the semantics of engineering models, which are usually implicit within the data, can be added to the database and retrieved by means of semantic queries.

5 Conclusions and Future Work

This paper has presented an extension of ontology-based databases that handles the semantic annotation of data elements issued from engineering models. As a consequence, we have obtained a homogeneous representation of all the data and knowledge manipulated by engineers. This approach makes it possible to formulate queries that use domain-specific semantic concepts instead of forcing users to understand how data are stored within the database. As future work, we intend to explore the multidisciplinary aspect of this domain. We aim at correlating data issued from various fields of expertise by means of ontology mappings and subsumption relations.

Acknowledgments. This work is sponsored by The CAPES Foundation, Ministry of Education of Brazil (process no. 4232/05-4).

References

1. Gruber, T.: Toward principles for the design of ontologies used for knowledge sharing. Int. Journal of Human and Computer Studies 43(5/6), 907–928 (1995)
2. Uren, V., Cimiano, P., Iria, J., Handschuh, S., Vargas-Vera, M., Motta, E., Ciravegna, F.: Semantic annotation for knowledge management: Requirements and a survey of the state of the art. In: Web Semantics: Science, Services and Agents on the World Wide Web, vol. 4 (2006)
3. Broekstra, J., Kampman, A., van Harmelen, F.: Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema. In: Horrocks, I., Hendler, J. (eds.) ISWC 2002. LNCS, vol. 2342, pp. 54–68. Springer, Heidelberg (2002)
4. Dehainsala, H., Pierra, G., Bellatreche, L.: OntoDB: An ontology-based database for data intensive applications. In: Kotagiri, R., Radha Krishna, P., Mohania, M., Nantajeewarawat, E. (eds.) DASFAA 2007. LNCS, vol. 4443, pp. 497–508. Springer, Heidelberg (2007)
5. Jean, S., Aït-Ameur, Y., Pierra, G.: Querying ontology based database using OntoQL (an ontology query language). In: Meersman, R., Tari, Z. (eds.) OTM 2006. LNCS, vol. 4275, pp. 704–721. Springer, Heidelberg (2006)