FrameNet extension for the Semantic Web: creation of the RDF/OWL version of the repository of senses, resource evaluation and lessons learned
Irina Sergienya, University of Trento
Advisors: Volha Bryl (DKM) and Sara Tonelli (HLT), Fondazione Bruno Kessler
Content
  Introduction
    FrameNet
    WordNet
    Ontologies and OWL/RDF representation
  Sense Repository
  Work
    Resources and Tools
    Repository OWL representation: structure of ontology
    Filter and Statistics of the Repository with examples
  Results
  Literature and Links
Introduction
Introduction. FrameNet
FrameNet is a lexical database of English, developed at Berkeley since 1997. It is based on Frame Semantics and supported by corpus evidence.
Words (<word, meaning> pairs = Lexical Units) evoke Frames; Frames have participants, i.e. semantic roles (= Frame Elements).
Examples:
  [Cook The boys] GRILL [Food their catches] [Heating_instrument on an open fire].
  [Cook Drew] BAKED [Food an apple pie] [Container in a pie tin].
1043 Lexical Frames, 10014 FEs in Lexical Frames.
Relations between frames exist (e.g. inheritance, causation, precedence, ...).
Introduction. Semantic Types
Semantic types are used for frames, frame elements and lexical units. For a frame element, the semantic type gives the basic type of its fillers.
Example: [Cook Drew] BAKED [Food an apple pie] [Container in a pie tin]
  Cook: Sentient
  Container: Container
  Heating_instrument: Physical_entity
73 semantic types in all, 46 of them for frame elements; 29 semantic types were used for annotation.
Problems with semantic types:
  Too general (e.g. Physical_entity) or hard to make use of (e.g. Goal).
  Coverage is not very high: ~54% of FEs have semantic types.
Introduction. WordNet
WordNet is a large lexical database of English, developed at Princeton.
Synsets: sets of cognitive synonyms for nouns, verbs, adjectives and adverbs.
Example: plant
  (n) {plant, works, industrial plant}
  (n) {plant, flora, plant life}
  (v) {plant, set}
  (v) {plant, implant}
117 000 synsets.
Relations between synsets: hypernymy, hyponymy, meronymy, troponymy, antonymy.
Example: {bed} is a hyponym of {furniture, piece_of_furniture}.
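Hyponymy is transitive, so checking whether one synset is a hyponym of another amounts to walking up the hypernym links. The sketch below illustrates the idea on a toy, hand-written map. The links other than bed/furniture are simplified placeholders and not WordNet data, the function name `is_hyponym_of` is hypothetical, and each synset has only one parent here, whereas a real WordNet synset may have several hypernyms.

```python
# Toy hypernym links (child -> direct parent). Only "bed is a kind of
# furniture" comes from the slide; the other links are illustrative
# placeholders and do not reproduce the real WordNet hierarchy.
HYPERNYM = {
    "bed": "furniture",
    "furniture": "artifact",
    "artifact": "object",
}


def is_hyponym_of(synset, ancestor, hypernym=HYPERNYM):
    """Return True if `ancestor` is reachable from `synset` via hypernym links."""
    seen = set()  # guards against accidental cycles in the toy data
    current = hypernym.get(synset)
    while current is not None and current not in seen:
        if current == ancestor:
            return True
        seen.add(current)
        current = hypernym.get(current)
    return False


print(is_hyponym_of("bed", "furniture"))  # direct link
print(is_hyponym_of("bed", "object"))     # reached transitively
print(is_hyponym_of("furniture", "bed"))  # wrong direction
```

A synset is not treated as its own hyponym in this sketch, which is a design choice and not something the slides specify.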
Introduction. OWL/RDF
An ontology is a formal representation of knowledge as a set of concepts within a domain and the relationships among those concepts.
The Web Ontology Language (OWL) is a family of knowledge representation (KR) languages for authoring ontologies, with formal semantics and RDF/XML-based serializations for the Semantic Web.
OWL can represent: Classes, Properties (Object and Datatype), Instances, Operations (Union, Intersection, ...).
Introduction. OWL/RDF
The Resource Description Framework (RDF) is a family of specifications originally designed as a metadata data model. It is a general method for conceptual description or modeling of information.
Statements take the form of triples: Subject Predicate Object.
Example: "The sky has the color blue" is expressed in RDF as a triple with a subject denoting "the sky", a predicate denoting "has the color" and an object denoting "blue":
  ex:sky       rdf:type     owl:Thing
  ex:hascolor  rdf:type     rdf:Property
  ex:blue      rdf:type     ex:color
  ex:sky       ex:hascolor  ex:blue
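The triple model can be illustrated without any RDF library by storing statements as plain (subject, predicate, object) tuples and matching them against a pattern. This is only a minimal sketch of the data model: the names `TRIPLES` and `match` are hypothetical, and the work itself used Apache Jena for processing ontologies.

```python
# The four statements of the sky example as (subject, predicate, object) tuples.
TRIPLES = [
    ("ex:sky", "rdf:type", "owl:Thing"),
    ("ex:hascolor", "rdf:type", "rdf:Property"),
    ("ex:blue", "rdf:type", "ex:color"),
    ("ex:sky", "ex:hascolor", "ex:blue"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that agree with every given (non-None) position."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# "What color does the sky have?"
for _, _, color in match(TRIPLES, s="ex:sky", p="ex:hascolor"):
    print(color)  # prints ex:blue
```

Leaving a position as None acts as a wildcard, which mirrors how variables are used in triple patterns.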
Sense Repository
Sense Repository
"A Novel FrameNet-based Resource for the Semantic Web" by Volha Bryl, Sara Tonelli, Claudio Giuliano, Luciano Serafini.
Idea: create a repository of senses for frame elements, where senses are WordNet synsets.
Why? To add or enhance the semantic type information, in order to:
  improve the resource itself;
  improve the performance of frame annotation tools.
Sense Repository
3846 Frame-FE pairs, 24 569 lines in files.
Sense Repository
Goal: make the resource available to the Semantic Web.
An intermediate level between two resources:
  the FrameNet OWL representation, where an FE is a class;
  WordNet in RDF/OWL, where a synset is an individual.
Created with Protégé... actually, latex2owl...
Work
Resources and Tools
Resources:
  WordNet 3.0 RDF representation;
  FrameNet 1.5 XML representation;
  mapping from FrameNet FE semantic types to WordNet synsets;
  FrameNet Repository files.
Tools:
  JDK 1.6;
  IntelliJ IDEA 11.1;
  Apache Jena 2.7.3 for processing ontologies;
  latex2owl 1.6 for translating an ontology from a LaTeX-style format to OWL.
Structure of ontology
Namespace prefixes:
  base = http://dkm.fbk.eu/index.php/framenet_extension:_repository_of_senses#
  fn   = http://www.icsi.berkeley.edu/~jan/framenet.owl#
  wn30 = http://purl.org/vocabularies/princeton/wn30/
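The three prefixes above can be used to expand a prefixed name into a full IRI by simple concatenation. The sketch below assumes that the synset identifiers shown in the filter output (e.g. synset-food-noun-1) are used directly as local names under the wn30 namespace; the function name `expand` is hypothetical.

```python
# Namespace prefixes as listed on the slide.
PREFIXES = {
    "base": "http://dkm.fbk.eu/index.php/framenet_extension:_repository_of_senses#",
    "fn": "http://www.icsi.berkeley.edu/~jan/framenet.owl#",
    "wn30": "http://purl.org/vocabularies/princeton/wn30/",
}


def expand(qname, prefixes=PREFIXES):
    """Expand 'prefix:local' into a full IRI; raise ValueError if the prefix is unknown."""
    prefix, sep, local = qname.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError("unknown or missing prefix in %r" % qname)
    return prefixes[prefix] + local


# Assumption: the synset id is the local name under wn30.
print(expand("wn30:synset-food-noun-1"))
# prints http://purl.org/vocabularies/princeton/wn30/synset-food-noun-1
```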
Repository in Protégé ontology editor
Filtering the Repository
The filter selects Repository data by the specified criteria: an entity is kept if it corresponds to one of the given frames, frame elements, semantic types or synsets, or is a hyponym of one of the given synsets.
Usage:
  Filter.jar [[-f F(1)... F(k)] [-fe FE(1)... FE(l)] [-st ST(1)... ST(m)] [-wn WN(1)... WN(n)] [-hyp H(1)... H(p)]]
Example:
  java -jar Filter.jar -f Cooking_creation -wn synset-food-noun-1 -hyp synset-food-noun-1
Output:
  Cooking_creation  Produced_food  null              synset-nutriment-noun-1  53  16  0.3019
  Cooking_creation  Produced_food  null              synset-foodstuff-noun-2  53  11  0.2075
  Cooking_creation  Produced_food  null              synset-food-noun-1       53   4  0.0755
  Cooking_creation  Ingredients    null              synset-beverage-noun-1   10   2  0.2000
  Cooking_creation  Ingredients    null              synset-cream-noun-2      10   2  0.2000
  Cooking_creation  Ingredients    null              synset-egg-noun-2        10   1  0.1000
  Cooking_creation  Ingredients    null              synset-food-noun-1       10   1  0.1000
  Cooking_creation  Ingredients    null              synset-foodstuff-noun-2  10   1  0.1000
  Cooking_creation  Means          State_of_affairs  synset-salt-noun-2        1   1  1.0000
  Number of entries: 9
  Number of examples: 39
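The filter output can be post-processed with a few lines of code. The sketch below parses the nine rows shown above and recomputes the two summary lines. The column meaning (frame, frame element, semantic type, synset, total examples for the frame element, matching examples, rate) is inferred from the numbers and is an assumption, as are the names `Entry` and `parse`.

```python
from collections import namedtuple

# Column names are an interpretation of the filter output, not taken from the tool.
Entry = namedtuple("Entry", "frame fe semtype synset total matched rate")

SAMPLE = """\
Cooking_creation Produced_food null synset-nutriment-noun-1 53 16 0.3019
Cooking_creation Produced_food null synset-foodstuff-noun-2 53 11 0.2075
Cooking_creation Produced_food null synset-food-noun-1 53 4 0.0755
Cooking_creation Ingredients null synset-beverage-noun-1 10 2 0.2000
Cooking_creation Ingredients null synset-cream-noun-2 10 2 0.2000
Cooking_creation Ingredients null synset-egg-noun-2 10 1 0.1000
Cooking_creation Ingredients null synset-food-noun-1 10 1 0.1000
Cooking_creation Ingredients null synset-foodstuff-noun-2 10 1 0.1000
Cooking_creation Means State_of_affairs synset-salt-noun-2 1 1 1.0000
"""


def parse(text):
    """Parse whitespace-separated filter rows; lines without 7 fields are skipped."""
    entries = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 7:
            continue
        frame, fe, semtype, synset, total, matched, rate = parts
        entries.append(Entry(
            frame, fe,
            None if semtype == "null" else semtype,  # "null" marks a missing semtype
            synset, int(total), int(matched), float(rate),
        ))
    return entries


entries = parse(SAMPLE)
print("Number of entries:", len(entries))                         # 9
print("Number of examples:", sum(e.matched for e in entries))     # 39
```

Under this reading, the rate in the last column is the number of matching examples divided by the total for the frame element, e.g. 16 / 53 ≈ 0.3019.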
Repository Statistics
The statistics script computes statistics over the information in the Repository.
Usage:
  eu.fbk.dkm.filterrepository.statistics [arguments [-top] [-threshold]]
Arguments:
  no args: statistics for all semtypes connected to WordNet synsets. Output columns: semtype wnsynset totalnumex matchnumex rate
  1 arg <semtype>: statistics for the given semtype. Output columns: wnsynset numex rate
  2 args <frame frameelement>: statistics for the given frame and frame element. Output columns: wnsynset numex rate
Options:
  -top <N> or -top <N%>: write to output only the top N or top N% entities with the highest rate;
  -threshold <M> or -threshold <M%>: write to output only entities that have more than M examples or a rate above M%.
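The two output options can be read as a ranking step and a cut-off step. The following sketch shows one possible interpretation on (wnsynset, numex, rate) rows; it is not the script's actual implementation. The function names are hypothetical, and rounding the "top N%" count up to a whole number of rows is an assumption. The sample rows are the first six rows of the Cooking_creation Produced_food statistics.

```python
import math

# (wnsynset, numex, rate) rows for Cooking_creation / Produced_food.
ROWS = [
    ("synset-nutriment-noun-1", 16, 0.3019),
    ("synset-baked_goods-noun-1", 11, 0.2075),
    ("synset-foodstuff-noun-2", 11, 0.2075),
    ("synset-food-noun-1", 4, 0.0755),
    ("synset-organism-noun-1", 3, 0.0566),
    ("synset-fluid-noun-1", 2, 0.0377),
]


def apply_top(rows, top):
    """Keep the rows with the highest rate; `top` is 'N' or 'N%'."""
    ranked = sorted(rows, key=lambda r: r[2], reverse=True)
    if top.endswith("%"):
        # Assumption: a fractional row count is rounded up.
        keep = math.ceil(len(ranked) * float(top[:-1]) / 100)
    else:
        keep = int(top)
    return ranked[:keep]


def apply_threshold(rows, threshold):
    """Keep rows with more than M examples ('M') or a rate above M% ('M%')."""
    if threshold.endswith("%"):
        limit = float(threshold[:-1]) / 100
        return [r for r in rows if r[2] > limit]
    limit = int(threshold)
    return [r for r in rows if r[1] > limit]


print([r[0] for r in apply_top(ROWS, "50%")])
print([r[0] for r in apply_threshold(ROWS, "5%")])
```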
Repository Statistics
  java -classpath Filter.jar eu.fbk.dkm.filterrepository.statistics
Output:
  semtype          wnsynset                             totalnumex  matchnumex  rate
  Animate_being    synset-animal-noun-1                          0           0  0.0000
  Artifact         synset-artifact-noun-1                     1106         752  0.6799
  Body_of_water    synset-body_of_water-noun-1                   0           0  0.0000
  Content          synset-content-noun-5                      1855           0  0.0000
  Event            synset-event-noun-1                          56           6  0.1071
  Group            synset-group-noun-1                           0           0  0.0000
  Human            synset-person-noun-1                       2957        2579  0.8722
  Human_act        synset-act-noun-2                           246           7  0.0285
  Living_thing     synset-organism-noun-1                      110         108  0.9818
  Location         synset-location-noun-1                     2355         525  0.2229
  Material         synset-material-noun-1                        0           0  0.0000
  Message          synset-message-noun-2                      2286          22  0.0096
  Organization     synset-organization-noun-1                   23           6  0.2609
  Physical_entity  synset-entity-noun-1                        805         805  1.0000
  Physical_object  synset-object-noun-1                      10730        7725  0.7199
  Quantity         synset-measure-noun-2                        90          57  0.6333
  Region           synset-geological_formation-noun-1            0           0  0.0000
  Running-water    synset-watercourse-noun-1                     0           0  0.0000
  Shape            synset-shape-noun-2                           0           0  0.0000
  Social relation  synset-social_relation-noun-1                 0           0  0.0000
  State            synset-state-noun-2                         185           1  0.0054
  Structure        synset-structure-noun-1                       0           0  0.0000
Repository Statistics
  java -classpath Filter.jar eu.fbk.dkm.filterrepository.statistics Location -top 10%
Output (wnsynset numex rate):
  synset-whole-noun-2            406  0.1724
  synset-physical_entity-noun-1  234  0.0994
  synset-location-noun-1         227  0.0964
  synset-object-noun-1           168  0.0713
  synset-region-noun-3           135  0.0573
  synset-abstraction-noun-6       86  0.0365
  synset-living_thing-noun-1      73  0.0310
  synset-artifact-noun-1          66  0.0280
  synset-organism-noun-1          60  0.0255
  synset-instrumentality-noun-3   41  0.0174
  synset-structure-noun-1         34  0.0144
  synset-act-noun-2               32  0.0136
  synset-event-noun-1             32  0.0136
  synset-country-noun-2           31  0.0132
  synset-group-noun-1             28  0.0119
  synset-economy-noun-1           25  0.0106
  synset-organization-noun-1      25  0.0106
Repository Statistics
  java -classpath Filter.jar eu.fbk.dkm.filterrepository.statistics Cooking_creation Produced_food
Output (wnsynset numex rate):
  synset-nutriment-noun-1          16  0.3019
  synset-baked_goods-noun-1        11  0.2075
  synset-foodstuff-noun-2          11  0.2075
  synset-food-noun-1                4  0.0755
  synset-organism-noun-1            3  0.0566
  synset-fluid-noun-1               2  0.0377
  synset-aging-noun-2               1  0.0189
  synset-article-noun-2             1  0.0189
  synset-pasta-noun-2               1  0.0189
  synset-physical_phenomenon-noun-1 1  0.0189
  synset-plant_part-noun-1          1  0.0189
  synset-structure-noun-1           1  0.0189
Results
Results
Output:
  The Sense Repository OWL representation and the tools for filtering and computing statistics are freely available online;
  The Sense Repository can be used in other applications for semantic role labeling.
Issues solved:
  1. FrameNet 1.5 OWL vs. XML representation;
  2. WordNet 3.0 Synset to SynsetId mapping, hyponymy;
  3. Update in case of new versions of resources and Repository.
Work done:
  Wrote a script that converts the Sense Repository to OWL/RDF format.
  Implemented filtering and statistics computation.
  Wrote documentation.
  Made a presentation.
Literature and Links
V. Bryl, S. Tonelli, C. Giuliano, L. Serafini. A Novel FrameNet-based Resource for the Semantic Web. In Proceedings of SAC 2012, pages 360-365.
J. Scheffczyk, C. F. Baker and S. Narayanan. Ontology-based Reasoning about Lexical Resources. In Proceedings of OntoLex 2006 Workshop, 2006.
A. G. Nuzzolese, A. Gangemi and V. Presutti. Gathering lexical linked data and knowledge patterns from FrameNet. In Proceedings of K-CAP 2011, pages 41-48, 2011.
J. Ruppenhofer, M. Ellsworth, M. R. Petruck, C. R. Johnson and J. Scheffczyk. FrameNet II: Extended Theory and Practice. 2010.
M. Dean and G. Schreiber, eds. OWL Web Ontology Language Reference. W3C Recommendation, 10 Feb 2004.
M. K. Smith, C. Welty and D. L. McGuinness, eds. OWL Web Ontology Language Guide. W3C Recommendation, 10 Feb 2004.
F. Manola and E. Miller, eds. RDF Primer. W3C Recommendation, 10 Feb 2004.
M. van Assem, A. Gangemi and G. Schreiber. RDF/OWL Representation of WordNet. W3C Working Draft, 19 June 2006.
M. Horridge, H. Knublauch, A. Rector, R. Stevens and C. Wroe. A Practical Guide To Building OWL Ontologies Using The Protégé-OWL Plugin and CO-ODE Tools, Edition 1.0. The University of Manchester, 2004.
WordNet: http://wordnet.princeton.edu/
WordNet 3.0 in RDF: http://semanticweb.cs.vu.nl/lod/wn30/
FrameNet: http://framenet.icsi.berkeley.edu/
Thank you!