Linguistic Values on Attribute Subdomains in Vague Database Querying

Similar documents
THE ANNALS OF DUNAREA DE JOS UNIVERSITY OF GALATI FASCICLE III, 2005 ISSN X ELECTROTEHNICS, ELECTRONICS, AUTOMATIC CONTROL, INFORMATICS

IPMU July 2-7, 2006 Paris, France

A framework for fuzzy models of multiple-criteria evaluation

THE ANNALS OF DUNAREA DE JOS UNIVERSITY OF GALATI FASCICLE III, 2005 ISSN X ELECTROTEHNICS, ELECTRONICS, AUTOMATIC CONTROL, INFORMATICS

Lecture notes. Com Page 1

Granular Computing: A Paradigm in Information Processing Saroj K. Meher Center for Soft Computing Research Indian Statistical Institute, Kolkata

GEOG 5113 Special Topics in GIScience. Why is Classical set theory restricted? Contradiction & Excluded Middle. Fuzzy Set Theory in GIScience

Fuzzy Set-Theoretical Approach for Comparing Objects with Fuzzy Attributes

FUZZY SQL for Linguistic Queries Poonam Rathee Department of Computer Science Aim &Act, Banasthali Vidyapeeth Rajasthan India

INFORMATION RETRIEVAL SYSTEM USING FUZZY SET THEORY - THE BASIC CONCEPT

Chapter 4 Fuzzy Logic

A METHOD OF NEUTROSOPHIC LOGIC TO ANSWER QUERIES IN RELATIONAL DATABASE

CHAPTER 4 FREQUENCY STABILIZATION USING FUZZY LOGIC CONTROLLER

Intelligent flexible query answering Using Fuzzy Ontologies

Fuzzy Set Theory and Its Applications. Second, Revised Edition. H.-J. Zimmermann. Kluwer Academic Publishers Boston / Dordrecht/ London

Figure-12 Membership Grades of x o in the Sets A and B: μ A (x o ) =0.75 and μb(xo) =0.25

1. Fuzzy sets, fuzzy relational calculus, linguistic approximation

CHAPTER 5 FUZZY LOGIC CONTROL

SIMILARITY MEASURES FOR MULTI-VALUED ATTRIBUTES FOR DATABASE CLUSTERING

What is all the Fuzz about?

Introduction to Fuzzy Logic. IJCAI2018 Tutorial

NOTES TO CONSIDER BEFORE ATTEMPTING EX 1A TYPES OF DATA

What is all the Fuzz about?

Review of Fuzzy Logical Database Models

Fuzzy Sets and Systems. Lecture 1 (Introduction) Bu- Ali Sina University Computer Engineering Dep. Spring 2010

Multi Attribute Decision Making Approach for Solving Intuitionistic Fuzzy Soft Matrix

XI International PhD Workshop OWD 2009, October Fuzzy Sets as Metasets

FUZZY BOOLEAN ALGEBRAS AND LUKASIEWICZ LOGIC. Angel Garrido

Introducing fuzzy quantification in OWL 2 ontologies

Generalized Implicative Model of a Fuzzy Rule Base and its Properties

Accommodating Imprecision in Database Systems: Issues and Solutions

Fuzzy Logic Approach towards Complex Solutions: A Review

A Model of Machine Learning Based on User Preference of Attributes

TOPSIS Modification with Interval Type-2 Fuzzy Numbers

Model theoretic and fixpoint semantics for preference queries over imperfect data

Extending a Hybrid CBR-ANN Model by Modeling Predictive Attributes using Fuzzy Sets

Reducing Quantization Error and Contextual Bias Problems in Object-Oriented Methods by Applying Fuzzy-Logic Techniques

Fuzzy Analogy: A New Approach for Software Cost Estimation

FUZZY INFERENCE SYSTEMS

ARTIFICIAL INTELLIGENCE. Uncertainty: fuzzy systems

FUZZY SPECIFICATION IN SOFTWARE ENGINEERING

CHAPTER 4 FUZZY LOGIC, K-MEANS, FUZZY C-MEANS AND BAYESIAN METHODS

Introduction. Aleksandar Rakić Contents

FUZZY LOGIC TECHNIQUES. on random processes. In such situations, fuzzy logic exhibits immense potential for

Genetic Tuning for Improving Wang and Mendel s Fuzzy Database

Ranking Fuzzy Numbers Using Targets

Exponential Membership Functions in Fuzzy Goal Programming: A Computational Application to a Production Problem in the Textile Industry

Optimization with linguistic variables

Chapter 2 Fuzzy Queries

Chapter 3. Uncertainty and Vagueness. (c) 2008 Prof. Dr. Michael M. Richter, Universität Kaiserslautern

FUNDAMENTALS OF FUZZY SETS

HAPPY VALENTINE'S DAY 14 Feb 2011 JAIST

Application of fuzzy set theory in image analysis. Nataša Sladoje Centre for Image Analysis

Improving the Wang and Mendel s Fuzzy Rule Learning Method by Inducing Cooperation Among Rules 1

Chapter 2: FUZZY SETS

Management Science Letters

Why Fuzzy? Definitions Bit of History Component of a fuzzy system Fuzzy Applications Fuzzy Sets Fuzzy Boundaries Fuzzy Representation

A NEW MULTI-CRITERIA EVALUATION MODEL BASED ON THE COMBINATION OF NON-ADDITIVE FUZZY AHP, CHOQUET INTEGRAL AND SUGENO λ-measure

An Approach to Image Retrieval on Fuzzy Object-Relational Databases using Dominant Color Descriptors

3.7 Denotational Semantics

Learning from uncertain image data using granular fuzzy sets and bio-mimetic applicability functions

Fuzzy Integrity Constraints for Native XML Database

SMART TARGET SELECTION IMPLEMENTATION BASED ON FUZZY SETS AND LOGIC

Neural Networks Lesson 9 - Fuzzy Logic

Ontology based Model and Procedure Creation for Topic Analysis in Chinese Language

ISSN: Page 320

ECLT 5810 Clustering

Fuzzy if-then rules fuzzy database modeling

Resolving the Conflict Between Competitive and Cooperative Behavior in Michigan-Type Fuzzy Classifier Systems

Integration of Fuzzy Shannon s Entropy with fuzzy TOPSIS for industrial robotic system selection

INCONSISTENT DATABASES

Best proximation of fuzzy real numbers

Data Preprocessing. Why Data Preprocessing? MIT-652 Data Mining Applications. Chapter 3: Data Preprocessing. Multi-Dimensional Measure of Data Quality

CHAPTER 3 FUZZY RULE BASED MODEL FOR FAULT DIAGNOSIS

The Use of Fuzzy Logic at Support of Manager Decision Making

Why Fuzzy Fuzzy Logic and Sets Fuzzy Reasoning. DKS - Module 7. Why fuzzy thinking?

ROUGH MEMBERSHIP FUNCTIONS: A TOOL FOR REASONING WITH UNCERTAINTY

Assessment of Human Skills Using Trapezoidal Fuzzy Numbers

SOME TYPES AND USES OF DATA MODELS

Knowledge Engineering in Search Engines

CPS331 Lecture: Fuzzy Logic last revised October 11, Objectives: 1. To introduce fuzzy logic as a way of handling imprecise information

Membership Functions for a Fuzzy Relational Database: A Comparison of the Direct Rating and New Random Proportional Methods

Learning Directed Probabilistic Logical Models using Ordering-search

780 IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 12, NO. 6, DECEMBER José Galindo, Angélica Urrutia, Ramón A. Carrasco, and Mario Piattini

Inference in Hierarchical Multidimensional Space

Keywords. Ontology, mechanism theories, content theories, t-norm, t-conorm, multiinformation systems, Boolean aggregation.

A Decision-Theoretic Rough Set Model

The Travelling Salesman Problem. in Fuzzy Membership Functions 1. Abstract

OPTIMAL PATH PROBLEM WITH POSSIBILISTIC WEIGHTS. Jan CAHA 1, Jiří DVORSKÝ 2

ECLT 5810 Clustering

Ranking Fuzzy Numbers Based on Ambiguity Degrees

Part I: Structured Data

About Summarization in Large Fuzzy Databases

Solving a Decision Making Problem Using Weighted Fuzzy Soft Matrix

Learning Fuzzy Rules Using Ant Colony Optimization Algorithms 1

Equi-sized, Homogeneous Partitioning

LP-Modelling. dr.ir. C.A.J. Hurkens Technische Universiteit Eindhoven. January 30, 2008

Improving the Efficiency of Fast Using Semantic Similarity Algorithm

Design of Neuro Fuzzy Systems

Qualitative Comparative Analysis (QCA) and Fuzzy Sets. 24 June 2013: Code of Good Standards. Prof. Dr. Claudius Wagemann

Transcription:

Linguistic Values on Attribute Subdomains in Vague Database Querying CORNELIA TUDORIE Department of Computer Science and Engineering University "Dunărea de Jos" Domnească, 82 Galaţi ROMANIA Abstract: - Intelligent database interfaces must be able to interpret and evaluate imprecise criteria in queries, containing certain vague terms, currently used in natural language speaking. The simplest vague selection criterion is expressed by a linguistic value, defined as fuzzy set on a database attribute domain. Generally, when a vague query is processed, the definitions of such possible vague terms must already exist in a knowledge base. There are also cases when linguistic variables must be dynamically defined, in accordance with the intermediate results of a query evaluation. A particular type of query is discussed and an evaluating procedure is proposed. Key-Words: - Artificial Intelligence, Database, Flexible Query, Vague Criteria, Fuzzy Logic, Linguistic Variable Introduction The access to databases is possible in the following two ways: operating with application programs, when a limited set of predetermined functions are available and operating directly on data, using relational command languages. The second one is unavoidable when an occasional operation, in particular terms, is performed. The most usual situation is database querying about various selection criteria. Two major limitations occur in such a database querying access: the rigid formal language syntax and the difficulty to realize and express precise criteria to locate the information. This happens because humans do not always think and speak in precise terms. So, it is very useful to provide intelligent interfaces to databases, able to understand natural language queries and more important, able to interpret and evaluate imprecise criteria in queries. Including vague criteria in a database query may have two advantages: the flexibility of the query expression the possibility to refine the results, assigning to each tuple the corresponding degree of criteria satisfaction. We particularly focus on the possible vagueness of the selection criterion, which involve certain vague terms, currently used in natural language speaking. So, all the discussion in that direction is inspired from the usual necessities of our final database users, expressed in many various linguistic forms. The fuzzy set theory is already established as the adequate framework to model and to manage vague expressions, or in other words, to evaluate vague queries sent to relational database. The selection vague criteria may be very simple, but it may also be very complex. We consider mainly the linguistic complexity, not the logical one. The linguistic complexity of the criterion is coming from various categories of vague terms with different semantic effects on the selection criterion, hence the logical complexity. A review of several categories of linguistically terms with vague meaning, their fuzzy model and specific operations are presented in [4], [5], and many others..66.33 2 3 4 5 6 7 8 9 Fig..The linguistic domain for the attribute

~ 23 2 23 25 Fig. 2. The definition of a fuzzy predicate by a fuzzy number STUDENT Name... Mark Age... John 7 2 Mary 8 22 George 25 Helen 9 23 Robby 7 24 Michael 8 22 Dan 4 25 Paul 9 2 Justin 6 2 Jerry 5 2 Mandy 7 26 Table. A relational table years For example, if the table and the definitions in figures and 2 are considered, the response to the query Retrieve the students around 23 years having or will be Name Mark Age ~ 23 Helen 9 23 Mary 8 22.66.5.5 Robby 7 24.33.5.5 Michael 8 22.66.5.5 Table 2. Ranked result of a vague query In order to evaluate this query, each vague term is considered according to its fuzzy model (or its definition). Generally, before evaluating a query, a knowledge base containing such term definitions must already exist. In the following, a particularly case is described: when a dynamic definition of vague terms is needed. 2 Crisp and Linguistic Domain The linguistic label is a word (usually coming from natural language) that designates a fuzzy entity. A linguistic label may be assigned to a fuzzy set or fuzzy quantity, suggesting a vague term from usual language, typical to the application area where our model is working. Although it is relying on the same representation style, the linguistic label may have various meaning, depending on the problem nature. For example, it may indicate: A gradual property ( for a student), when the membership degree of the fuzzy set is a function defined on an attribute domain (the [,] interval for the attribute), generally a monotonous function; the value of the membership function expresses the intensity of the property: between the degree (that is a not ), and the degree (that is an absolutely ). A category of objects, eventually having a certain property ( intelligent student ), when the membership function is not always monotonous and its value expresses the closeness degree of the current object to the object considered as the representative one for that category. The linguistic label stands for the semantic model of the fuzzy entity to which it is assigned. Therefore, in order to build the knowledge model for a given application domain using the fuzzy sets formalism, both aspects must be taken into account: the linguistic representation and the numerical representation of the knowledge pieces. If some fuzzy sets are defined on the same referential domain, with different membership functions, then a fuzzy set family is formed. If a linguistic label is assigned to each fuzzy set, then the set of these labels may be the definition set of a linguistic variable, and the labels are named linguistic values. The linguistic variables is the quadruple: (V, E(V), U, M) where V is the name of the linguistic variable E(V) is a set of linguistic values for the linguistic variable V U is the crisp referential domain of the linguistic variable V M is a mapping E(V) F (U) which maps a fuzzy set on U for each linguistic values of V. The choice of numerical representation of a linguistic term is seldom obvious. However, at the qualitative level, the term is well understood and well semantically placed with respect to other linguistic expressions. Thus, an order relation p on E(V) is easy to define; for example: little p intermediate p big Obviously, p is a semantic order relation. In most cases, the set of the linguistic values for a linguistic variable

E(V) = { e i }, i =,..., n e is an ordered set ( e i p e j for i j ) having an odd cardinality. The definition of the intermediate value is generally situated in the centre of the referential domain, while the others are placed symmetrically on the two sides (like in figure 3). Some possible operators manipulating linguistic values may be ([3]):. NEG( e i ) = e j, j = n e - i 2. MAX(e i, e j ) = e i, if e j p e i 3. MIN(e i, e j ) = e i, if e i p e j This intuitive order relation p between linguistic values is the correspondent at the semantic level of a pre-order relation, established between fuzzy sets defining the linguistic values (this relation is defined by Ulrich Bodenhofer in []) little intermediat e Fig. 3. Definitions of a set of linguistic values In figure 3, the membership functions for three fuzzy sets corresponding to linguistic values little, intermediate, big are represented on the same axis. It is easy to show that the relation at the linguistic level: little p intermediate p big is transferred at the numerical level: M(little) M(intermediate) M(big) The property of a set of linguistic values to preserve the order of the linguistic values at the fuzzy sets definitions too is named interpretability and is defined in [2]. Let be V a linguistic variable defined on the domain D of the table attribute A. The linguistic values of the V variable form the linguistic domain of the A attribute. So, in a vague query context, the crisp domain (the domain attribute, according to the relational model theory) and the linguistic domain must be defined for each table attribute. For example, the crisp domain D=[,] and the linguistic domain L={, almost,,almost, } may be associated to the attribute of the STUDENT table (table ). big β α Fig. 4. A set of linguistic values on a referential domain In most applications, defining the linguistic values set covers almost uniformly the referential domain. There are usually 3, or 5 or another odd number of linguistic values. Starting from this general idea, we have already proposed a software interface ( FuzzyKAA system, [6]), able to assist the user for linguistic values defining in a database context. The system proposes a uniform partitioning of the attribute domain, by trapezoidal membership functions, like in figure 4. The correspondence between β and α parameters is implicit, but it can be changed any time. The definitions implicitly obtained, can be adjusted either by changing numerical coordinates of graphical points, or by directly manipulating of them. This semi-automate method to create the knowledge base containing vague terms definition for vague query evaluation has a great advantage: details regarding effective attribute domain limits, or distributions of the values, can be easy obtained thanks to directly connecting to the database. 3 Linguistic values defined on a subdomain The humans use an enormous number of expression variants in their common language. This fact determined us to study the possibility to find the most accurate model for the query sent to the database, thus the computational treatment and the response to be as adequate as possible. The study has found a new class of problems, which require a partitioning of a limited subset of an attribute domain, but not the whole domain, in order to model the linguistic labels. The query contains a fuzzy selection criterion addressed to one group of database objects, which requires a domain adjustment, in accordance to the group members; at its turn the group is already obtained by another fuzzy selection applied to the whole database (or table).

Let s consider the query addressed to the STUDENT table (table ), as follows: Retrieve the students within the young ones Fig. 5. The young linguistic label definition The query evaluation procedure respects the following steps:. The selection criterion age young is evaluated, taking into account the definition in the figure 5; a tuple group as intermediate result is obtained, where the condition young > is satisfied (table 3). In other words, the students not at all young are eliminated. Table 3. 2. The interval containing the values for the selected students forms the subdomain [5,9]; it is considered later, instead [,]. 3. The linguistic value set {, almost,, almost, } will partition this subdomain (figure 6)..5 Name Mark Age George 25 Helen 9 23 Paul 9 2 Mary 8 22.66 Michael 8 22.66 almost young 2 23 25 almost years 5 6 7 8 9 Fig. 6. Linguistic values defined on a subdomain 4. The global criterion satisfaction degree will result for each tuple, taking into account the satisfaction degree for the criterion (fig 6) and the degree of the young age for the student (table 3). The table 4 contains the computed global degrees. Name Mark Age young Paul 9 2 Justin 6 2 John 7 2 Jerry 5 2 Mary 8 22.75.5.5 Michael 8 22.75.5.5 Helen 9 23.5.5 Robby 7 24.25 Table 4. We propose a discussion, in order to validate the response provided by the proposed procedure, if it respects the query meaning. Other expression variants, more or less similar to the previous example, will be analysed, and the differences, at both numerical and semantically levels, will be pointed out. i. The simple criterion query Retrieve the students can be classically processed by the fyzzy logic, taking into account the linguistic values defined on the whole attribute domain (figure 7)..66.33 almost almost Fig. 7. Linguistic values defined on the domain The response contains any age students, respecting the criterion (table 5). George, for example, not at all young, has got the best. Name Mark Age young Paul 9 2 Justin 6 2 John 7 2 Jerry 5 2 Mary 8 22.75 Michael 8 22.75 Helen 9 23.5 Robby 7 24.25 Table 5. 2 3 4 5 6 7 8 9 ii. A query containing a conjunction by two simple criteria, for example: Retrieve the AND young students will return the students both and young, ranked by the global satisfaction degree (table 6).

It is obvious that table 4 contains a more restrictive response (lower satisfaction degrees), because the criterion is applied on a limited students group, not on the original table. One can re that the selected tuples respect the same order in the all three discussed queries. Name Mark Age young Paul 9 2 Justin 6 2 John 7 2 Jerry 5 2 Mary 8 22.75.66.66 Michael 8 22.75.66.66 Helen 9 23.5.5 Robby 7 24.25 George 25 Dan 4 25 Mandy 7 26 Table 6. iii. A quite different query expression may be: Retrieve the best students within the young ones In the precise query context, the best criterion applied on a tuples group is equivalent to the aggregation MAX function and returns the (only one) student having the greatest. But in the imprecise context, the young students group is a fuzzy set too (a membership degree is attached to each student). Here the best can correspond to any student, but not necessary to the youngest one. So, a ranked list of the students satisfying the criterion, but also taking into account their young age degree, is the most adequate response to the above query. In other words, we consider this query can be assimilated with the initially discussed one (Retrieve the students within the young ones), so it can be submitted to the same evaluation procedure. Moreover, it may be even more suggestive for the database user, and semantically adequate to the response in table 4. In conclusion, the solution given by the proposed query evaluation procedure consists in a relative selection of the database tuples, where the selection criterion is not an absolute one, but it is relative to a subclass of already selected tuples. Similar procedures can be used to evaluate more complex queries, including dynamical aggregation for example, such as: How many best students are within the young ones? 4 Conclusion A particular type of vague query sent to a database is discussed in this paper and a procedure for evaluate it is proposed. The main idea is to dynamically define sets of linguistic labels on limited attribute domains, determined by previous fuzzy selections. After a comparative analysis including other similar query types, well known as fuzzy model, we conclude that: the evaluation procedure, proposed by this paper, provides an accurate model for the discussed vague expression, with respect to query semantic. References: [] U. Bodenhofer, A general framework for ordering fuzzy alternatives with respect to fuzzy orderings, 8th Int. Conf. on Information Processing and Management of Uncertainty (IPMU 2), Madrid, 2, pp. 7-77 [2] U. Bodenhofer and P. Bauer, Towards an Axiomatic Treatment of Interpretability, 6th International Conference on Soft Computing, Iizuka, 2, pp. 334-339 [3] E. Herrera-Viedma, An Information Retrieval System with Ordinal Linguistic Weighted Queries Based on Two Weighting Semantics, 8th International Conference on Information Processing and Management of Uncertainty in Knowledge-Bases Systems(IPMU 2), Madrid, 2 [4] C. Tudorie, Cercetări privind aplicarea tehnicilor de inteligenţă artificială pentru interogarea bazelor de date, Scientific Report, University Dunărea de Jos, Galaţi, 23 [5] C. Tudorie, Vague criteria in relational database queries, In Bulletin of Dunarea de Jos University of Galaţi, III/23, pp. 43-48, [6] C. Tudorie, Contribuţii la realizarea unei interfeţe inteligente pentru interogarea bazelor de date, Scientific Report, University Dunărea de Jos, Galaţi, 23