Data Mining and Machine Oriented Modeling: A Granular Computing Approach

T.Y. Lin 1,2

1 Department of Mathematics and Computer Science, San Jose State University, San Jose, California, tylin@cs.sjsu.edu
2 Berkeley Initiative in Soft Computing, Department of Electrical Engineering and Computer Science, University of California, Berkeley, California, tylin@cs.berkely.edu

Abstract. From the processing point of view, data mining is the machine derivation of interesting properties (interesting to humans) from stored data. Hence, the notion of machine oriented data modeling is explored. An attribute value, in a relational model, is a meaningful label (a property) of a set of entities (a granule). A model that uses these granules themselves (their bit patterns or lists of members) as attribute values is called a machine oriented data model. The model provides a good environment for database compaction and data mining. For moderate size databases, finding association rules, decision rules, etc., can be reduced to easy computation of set theoretical operations on granules. In the second part, these notions are extended to real world objects, where the universe is granulated (clustered) into granules by binary relations. Data modeling and mining with such additional semantics are formulated and investigated. In such models, data mining is essentially a machine "calculus" of granules, that is, granular computing.

1 Introduction

What is data mining? We will explore it from the processing point of view. Roughly, data mining is the reverse of database processing. Database processing is mainly concerned with organizing and storing massive data according to their known semantics, for example various normal forms. Data mining, on the other hand, is mainly concerned with discovering and extracting previously unknown semantics of stored data. Discovering and extracting are machine derivations of interesting properties, called patterns, from the mathematical structure of stored data.
What would be the proper primitives for machine processing? In database theory, attribute values are used as primitives to describe entities. We term such a set of descriptions a knowledge representation. Attribute values are meaningful primitives (properties of entities) to humans. To a machine, however, they are merely bits and bytes; human intuition provides no special aid to the processing. In fact, attribute values are often cumbersome to process, because they are semantically interrelated. Ideally, all primitives should be independent of each other. So we take the entities as primitives, just the opposite of database processing. An attribute value is regarded as a name or label of a set of entities (a granule). This leads us to consider using these granules themselves, or more precisely, their bit patterns or lists of members, as labels. Such labels are termed canonical names or canonical labels; they are attribute values encoded with machine semantics. A relational model using canonical labels as its attribute values is called a machine oriented data model. The model provides a compact representation of a database. It reduces some classical data mining methods, such as finding association rules, decision rules, etc., to simple set theoretical operations.

This paper is divided into two parts. The first part is a machine oriented relational theory. It is a theory of equivalence relations, an extended rough set theory. Data mining in this model is machine processing of elementary sets (equivalence classes). In the second part, the modeling is extended to real world objects. The universe, consisting of interrelated objects, is more than a set; it is granulated (clustered) by some binary relations. The granules are called elementary neighborhoods [9], an extension of elementary sets. A "relational" theory with such additional semantics is not new; it has been formulated and examined for approximate retrieval, e.g., [16, 3, 5, 7, 19]. Data mining in this extended theory is machine processing of elementary neighborhoods. The computational theory that handles such granulated spaces is called granular computing, a new field inspired by Zadeh [25] and labeled by this author [20]. From the computational point of view, data mining is one form of granular computing.
Part I Machine Oriented Relational Theory

In this part, we re-develop relational database theory from the data mining point of view. It is an extensional database theory ([6], p. 90). As in classical relational theory, the universe of entities and the attribute domains are all classical sets. Roughly, the data are discrete, not clustered.

2 Single Column Representations and Partitions

In this section, we give a detailed illustration of the simplest relational model. Let $V$ be the universe, which is a set of entities. Let $C$, called the elementary concept space, be a set of elementary concepts (attribute values).
2.1 Single Column Representations

A map from the universe $V$ to the collection $C$ of elementary concepts, $A : V \to C$, is called a single column (knowledge) representation. We will be interested only in the attribute values currently in use. In other words, we assume $A$ is an onto map. Such a $C$ is called an active domain by database theorists ([21], p. 11); we may also denote $C$ by $ADom(A)$. Intuitively, each element in $C$ is a label of a certain property of existing entities; it represents an elementary concept.

Partitions and Quotient Sets. Let $c$ be an element in $C$. The inverse image of $c$ under $A$, in symbols $A^{-1}(c)$, is the set of all those entities whose image is $c$; i.e.,

$A^{-1}(c) = \{u \mid A(u) = c\}$

It is clear that these inverse images $A^{-1}(c)$, $\forall c \in C$, form a partition $P_A$ on $V$. Note that a partition induces an equivalence relation and vice versa. By abuse of notation, we will use $P_A$ to denote the equivalence relation too. Each $A^{-1}(c)$ is an equivalence class. The collection of all equivalence classes constitutes a set, called the quotient set and denoted by $V/A$.

Canonical Representations. An equivalence class plays two roles, one as an element of the quotient set $V/A$, another as a subset of the universe $V$. We can regard the element as the canonical name or label of the subset. In other words, the quotient set is a set that consists of canonical names. We will use CNAME( ) to denote the canonical name. So the map

$u \mapsto [u] \mapsto CNAME([u])$

is called the single column canonical representation, where, as usual, $[u]$ denotes the equivalence class containing $u$. This representation will be denoted by CNAME too, that is, $CNAME(u) = CNAME([u])$. The graph $(u, CNAME(u))$ is called the single column canonical information table. Note that as far as computer systems are concerned, both canonical names and meaningful names are just bits and bytes.

Examples. We will illustrate the idea by examples.
Let the universe $V = \{ID_1, ID_2, \ldots, ID_9\}$ be a set of 9 restaurant owners, and let the attribute values be the locations of their restaurants. For comparison, we combine two single column representations into one "table"; see Table 1 and Table 2. For humans, the meaningful names tell us that some restaurants are located in West Wood, West LA, and Brent Wood, while the canonical names add no information. For machines, however, both are bits and bytes; either choice gives rise to the same mathematical structure of the stored data. In fact, canonical names reveal the machine semantics and may speed up machine processing. In Table 1, ordinary subset notation is used, while in Table 2 we use bit representations; a bit is on if and only if the corresponding object belongs to the subset.
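As an illustration of the two notations, the following minimal sketch computes the partition induced by a single column representation and prints each equivalence class in list form and in bit form. The assignment of owners to locations is hypothetical (it is not taken from Table 1), and the helper names `partition` and `bit_pattern` are introduced here only for illustration.

```python
from collections import defaultdict

# Hypothetical assignment of the nine owners to locations (not the paper's Table 1).
LOCATION = {
    1: "West Wood", 2: "West Wood", 3: "West Wood",
    4: "West LA", 5: "West LA",
    6: "Brent Wood", 7: "Brent Wood", 8: "Brent Wood", 9: "Brent Wood",
}

def partition(attribute):
    """Map each attribute value c to its inverse image A^{-1}(c)."""
    classes = defaultdict(set)
    for entity, value in attribute.items():
        classes[value].add(entity)
    return {value: frozenset(members) for value, members in classes.items()}

def bit_pattern(members, universe):
    """Bit i is on if and only if the i-th entity of the universe is in the subset."""
    return "".join("1" if entity in members else "0" for entity in universe)

universe = sorted(LOCATION)
classes = partition(LOCATION)
for name, members in classes.items():
    # meaningful name, list representation, bit representation
    print(name, sorted(members), bit_pattern(members, universe))
```

Either column (the sorted list or the bit string) can serve as the canonical name; which one is cheaper depends on how large the equivalence classes are relative to the universe.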
PLEASE put Table 1 here.

PLEASE put Table 2 here.

2.2 Single Column Machine Oriented Relational Models

Let us first summarize the previous discussion in a theorem.

Theorem.
1. There is a one-to-one correspondence between single column canonical representations and partitions.
2. Each attribute value is defined by one and only one equivalence class.
3. A logical formula of attribute values is defined by, and only by, a set theoretical relationship among equivalence classes.

We refer to Items 2 and 3 of the Theorem as the machine semantics of attribute values and of logical formulas, respectively. One should note that this theorem remains valid when we consider a collection of partitions.

Next, let us consider the following factorization of the representation $A$:

$V \to V/A \to C$

Recall that the quotient set is a set of canonical names, so the first map, which maps each entity to the canonical name of its equivalence class, is the canonical representation. The second map is called the naming map; it sends each canonical name to a meaningful name.

First, we note that $A$ is a single column relational data model. We factor it into a pair of maps. The first map represents the universe of entities by labels encoded with machine semantics; data mining will focus on such encoded labels. The second map translates the encoded labels into human understandable terms; its primary use is to output the discovered patterns. We will call the pair, or the triple $(V, V/A, C)$, a machine oriented relational model. In table representations, the first map is implicit in the encoded labels, so we only need to display two columns; see the examples below. Perhaps we should note that the triple has been called a granular structure in our earlier papers (Section 8.1).

Examples. In Table 1 or Table 2, there are six elements in the column of canonical names. However, there are only three distinct ones in either table. So we have the condensed forms of machine oriented models:

PLEASE put Table 3 here.
The next two representations condense all the information in Tables 1 and 2 into a very compact form: the named bit representation and the named list representation of a partition.

PLEASE put Table 5 here.

PLEASE put Table 4 here.
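The factorization $V \to V/A \to C$ of Section 2.2 can be sketched as a pair of dictionaries. This is a minimal sketch under stated assumptions: the location data is hypothetical, `factor` is a name introduced here, and a frozenset of members stands in for the canonical name (the list representation).

```python
# Hypothetical single column representation A: owner -> location.
LOCATION = {
    1: "West Wood", 2: "West Wood", 3: "West Wood",
    4: "West LA", 5: "West LA",
    6: "Brent Wood", 7: "Brent Wood", 8: "Brent Wood", 9: "Brent Wood",
}

def factor(attribute):
    """Split A into the canonical map V -> V/A and the naming map V/A -> C."""
    classes = {}
    for entity, value in attribute.items():
        classes.setdefault(value, set()).add(entity)
    # First map: each entity goes to the canonical name of its equivalence class.
    canonical = {entity: frozenset(classes[value]) for entity, value in attribute.items()}
    # Second map: each canonical name goes to its meaningful name.
    naming = {frozenset(members): value for value, members in classes.items()}
    return canonical, naming

canonical, naming = factor(LOCATION)

# Composing the two maps recovers the original representation A.
recovered = {entity: naming[canonical[entity]] for entity in LOCATION}
print(recovered == LOCATION)
```

The `naming` dictionary alone is the condensed machine oriented model: one row per distinct canonical name, paired with its meaningful name.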
3 Multiple Column Representations and Information Tables

It is clear that the results of the previous section can easily be generalized to multiple column knowledge representations. The graph of such a representation is called an information table; see the Appendix. We shall illustrate the notion by examples.

3.1 Examples

Let $V$ be a set of 9 restaurant owners. Its elementary concept spaces are denoted by $ADom_{attribute}$; see Section 2.1. It has three single column representations:

1. TYPE: $V \to ADom_{TYPE}$
2. LOCATION: $V \to ADom_{LOCATION}$
3. PRICE: $V \to ADom_{PRICE}$

Information Table: These three representations form a multiple column knowledge representation. Its graph is the following information table, Table 6.

Put Table 6 here

Bit Table: Each single column representation induces a partition. If we label each equivalence class by its bit pattern, we have the canonical information table, Table 7; we skip the list representations.

Put Table 7 here

4 Machine Oriented Relational Models

These three attributes induce three partitions; we will treat them as one multiple partition. Pawlak called it a knowledge base [22]. Roughly, a machine oriented model is a multiple partition in which each equivalence class is given a meaningful name. Tables 6 and 7 can be condensed to the following named multiple partition:

Put Table 8

It is obvious what a list model looks like; we skip the details.
5 Data Mining on Relational Databases

We will formulate some classical data mining notions in our models. Let $c$, $d$ be attribute values of a relational database. Let $P$, $Q$ be the equivalence classes corresponding to $c$ and $d$. In other words, $c = NAME(P)$ and $d = NAME(Q)$. Let $Card(\cdot)$ be the cardinal number of a set.

1. Association rule: A pair $(c, d)$ is an association rule, if $Card(P \cap Q) \geq$ threshold [1].
2. Decision rule: A formula $c \to d$ is a decision rule, if $P \subseteq Q$ [22].
3. Robust decision rule: A formula $c \to d$ is a robust decision rule, if $P \subseteq Q$ and $Card(P) \geq$ threshold [12].
4. Soft decision rule (strong rule): A formula $c \to d$ is a soft decision rule (strong rule), if $Card(P \setminus (P \cap Q)) \leq$ threshold [27].

5.1 Discovering Decision Rules

Let us examine how the following decision rule can be discovered from Table 6. The rule can be expressed in several formats:

1. "if-then" format: "If cuisine TYPE is American, then the PRICE is inexpensive,"
2. logic formula: "American $\to$ inexpensive," or
3. set theoretical formula: "American $\subseteq$ inexpensive."

In the classical model, we scan through the TYPE and PRICE columns of Table 6 to check whether the attribute value "American" is consistently associated only with "inexpensive". In the machine oriented approach, we only need to verify the inclusion of two equivalence classes. It can be readily verified by bit operations, if the database is of moderate size. For example, if the database has one million rows, it requires $2^{20}/32$ word operations (32 bits per word), namely 32K word operations, which is considerably less than one database access. In this particular example, we only need two assembly instructions, "and" and "compare":

"American" $\cap$ "inexpensive" = TYPE( ) $\cap$ PRICE( ) = TYPE( ) = "American"

Note that the attribute names TYPE and PRICE merely refer to the partitions, so they have no effect on the bit patterns. From this simple example, one can see that the machine oriented model seems to be a more desirable approach; the details will appear in future papers.
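The four rule types and the "and then compare" inclusion test can be sketched with Python integers used as bit sets. This is only an illustrative sketch: the bit patterns below are hypothetical (the patterns of Table 7 are not reproduced here), the function names are introduced for this example, and the direction of each threshold comparison follows the usual reading of these rules (support at least the threshold, exceptions at most the threshold).

```python
def bits_of(members):
    """Encode a set of owner IDs (1-based) as an integer bit pattern."""
    mask = 0
    for i in members:
        mask |= 1 << (i - 1)
    return mask

def card(bits):
    """Cardinal number of the subset encoded by a bit pattern."""
    return bin(bits).count("1")

def is_association_rule(p, q, threshold):
    return card(p & q) >= threshold

def is_decision_rule(p, q):
    # P is a subset of Q exactly when (P and Q) compares equal to P.
    return p & q == p

def is_robust_decision_rule(p, q, threshold):
    return is_decision_rule(p, q) and card(p) >= threshold

def is_soft_decision_rule(p, q, threshold):
    # P minus (P intersect Q) holds the exceptions to the rule; assumed to be at most threshold.
    return card(p & ~q) <= threshold

# Hypothetical equivalence classes, not taken from the paper's tables.
american = bits_of({1, 2})
inexpensive = bits_of({1, 2, 3, 4})

print(is_decision_rule(american, inexpensive))   # American -> inexpensive
print(is_decision_rule(inexpensive, american))   # the converse rule

# Word operations needed for one bit pattern over 2**20 rows at 32 bits per word.
words = 2**20 // 32
print(words)
```

With fixed-width machine words instead of Python's arbitrary-precision integers, the same test becomes one "and" and one "compare" per word.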
5.2 Discovering Association Rules

There is a large literature on association rules; we defer the comparison of our approach with various well known algorithms, e.g., [1, 2], to future papers [8]. In this section, we illustrate by simple examples that, for moderate size databases, we have a viable approach. Based on the model in Section 4, to find whether "French" and "expensive" form an association rule (with support 3), we only need to compute their intersection. Let $Card(\cdot)$ denote the cardinal number of a set.

Card("French" $\cap$ "expensive") = Card(TYPE( ) $\cap$ PRICE( )) = Card(( ) && ( )) = Card(( )) = 4 $\geq$ 3.

In our modeling, finding association rules is reduced to computing intersections. When the database is small, bit patterns are useful representations. When the database is larger, however, we may need a good balance between list and bit representations. Roughly, we need a clever way to take bit-intersections (using bit patterns to compute the intersections).

First, let us translate some terminology. A 1-itemset is a label of an equivalence class; by abuse of language, we shall refer to it as an equivalence class. A 2-itemset is the intersection of two equivalence classes; we will abbreviate it as a 2-intersection. In general, a k-itemset is a k-intersection. A k-itemset is large if its k-intersection is large.

To find all the association rules, we first find all the large 1-itemsets by pure counting. Next, we find all large 2-itemsets by computing the 2-intersections of large 1-itemsets. In general, we find the large k-itemsets by computing the k-intersections of large (k-1)-itemsets. However, we do not want to compute a k-intersection unless all its (k-1)-sub-intersections are large. The idea is similar to Apriori, but slightly different. As soon as the cardinal numbers of the k-intersections get smaller, we may shift from the bit representation to the list representation of a k-intersection; see forthcoming papers.
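The level-wise search described above can be sketched as follows. This is an Apriori-style illustration under stated assumptions, not the algorithm of [8]: the bit patterns are hypothetical (chosen so that "French" and "expensive" share four owners, as in the text), `large_intersections` is a name introduced here, and only the bit representation is used (no switch to lists).

```python
from itertools import combinations

def bits_of(members):
    """Encode a set of owner IDs (1-based) as an integer bit pattern."""
    mask = 0
    for i in members:
        mask |= 1 << (i - 1)
    return mask

def card(bits):
    return bin(bits).count("1")

def large_intersections(granules, threshold):
    """Return every k-itemset whose k-intersection has at least `threshold` members."""
    # Large 1-itemsets are found by pure counting.
    large = {frozenset([name]): bits
             for name, bits in granules.items() if card(bits) >= threshold}
    result = dict(large)
    k = 2
    while large:
        names = sorted({name for itemset in large for name in itemset})
        candidates = {}
        for combo in combinations(names, k):
            itemset = frozenset(combo)
            subsets = [itemset - {name} for name in itemset]
            # Compute the k-intersection only if all (k-1)-sub-intersections are large.
            if all(subset in large for subset in subsets):
                (extra,) = itemset - subsets[0]
                bits = large[subsets[0]] & granules[extra]
                if card(bits) >= threshold:
                    candidates[itemset] = bits
        result.update(candidates)
        large = candidates
        k += 1
    return result

# Hypothetical named granules for TYPE and PRICE.
granules = {
    "American": bits_of({1, 2}),
    "Chinese": bits_of({3, 9}),
    "French": bits_of({4, 5, 6, 7, 8}),
    "inexpensive": bits_of({1, 2, 3, 4}),
    "expensive": bits_of({5, 6, 7, 8, 9}),
}

found = large_intersections(granules, threshold=3)
for itemset, bits in found.items():
    print(sorted(itemset), card(bits))
```

Values of the same attribute are disjoint, so their intersections are empty and are pruned by the threshold without any special handling.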
Part II Machine Oriented Models for Real World Data

In database processing, though the relational model is very effective, its mathematical structure does not adequately reflect the semantics of real world data. To capture some of these semantics in data mining, the relational model needs to be extended. In relational theory, the universe of entities is assumed to be a classical set. In other words, there is no interaction among entities; we do know, however, that there are interactions among real world objects. There are similarities among events, distances in space, hierarchies in company positions, etc. What should be the proper extra structures? In formal logic, Tarski imposes a relational structure on each world model. In fuzzy theory, Zadeh implicitly imposes a granular structure; see Section 6. Since both structures are "generated" by crisp or fuzzy binary relations, in this part we will assume that the universe is not a classical set, as relational theory postulates, but a set with granulation (clustering) imposed by crisp binary relations. Data modeling and mining for such a universe is the main focal point of this part.

6 Granulation and Binary Relations

Let us quote the following from [10]: "According to Lotfi Zadeh [25], 'information granulation involves partitioning a class of objects (points) into granules, with a granule being a clump of objects (points) which are drawn together by indistinguishability, similarity or functionality.'" By observing some technical points, we translate it into a formal definition, called a granular structure [9]. The structure consists of constraints imposed by some form of crisp or fuzzy binary relations. In this part, we will focus on a subset of it.

6.1 Binary Relations

In [9], we formulate the theory in two universes; in this paper, however, we will be interested only in a single universe $V$, the object space. Let $B \subseteq V \times V$ be a binary relation on $V$. With each object $p \in V$, we associate a subset $B_p = \{u \mid p\,B\,u\}$, called the elementary $B$-neighborhood of $p$, or simply an elementary neighborhood. The collection $\{B_p \mid p \in V\}$ is called a binary $B$-neighborhood system (BNS). The association defines a map $B : V \to 2^V;\ p \mapsto B_p$, called a binary $B$-granulation or simply a granulation; it is clear that the map $B$ and the set $\{B_p\}$ determine each other. Conversely, suppose a binary neighborhood system $\{B_p\}$ is given; a binary relation can easily be defined by $B = \{(p, u) \mid u \in B_p\}$.
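The passage between a binary relation and its neighborhood system can be sketched in a few lines. The universe and relation below are hypothetical, and the function names are introduced only for this illustration.

```python
def neighborhood_system(universe, relation):
    """Map each object p to its elementary neighborhood B_p = {u | p B u}."""
    return {p: frozenset(u for u in universe if (p, u) in relation) for p in universe}

def relation_of(system):
    """Recover the binary relation B = {(p, u) | u in B_p} from a neighborhood system."""
    return {(p, u) for p, members in system.items() for u in members}

# Hypothetical universe and binary relation.
V = {1, 2, 3, 4}
B = {(1, 2), (1, 3), (2, 2), (3, 1), (3, 4)}

system = neighborhood_system(V, B)
for p in sorted(system):
    print(p, sorted(system[p]))

# Going back from the neighborhood system gives the same relation.
print(relation_of(system) == B)
```

Object 4 has an empty elementary neighborhood here, which is allowed for a general binary relation; nothing in the round trip requires the relation to be reflexive or symmetric.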
So we conclude this subsection with a

Proposition. There is a one-to-one correspondence between binary neighborhood systems, binary granulations and binary relations.

Since binary relations, binary neighborhood systems and binary granulations are essentially the same concept, we will treat them as synonyms and use them interchangeably. In fact, by abuse of notation, we have used the same notation $B$ for all of them. If the binary relation $B$ is an equivalence relation $E$, then the binary granulation is a partition. Each equivalence class is the elementary $E$-neighborhood of its members.

6.2 Neighborhood System Spaces

The pair $(V, B)$ is called a binary neighborhood system space (BNS-space). In this paper, we may simply refer to it as a neighborhood system space (NS-space). Note that, strictly speaking, a binary neighborhood system space is a neighborhood space, but not vice versa [11, 9, 10]. An NS-space is a space with multilevel or multiple granulations, while a BNS has only one single level of granulation. An NS-space is a pre-topological space; it is a variant of a Frechet(V)-space [23]. In the case that $B$ is an equivalence relation $E$, $(V, E)$ is a clopen topological space [13]. Let $B'$ be another binary relation.

Definition.
1. A subset $X$ is called a definable $B$-neighborhood, if $X$ is a union of elementary neighborhoods of $B$.
2. A subset $X_p$ is called a definable $B$-neighborhood of $p$ if, further, the union contains the elementary $B$-neighborhood of $p$.
3. The set of all definable $B$-neighborhoods at $p$ is denoted by $BS(p)$; in a BNS-space, there is at most one elementary neighborhood $B_p$ in $BS(p)$ at each $p$. The set of all definable $B$-neighborhoods is denoted by $BS(U)$.
4. Let $X$ be a subset of $V$. $NEIGH(X) = \bigcup_{p \in X} B_p$ is called the elementary $B$-neighborhood of $X$; note that it is a definable neighborhood.
5. $B'$ strongly depends on $B$, denoted by $B \Longrightarrow B'$, iff every elementary $B'$-neighborhood is a definable $B$-neighborhood.
6. If $B \Longrightarrow B'$, we will say $B$ is definably finer than $B'$, or $B'$ is definably coarser than $B$.

Strong dependence is an elaborate extension of the refinement of equivalence relations. The obvious extension, which says every $B'$-neighborhood is a subset of a $B$-neighborhood, does not have the desirable properties of "functional" dependency (or knowledge dependency in the sense of Pawlak). We require that every elementary $B'$-neighborhood be a union of $B$-neighborhoods. We recall some specific binary neighborhood systems from [9].

Definition.
1. $(V, B)$ is serial, if $\forall p$, $B_p$ is non-empty;
2. $(V, B)$ is reflexive, if $\forall p$, $p \in B_p$;
3. $(V, B)$ is symmetric, if $\forall p, \forall q$, $q \in B_p \Longrightarrow p \in B_q$;
4. $(V, B)$ is transitive, if $\forall p, \forall q, \forall r$, $q \in B_p$ and $r \in B_q \Longrightarrow r \in B_p$;
5. $(V, B)$ is Euclidean, if $q \in B_p$ and $r \in B_p \Longrightarrow r \in B_q$;
6. $(V, B)$ is clopen, if $B$ is reflexive, symmetric, and transitive [13].

7 Single Column Granular Representations

Suppose we are given a universe $V$, a binary neighborhood system $B$, and an elementary concept space $C$ (an active domain of attribute values; see Section 2.1). Then the 3-tuple $(V, B, C)$ is called a granular structure; see Section 8.1. Let us consider the map

$GN : p \mapsto B_p \mapsto NAME(B_p)$,

where the first map is the granulation $B$ (Section 6.1), and the second map is a naming map. $GN$ is called a single column granular representation. Its graph $(p, GN(p))$ is a single column granular table. As before, we will call it a canonical single column granular table, or simply a canonical granular table, if we use the canonical names. Note that the first map $B$ induces a partition on $V$; we will denote it by $P_B$. It is clear that $GN$ can be factored through $V/P_B$:

$p \mapsto [p] \mapsto B_p \mapsto NAME(B_p)$.

Note that in the relational case, the middle map is an identity.

7.1 Examples

Let $V$ be the set of restaurant owners as given in Table 1. We will suppress the ID in $ID_i$, so the set of restaurant owners is $V = \{1, 2, 3, 4, 5, 6, 7, 8, 9\}$. Further, $V$ has a new "attribute" defined as follows: each restaurant owner is associated with a group of major investors in his restaurant; the investors are members of $V$. Each group has a registered name, such as the bronze, silver, gold, or platinum group. Note that an American restaurant is too expensive for the owner to be a major investor, so the group associated with $ID_1$ or $ID_2$ does not include the owners. Technically, a group is an elementary neighborhood. Here are the lists of these groups of investors.

1. $B_1 = B_2 = \{3, 4, 5, 6, 7, 8, 9\}$,
2. $B_3 = \{1, 2, 3\}$,
3. $B_4 = B_5 = \{4, 5\}$,
4. $B_6 = B_7 = B_8 = B_9 = \{6, 7, 8, 9\}$.

We name each neighborhood as follows:
1. $NAME(B_1) = NAME(B_2)$ = platinum,
2. $NAME(B_3)$ = bronze,
3. $NAME(B_4) = NAME(B_5)$ = silver,
4. $NAME(B_6) = NAME(B_7) = NAME(B_8) = NAME(B_9)$ = gold.

7.2 Single Column Granular Table

Let $C$ = {bronze, silver, gold, platinum} and consider the single column granular table for INVESTORS; see Table 9. Note that $B$ induces a binary relation $B_C$ on $C$:

Definition. Let $c = NAME(P)$ and $d = NAME(Q)$ be two elements in $C$. Then $c\,B_C\,d$ iff $\exists p \in P$ and $\exists q \in Q$ such that $p\,B\,q$.

Table 10 is such a binary relation for the INVESTORS attribute.

Please put Table 9 here

Please put Table 10 here

7.3 Single Column Machine Oriented Granular Models

In relational theory, a single column machine oriented model is a named partition; each elementary set (equivalence class) is named. For real world data, it is a named binary granulation (binary neighborhood system); each elementary neighborhood is named. One can represent it in a table format; see Table 11 or Table 12, where we have grouped the owners together if their neighborhoods are the same. In fact, the grouping is the partition $P_B$; see the beginning paragraphs of Section 7. Since the members of an elementary neighborhood are explicitly listed, there is no need to display the semantic relations.

Put Table 11 here.

Put Table 12 here.

8 Multiple Column Granular Representations

It is rather easy to generalize the single column representation theory to multiple column representations, so this section will be rather brief and formal.

8.1 Granular Structures

A binary granular structure consists of a 3-tuple $(V, B^j, C^j, j = 1, 2, \ldots, n)$ where
1. $V$ is the universe, called the object space.
2. Each $B^j$ is a binary neighborhood system (a binary relation), $j = 1, 2, \ldots, n$.
3. Each $B^j$ consists of elementary neighborhoods $B^j_p$, $\forall p \in V$.
4. For each elementary neighborhood a meaningful name is given, that is, $C^j_p = NAME(B^j_p)$, $j = 1, 2, \ldots, n$ and $p \in V$.
5. $C^j$ is the elementary concept space that consists of all the names of elementary neighborhoods in $B^j$, $j = 1, 2, \ldots, n$. It is also referred to as an active domain, namely $C^j = ADom_{B^j}$; see Section 2.1.

A collection of single column granular representations forms one multiple column granular representation, or simply a granular representation. Its graph will be called a multiple column granular table, or simply a granular table; it was called an extended information table in [10]. Perhaps, once again, we should caution the reader that, unlike the case of relational theory, the entries in a granular table are not semantically independent within their respective domains; see Table 9.

8.2 Continuous Functions

Let us consider the following relation, Table 13, which is derived from Table 9 and Table 6. If we had forgotten the semantic relation in the active domain of INVESTORS, then one would think that there is an extensional functional dependency INVESTORS $\to$ TYPE in Table 13. However, for example, bronze and platinum are $B_C$-related, but their images, American and Chinese, are not related. So the map INVESTORS $\to$ TYPE does not respect the semantic relation, and we will not treat it as a functional dependency. We require that a functional dependency respect such semantic relations. So we define

Definition.
1. A map $F$, which is defined on a neighborhood of $p$, is continuous at $p$, if $F(NEIGH(p)) \subseteq NEIGH(q)$, where $q = F(p)$; see Section 6.2.
2. $C^j$ is continuously functionally dependent on $C^h$, if there is a map $F : C^j \to C^h$ that is continuous at every point $p \in C^j$.

The only continuous functional dependency in Table 13 is TYPE $\to$ INVESTORS.
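The statement that bronze and platinum are $B_C$-related can be checked directly from the investor groups of Section 7.1. The following sketch computes the induced relation $B_C$ of Section 7.2; it assumes that $P$ and $Q$ in that definition are the elementary neighborhoods carrying the names $c$ and $d$, and the function name `induced_relation` is introduced here only for illustration.

```python
# Elementary neighborhoods B_p from Section 7.1 (owner -> group of major investors).
INVESTORS = {
    1: {3, 4, 5, 6, 7, 8, 9}, 2: {3, 4, 5, 6, 7, 8, 9},
    3: {1, 2, 3},
    4: {4, 5}, 5: {4, 5},
    6: {6, 7, 8, 9}, 7: {6, 7, 8, 9}, 8: {6, 7, 8, 9}, 9: {6, 7, 8, 9},
}

# NAME(B_p) from Section 7.1, indexed by p.
NAME = {1: "platinum", 2: "platinum", 3: "bronze", 4: "silver", 5: "silver",
        6: "gold", 7: "gold", 8: "gold", 9: "gold"}

def induced_relation(neighborhood, name):
    """c B_C d iff some p in P and some q in Q satisfy p B q (that is, q in B_p)."""
    # Assumption: P and Q are the named elementary neighborhoods themselves.
    named = {name[p]: frozenset(members) for p, members in neighborhood.items()}
    related = set()
    for c, group_p in named.items():
        for d, group_q in named.items():
            if any(q in neighborhood[p] for p in group_p for q in group_q):
                related.add((c, d))
    return related

B_C = induced_relation(INVESTORS, NAME)
print(("bronze", "platinum") in B_C)
print(sorted(B_C))
```

Under this reading the relation is not symmetric: silver is related to platinum because owner 4 belongs to the platinum group, while silver is not related to bronze.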
Put Table 13

9 Machine Oriented Granular Models

A multiple column machine oriented model is a collection of single column machine oriented models. In other words, it is a collection of named binary granulations (binary neighborhood systems); each elementary neighborhood is named. The machine oriented model for Table 13 is simply the union of Table 12 and part of Table 8; see Table 14.

Put Table 14
10 Data Mining on Clustered Data

We will extend many classical notions of various rules to granular tables. Recall that each elementary concept space (active domain) is an NS-space; see Section 6.2. Let $(V, B^j, C^j, j = 1, 2, \ldots, n)$ be a granular structure. The elementary neighborhood $B^j_p$ of $p \in V$ will be denoted by $NEIGH_{B^j}(p)$. Note that there is an induced binary relation on $C^j$; see the Definition in Section 7.2. So there is an elementary neighborhood for each element in $C^j$. Write $c = NAME(P)$ and $d = NAME(Q)$, where $P$ and $Q$ are elementary neighborhoods in $B^1$ and $B^2$, respectively. Note that $NEIGH_{B^j}(P)$, or simply $NEIGH(P)$ if $B^j$ is understood, means the union of $NEIGH_{B^j}(p)$, $\forall p \in P$. Also note that $NEIGH(c)$ is an elementary neighborhood in the elementary concept space $C^j$.

1. Soft association rule: A pair $(c, d)$ is a soft association rule, if $Card(NEIGH(P) \cap NEIGH(Q)) \geq$ threshold.
2. Soft decision rule: A formula $c \to d$ is a soft decision rule, if $P \subseteq NEIGH(Q)$ [18].
3. Continuous decision rule: A formula $c \to d$ is a continuous decision rule, if $P \subseteq Q$ and $NEIGH(c) \subseteq NEIGH(d)$.
4. Softly robust continuous decision rule: A formula $c \to d$ is a softly robust continuous decision rule, if $NEIGH(c) \subseteq NEIGH(d)$ and $Card(NEIGH(P) \cap NEIGH(Q)) \geq$ threshold [12].
5. (Softly robust) high level continuous decision rule: Suppose $P$ and $Q$ are two definably coarser granular structures of $B^1$ and $B^2$. A formula $c \to d$ is a (softly robust) high level continuous decision rule, if $NEIGH(P) \subseteq NEIGH(Q)$ (and $Card(NEIGH(P) \cap NEIGH(Q)) \geq$ threshold) [14, 4, 9].

Some applications will be reported in future papers.

11 Conclusion

We started with two notions,

1. data mining is the machine derivation of interesting (to humans) properties from the underlying mathematical structure of the stored data;
2. the universe of real world objects is granulated (clustered);

and set forth to develop data models suitable for mining real world data.
Machine oriented data models, in which attribute values are encoded with machine semantics (knowledge), are introduced. The model effectively provides the machine with the information necessary for mining various forms of rules. Data mining in such models is reduced to set theoretical operations on granules, that is, to a machine calculus of granules: granular computing.
For relational theory, granules are equivalence classes; the computations are efficient and faster than the usual approaches. Applications are on the way; they will be reported soon. For clustered data, substantial research is still needed. Currently, the semantic relations on the attribute domains are supplied by humans (a concept hierarchy [4, 17] is a special case). Some automation of building such semantic relations is needed for large scale applications. We will report our exploration in future papers.

12 Appendix: Information Tables and Relations

The syntax of information tables is very similar to that of relations in relational databases. Entities are also represented by tuples of attribute values. However, the representation may not be faithful; namely, entities and tuples may not be in one-to-one correspondence.

An information table is a 4-tuple $(V, A, Dom, \rho)$, where

1. $V = \{u, v, \ldots\}$ is a set of entities.
2. $A$ is a set of attributes $\{A_1, A_2, \ldots, A_n\}$.
3. $dom(A_i)$ is the set of values of attribute $A_i$, and $Dom = dom(A_1) \cup dom(A_2) \cup \ldots \cup dom(A_n)$.
4. $\rho : V \times A \to Dom$, called the description function, is a map such that $\rho(u, A_i)$ is in $dom(A_i)$ for all $u$ in $V$ and $A_i$ in $A$.

The description function induces a set of maps

$t = \rho(u, \cdot) : A \to Dom$.

Each image forms a tuple:

$t = (\rho(u, A_1), \rho(u, A_2), \ldots, \rho(u, A_i), \ldots, \rho(u, A_n))$

Note that the tuple $t$ is associated with the object $u$, but not necessarily uniquely. In an information table, two distinct objects could have the same tuple representation, which is not permissible in relational databases.

A decision table is an information table $(V, A, Dom, \rho)$ in which the attribute set $A = C \cup D$ is a union of two non-empty sets, $C$ and $D$, of attributes. The elements in $C$ are called conditional attributes. The elements in $D$ are called decision attributes. Each row is a decision rule.

The notion of a relation in relational theory consists of
1. $V = \{x, y, \ldots\}$ is an implicit set of entities, which does not appear in the formal model.
2. $A$ is a set of attributes $\{A_1, A_2, \ldots, A_n\}$.
3. $dom(A_i)$ is the set of values of attribute $A_i$, and $Dom = dom(A_1) \cup dom(A_2) \cup \cdots \cup dom(A_n)$.
4. Implicitly, to each entity $u$ we associate a mapping $t_u : A \to Dom$, where $t_u(A_i) \in dom(A_i)$ for each $A_i \in A$.

A relation consists of the mappings $t_u : A \to Dom$. Informally, one can view a relation as a table consisting of rows of elements. Each row represents an entity uniquely.

References

1. R. Agrawal, T. Imielinski, and A. Swami, "Mining Association Rules Between Sets of Items in Large Databases," in Proceedings of the ACM-SIGMOD International Conference on Management of Data, Washington, DC, June.
2. R. Agrawal and R. Srikant, "Fast Algorithms for Mining Association Rules," in Proceedings of the 20th VLDB Conference, Santiago, Chile.
3. S. Bairamian, Goal Search in Relational Databases, Thesis, California State University, Northridge.
4. Y.D. Cai, N. Cercone, and J. Han, "Attribute-Oriented Induction in Relational Databases," in Knowledge Discovery in Databases, AAAI/MIT Press, Cambridge, MA.
5. W. Chu and Q. Chen, "Neighborhood and Associative Query Answering," Journal of Intelligent Information Systems, vol. 1.
6. C. J. Date, Introduction to Database Systems, 3rd and 6th editions, Addison-Wesley, Reading, Massachusetts, 1981.
7. T. Gaasterland, Generating Cooperative Answers in Deductive Databases, Dissertation, University of Maryland, College Park, Maryland.
8. T. Y. Lin and Eric Louie, "Finding Association Rules by Computing Bits," Data Mining and Knowledge Discovery: Theory, Tools, and Technology II (or29), April 2000, Orlando, Florida, USA.
9. T. Y. Lin, "Granular Computing of Binary Relations I: Data Mining and Neighborhood Systems," in Rough Sets and Knowledge Discovery, edited by Polkowski and Skowron, Physica-Verlag.
10. T. Y. Lin, "Granular Computing of Binary Relations II: Rough Set Representations and Belief Functions," in Rough Sets and Knowledge Discovery, edited by Polkowski and Skowron, Physica-Verlag.
11. T. Y. Lin, "Neighborhood Systems - A Qualitative Theory for Fuzzy and Rough Sets," in Advances in Machine Intelligence and Soft Computing, Volume IV, edited by Paul Wang.
12. T. Y. Lin, "Rough Set Theory in Very Large Databases," in Proceedings of Symposium on Modeling, Analysis and Simulation, IMACS Multi Conference (Computational Engineering in Systems Applications), Lille, France, July 9-12, Vol. 2 of 2, 1996.
13. T. Y. Lin, "Topological and Fuzzy Rough Sets," in Decision Support by Experience - Application of the Rough Sets Theory, edited by R. Slowinski, Kluwer Academic Publishers.
14. T. Y. Lin, "Neighborhood Systems and Approximation in Database and Knowledge Base Systems," in Proceedings of the Fourth International Symposium on Methodologies of Intelligent Systems, Poster Session, October 12-15, 1989.
15. T. Y. Lin, "Neighborhood Systems and Relational Database," in Proceedings of 1988 ACM Sixteenth Annual Computer Science Conference, February 23-25, 1988.
16. T. Y. Lin, "Topological Data Models and Approximate Retrieval and Reasoning," in Proceedings of 1989 ACM Seventeenth Annual Computer Science Conference, February 21-23, Louisville, Kentucky, 1989.
17. T. Y. Lin and M. Hadjimichael, "Non-Classificatory Generalization in Data Mining," in Proceedings of the Fourth International Workshop on Rough Sets, Fuzzy Sets, and Machine Discovery, November 6-8, 1996, Tokyo, Japan.
18. T. Y. Lin and Y. Y. Yao, "Mining Soft Rules Using Rough Sets and Neighborhoods," in Proceedings of Symposium on Modeling, Analysis and Simulation, CESA'96 IMACS Multiconference (Computational Engineering in Systems Applications), Lille, France, 1996, Vol. 2 of 2.
19. B. Michael and T. Y. Lin, "Neighborhoods, Rough Sets, and Query Relaxation," in Rough Sets and Data Mining: Analysis of Imprecise Data, edited by T. Y. Lin and N. Cercone, Kluwer Academic Publisher. (Final version of the paper presented at the Workshop on Rough Sets and Database Mining, March 2.)
20. L.A. Zadeh, "Some Reflections on Soft Computing, Granular Computing and Their Roles in the Conception, Design and Utilization of Information/Intelligent Systems," in Granular Computing: Fuzzy Sets, Fuzzy Logic and Applications to Information/Intelligent Systems, edited by T. Y. Lin, Y. Y. Yao, and L. Zadeh, Physica-Verlag, to appear.
21. D. Maier, The Theory of Relational Databases, Computer Science Press, 1983 (6th printing 1988).
22. Z. Pawlak, Rough Sets: Theoretical Aspects of Reasoning about Data, Kluwer Academic, Dordrecht.
23. W. Sierpinski and C. Krieger, General Topology, University of Toronto Press.
24. L.A. Zadeh, "Toward a Theory of Fuzzy Information Granulation and Its Centrality in Human Reasoning and Fuzzy Logic," Fuzzy Sets and Systems, 90.
25. Lotfi Zadeh, "The Key Roles of Information Granulation and Fuzzy Logic in Human Reasoning," in Proceedings of 1996 IEEE International Conference on Fuzzy Systems, September 8-11, 1996.
26. L.A. Zadeh, "Fuzzy Sets and Information Granularity," in Advances in Fuzzy Set Theory and Applications, edited by M. Gupta, R. Ragade, and R. Yager, North-Holland, Amsterdam, 3-18.
27. W. Ziarko, R. Golan, and D. Edwards, "An Application of DataLogic/R Knowledge Discovery Tool to Identify Strong Predictive Rules in Stock Market Data," in Proceedings of AAAI-93 Workshop on Knowledge Discovery in Databases, Washington, DC.
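The information-table definition in the appendix can be illustrated with a minimal Python sketch. The entity names `u1`..`u3`, the attribute values, and the helper names (`tuple_of`, `is_valid`, `is_faithful`) are hypothetical and chosen only for illustration; the description function is stored as a nested dictionary, which is one possible encoding and not one prescribed by the paper.

```python
from collections import defaultdict

# Hypothetical toy data; none of these entities come from the paper.
V = ["u1", "u2", "u3"]
A = ["TYPE", "PRICE"]
dom = {
    "TYPE": {"American", "French"},
    "PRICE": {"inexpensive", "expensive"},
}

# Description function rho: V x A -> Dom, stored as a nested dictionary.
rho = {
    "u1": {"TYPE": "American", "PRICE": "inexpensive"},
    "u2": {"TYPE": "American", "PRICE": "inexpensive"},
    "u3": {"TYPE": "French", "PRICE": "expensive"},
}


def tuple_of(u):
    """Return the tuple (rho(u, A_1), ..., rho(u, A_n)) induced by entity u."""
    return tuple(rho[u][a] for a in A)


def is_valid():
    """Check that rho(u, A_i) lies in dom(A_i) for every entity and attribute."""
    return all(rho[u][a] in dom[a] for u in V for a in A)


def is_faithful():
    """True when distinct entities always have distinct tuples (the relational case)."""
    groups = defaultdict(list)
    for u in V:
        groups[tuple_of(u)].append(u)
    return all(len(entities) == 1 for entities in groups.values())
```

Here `u1` and `u2` share the same tuple, so `is_faithful()` is false: the structure is a legitimate information table but would not qualify as a relation, where each row represents an entity uniquely.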
Restaurant owners   LOCATION     ...   Restaurant owners   Canonical names
ID1                 West Wood    ...   ID1                 {ID1, ID2, ID3}
ID2                 West Wood    ...   ID2                 {ID1, ID2, ID3}
ID3                 West Wood    ...   ID3                 {ID1, ID2, ID3}
ID4                 West LA      ...   ID4                 {ID4, ID5}
ID5                 West LA      ...   ID5                 {ID4, ID5}
ID6                 Brent Wood   ...   ID6                 {ID6, ID7, ID8, ID9}
ID7                 Brent Wood   ...   ID7                 {ID6, ID7, ID8, ID9}
ID8                 Brent Wood   ...   ID8                 {ID6, ID7, ID8, ID9}
ID9                 Brent Wood   ...   ID9                 {ID6, ID7, ID8, ID9}

Table 1. Two Tables of Single Column Information Tables

Restaurant owners   LOCATION     ...   Restaurant owners   Canonical names
ID1                 West Wood    ...   ID1                 B( )
ID2                 West Wood    ...   ID2                 B( )
ID3                 West Wood    ...   ID3                 B( )
ID4                 West LA      ...   ID4                 B( )
ID5                 West LA      ...   ID5                 B( )
ID6                 Brent Wood   ...   ID6                 B( )
ID7                 Brent Wood   ...   ID7                 B( )
ID8                 Brent Wood   ...   ID8                 B( )
ID9                 Brent Wood   ...   ID9                 B( )

Table 2. Bit Representation of a Single Column Information Table

Restaurant owner groups   Natural projection   Canonical names    Naming map   Meaningful names
                          (partition)          (encoded labels)                (attribute values)
ID1, ID2, ID3             ->                   B( )               ->           West Wood
ID4, ID5                  ->                   B( )               ->           West LA
ID6, ID7, ID8, ID9        ->                   B( )               ->           Brent Wood

Table 3. The First and Second Maps, a factorization of A
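The construction behind Tables 1 to 3 can be sketched in a few lines of Python: entities are grouped by their LOCATION value (the natural projection onto a partition), and each granule is then written either as a list of members or as a bit pattern. The function names `granules` and `bit_pattern` are hypothetical, and the bit ordering (leftmost bit for ID1) is an assumption, since the tables here do not show the bit strings.

```python
from collections import defaultdict

# LOCATION column of Table 1, keyed by the numeric part of the owner ID.
location = {
    1: "West Wood", 2: "West Wood", 3: "West Wood",
    4: "West LA", 5: "West LA",
    6: "Brent Wood", 7: "Brent Wood", 8: "Brent Wood", 9: "Brent Wood",
}


def granules(column):
    """Group entities by attribute value; each group is one granule (equivalence class)."""
    groups = defaultdict(set)
    for entity, value in column.items():
        groups[value].add(entity)
    return dict(groups)


def bit_pattern(granule, n):
    """Encode a granule as an n-character bit string; position i (from the left) is entity i.

    The left-to-right ordering is an assumption made for this sketch.
    """
    return "".join("1" if i in granule else "0" for i in range(1, n + 1))


by_value = granules(location)
canonical = {value: bit_pattern(g, len(location)) for value, g in by_value.items()}
```

Under these assumptions `by_value` gives the list form of the canonical names (for example `{4, 5}` for West LA), and `canonical` gives a bit form such as `"111000000"` for West Wood.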
Canonical names    Meaningful names
(encoded labels)   (attribute values)
B( )               West Wood
B( )               West LA
B( )               Brent Wood

Table 4. Single Column Machine Oriented Relational Bit Model; condensed form

Canonical names    Meaningful names
(encoded labels)   (attribute values)
{1, 2, 3}          West Wood
{4, 5}             West LA
{6, 7, 8, 9}       Brent Wood

Table 5. Single Column Machine Oriented Relational List Model

RESTAURANT OWNER   TYPE       LOCATION     PRICE
ID1                American   West Wood    inexpensive
ID2                American   West Wood    inexpensive
ID3                Chinese    West Wood    moderate
ID4                Japanese   West LA      moderate
ID5                Japanese   West LA      moderate
ID6                French     Brent Wood   expensive
ID7                French     Brent Wood   expensive
ID8                French     Brent Wood   expensive
ID9                French     Brent Wood   expensive

Table 6. A Relational Restaurant Database
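The abstract notes that, for moderate-size databases, finding association rules reduces to set-theoretic operations on granules. A minimal sketch of that idea on Table 6 follows: each attribute value is encoded as an integer bitmask over the nine entities, and the number of entities carrying several values at once is the population count of the bitwise AND. The helper names `bit_granules` and `support` and the choice of Python integers as bit vectors are assumptions for illustration, not the author's implementation.

```python
# Rows of Table 6: owner ID -> (TYPE, LOCATION, PRICE).
ATTRS = ("TYPE", "LOCATION", "PRICE")
rows = {
    1: ("American", "West Wood", "inexpensive"),
    2: ("American", "West Wood", "inexpensive"),
    3: ("Chinese", "West Wood", "moderate"),
    4: ("Japanese", "West LA", "moderate"),
    5: ("Japanese", "West LA", "moderate"),
    6: ("French", "Brent Wood", "expensive"),
    7: ("French", "Brent Wood", "expensive"),
    8: ("French", "Brent Wood", "expensive"),
    9: ("French", "Brent Wood", "expensive"),
}


def bit_granules(table):
    """Map each (attribute, value) pair to a bitmask; bit (i - 1) is set when entity i has the value."""
    masks = {}
    for entity, values in table.items():
        for attr, value in zip(ATTRS, values):
            key = (attr, value)
            masks[key] = masks.get(key, 0) | (1 << (entity - 1))
    return masks


def support(masks, first, *rest):
    """Count the entities that carry every listed (attribute, value) pair."""
    combined = masks[first]
    for item in rest:
        combined &= masks[item]  # set intersection of granules
    return bin(combined).count("1")


masks = bit_granules(rows)
french_in_brent_wood = support(masks, ("TYPE", "French"), ("LOCATION", "Brent Wood"))
```

With these definitions `french_in_brent_wood` is 4, so every French restaurant in the table is located in Brent Wood; a threshold on such counts would then decide which combinations are reported as rules.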
RESTAURANT OWNER   CNAME(TYPE)   CNAME(LOCATION)   CNAME(PRICE)
ID1                TYPE( )       LOCATION( )       PRICE( )
ID2                TYPE( )       LOCATION( )       PRICE( )
ID3                TYPE( )       LOCATION( )       PRICE( )
ID4                TYPE( )       LOCATION( )       PRICE( )
ID5                TYPE( )       LOCATION( )       PRICE( )
ID6                TYPE( )       LOCATION( )       PRICE( )
ID7                TYPE( )       LOCATION( )       PRICE( )
ID8                TYPE( )       LOCATION( )       PRICE( )
ID9                TYPE( )       LOCATION( )       PRICE( )

Table 7. Bit Representations of Relational Restaurant Database

Canonical names    Meaningful names
(encoded labels)   (attribute values)
TYPE( )            American
TYPE( )            Chinese
TYPE( )            Japanese
TYPE( )            French
LOCATION( )        West Wood
LOCATION( )        West LA
LOCATION( )        Brent Wood
PRICE( )           inexpensive
PRICE( )           moderate
PRICE( )           expensive

Table 8. Multiple Attributes Machine Oriented Relational Model

Objects   INVESTORS
ID1       platinum
ID2       platinum
ID3       bronze
ID4       silver
ID5       silver
ID6       gold
ID7       gold
ID8       gold
ID9       gold

Table 9. Single Column Granular Table; entries are semantically interrelated; see next table
INVESTORS   INVESTORS
platinum    silver
platinum    gold
platinum    bronze
gold        gold
gold        platinum
silver      silver
silver      platinum
bronze      bronze
bronze      platinum

Table 10. Semantic Relation BC

Restaurant owner groups   Binary granulation   Canonical names          Meaningful names
                                               (encoded labels)         (attribute values)
ID1, ID2                  ->                   {3, 4, 5, 6, 7, 8, 9}    platinum
ID3                       ->                   {1, 2, 3}                bronze
ID4, ID5                  ->                   {4, 5}                   silver
ID6, ID7, ID8, ID9        ->                   {6, 7, 8, 9}             gold

Table 11. Machine Oriented List Granular Model; Canonical names spell out the semantics relation

Restaurant owner groups   Binary granulation   Canonical names    Meaningful names
                                               (encoded labels)   (attribute values)
ID1, ID2                  ->                   B( )               platinum
ID3                       ->                   B( )               bronze
ID4, ID5                  ->                   B( )               silver
ID6, ID7, ID8, ID9        ->                   B( )               gold

Table 12. Machine Oriented Bit Granular Model
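A binary granulation of this kind can be sketched as follows, under one possible reading of Table 10: the granule of a value p collects every entity whose INVESTORS value q satisfies (p, q) in the relation. The name `neighborhood` and this reading of the relation are assumptions. Under it, the platinum and bronze granules agree with Table 11, while the silver and gold granules also contain ID1 and ID2 (because Table 10 pairs both values with platinum), whereas Table 11 as reproduced here lists {4, 5} and {6, 7, 8, 9}; the reading should therefore be checked against the original paper.

```python
# INVESTORS column of Table 9, keyed by the numeric part of the object ID.
investors = {
    1: "platinum", 2: "platinum", 3: "bronze",
    4: "silver", 5: "silver",
    6: "gold", 7: "gold", 8: "gold", 9: "gold",
}

# Pairs of Table 10, read as (p, q): "p is related to q" (an assumed reading).
B = {
    ("platinum", "silver"), ("platinum", "gold"), ("platinum", "bronze"),
    ("gold", "gold"), ("gold", "platinum"),
    ("silver", "silver"), ("silver", "platinum"),
    ("bronze", "bronze"), ("bronze", "platinum"),
}


def neighborhood(value):
    """Return the entities whose INVESTORS value is related to `value` by B."""
    related = {q for (p, q) in B if p == value}
    return {entity for entity, v in investors.items() if v in related}


platinum_granule = neighborhood("platinum")
bronze_granule = neighborhood("bronze")
```

Unlike the equivalence classes of the relational case, these granules may overlap and need not contain the entities that name them, which is why the canonical names have to spell out the semantic relation.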
RESTAURANT OWNER   TYPE       INVESTORS
ID1                American   platinum
ID2                American   platinum
ID3                Chinese    bronze
ID4                Japanese   silver
ID5                Japanese   silver
ID6                French     gold
ID7                French     gold
ID8                French     gold
ID9                French     gold

Table 13. A Granular Restaurant Database; INVESTORS attribute has a semantic relation, TYPE attribute has no semantic relation

Restaurant owner groups   Canonical names    Meaningful names
                          (encoded labels)   (attribute values)
ID1, ID2                  INVESTOR( )        platinum
ID3                       INVESTOR( )        bronze
ID4, ID5                  INVESTOR( )        silver
ID6, ID7, ID8, ID9        INVESTOR( )        gold
ID1, ID2                  TYPE( )            American
ID3                       TYPE( )            Chinese
ID4, ID5                  TYPE( )            Japanese
ID6, ID7, ID8, ID9        TYPE( )            French

Table 14. Machine Oriented Granular Model
0.1 More on innity.math 0450 Honors intro to analysis Spring, 2009 Notes #4 corrected (as of Monday evening, 1/12) some changes on page 6, as in email. 0.1.1 If you haven't read 1.3, do so now! In notes#1
More informationA study on lower interval probability function based decision theoretic rough set models
Annals of Fuzzy Mathematics and Informatics Volume 12, No. 3, (September 2016), pp. 373 386 ISSN: 2093 9310 (print version) ISSN: 2287 6235 (electronic version) http://www.afmi.or.kr @FMI c Kyung Moon
More informationMath 190: Quotient Topology Supplement
Math 190: Quotient Topology Supplement 1. Introduction The purpose of this document is to give an introduction to the quotient topology. The quotient topology is one of the most ubiquitous constructions
More informationAdaptations of the A* Algorithm for the Computation of Fastest Paths in Deterministic Discrete-Time Dynamic Networks
60 IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 3, NO. 1, MARCH 2002 Adaptations of the A* Algorithm for the Computation of Fastest Paths in Deterministic Discrete-Time Dynamic Networks
More informationGraph Based Approach for Finding Frequent Itemsets to Discover Association Rules
Graph Based Approach for Finding Frequent Itemsets to Discover Association Rules Manju Department of Computer Engg. CDL Govt. Polytechnic Education Society Nathusari Chopta, Sirsa Abstract The discovery
More informationSOME TYPES AND USES OF DATA MODELS
3 SOME TYPES AND USES OF DATA MODELS CHAPTER OUTLINE 3.1 Different Types of Data Models 23 3.1.1 Physical Data Model 24 3.1.2 Logical Data Model 24 3.1.3 Conceptual Data Model 25 3.1.4 Canonical Data Model
More informationAn Approach to Intensional Query Answering at Multiple Abstraction Levels Using Data Mining Approaches
An Approach to Intensional Query Answering at Multiple Abstraction Levels Using Data Mining Approaches Suk-Chung Yoon E. K. Park Dept. of Computer Science Dept. of Software Architecture Widener University
More informationA Note on Fairness in I/O Automata. Judi Romijn and Frits Vaandrager CWI. Abstract
A Note on Fairness in I/O Automata Judi Romijn and Frits Vaandrager CWI P.O. Box 94079, 1090 GB Amsterdam, The Netherlands judi@cwi.nl, fritsv@cwi.nl Abstract Notions of weak and strong fairness are studied
More informationVertex Deletion games with Parity rules
Vertex Deletion games with Parity rules Richard J. Nowakowski 1 Department of Mathematics, Dalhousie University, Halifax, Nova Scotia, Canada rjn@mathstat.dal.ca Paul Ottaway Department of Mathematics,
More informationLinguistic Values on Attribute Subdomains in Vague Database Querying
Linguistic Values on Attribute Subdomains in Vague Database Querying CORNELIA TUDORIE Department of Computer Science and Engineering University "Dunărea de Jos" Domnească, 82 Galaţi ROMANIA Abstract: -
More informationClassification with Diffuse or Incomplete Information
Classification with Diffuse or Incomplete Information AMAURY CABALLERO, KANG YEN Florida International University Abstract. In many different fields like finance, business, pattern recognition, communication
More informationExpressions that talk about themselves. Maarten Fokkinga, University of Twente, dept. INF, Version of May 6, 1994
Expressions that talk about themselves Maarten Fokkinga, University of Twente, dept. INF, fokkinga@cs.utwente.nl Version of May 6, 1994 Introduction Self-reference occurs frequently in theoretical investigations
More information