Fuzzy Cognitive Maps Application for Web Mining

Andreas Kakolyris, Dept. of Computer Science, University of Ioannina, Greece, csst9942@otenet.gr
George Stylios, Dept. of Communications, Informatics and Management, TEI of Epirus, Greece, gstylios@yahoo.gr
Voula Georgopoulos, Dept. of Speech and Language Therapy, Technological and Educational Institute of Patras, Patras, Greece, voulag@otenet.gr

Abstract

This work examines and proposes a new method for web mining inference based on Fuzzy Cognitive Maps. The web mining inference and knowledge extraction consists of two phases. In the first phase the Apriori algorithm is used for web mining and a collection of Association rules is inferred. In the second phase, the set of Association rules is transformed into a Fuzzy Cognitive Map (FCM). A new methodology for developing Fuzzy Cognitive Maps is investigated so that they become suitable for web mining inference.

Keywords: Web mining, Association rules, Fuzzy Cognitive Maps.

1 Introduction

This research work proposes a new methodology for web mining inference, knowledge extraction and representation. The method consists of two stages. In the first stage the Apriori algorithm is used for web mining and a collection of Association rules is inferred. These rules have the form A → B, where A and B are objects or sets of objects, and each rule describes the likelihood that a user will visit B given that the user has previously visited A. Association rules are a convenient way to represent knowledge because they describe the behavior of users and their preferences. Association rules carry a great deal of information, but they are not well organized, and much of the knowledge they contain remains unexpressed and hard to grasp. Thus, there is a need for further processing. Here, the transformation of Association rules into FCMs is proposed, so as to present the hidden knowledge and the inferred results in an adequate way.
The FCM is a powerful tool to simulate and represent the behavior of users and their preferences. The proposed methodology performs web mining on web log files. The structure and usage of the methodology are described, and the way it analyzes web mining results, offering a description of users' navigation behavior within a web site, is explained.

2 Web usage mining

Web usage mining aims to satisfy the needs of the owners and creators of web resources. The goal is to extract knowledge about internet users by discovering patterns regarding users' characteristics, preferences and activities in certain web locations. The objective of web usage mining is pattern discovery regarding the navigation, characteristics and preferences of web site users, whose activity is recorded by the corresponding web servers. The data collected by the servers are stored in web server access or referrer logs. These data are called Internet secondary data [1]. The first phase of web usage mining consists of two main steps:

Step 1: Preprocessing
o The original data set (web server log files) is cleaned of unused information and errors (Data Cleaning)
o The remaining data are organized into transactions, according to the needs of the physical problem
and the pattern discovery method used (Transaction Recognition). Transactions are determined by the type of resource (Missing Transactions) or by the time the user spent on the resource (Time Window Approach). Transactions of the same user are grouped into a single record for more efficient processing (Grouping Transactions)

Step 2: Pattern Discovery
o Association rules are extracted from transactions using the Apriori algorithm (Association rule extraction)

3 Association Rule Extraction

After the preprocessing stage, the next step is the extraction of existing patterns from the dataset and their transformation into Association rules [2]. This method aims to discover associations among transaction sets and to define the exact rules that govern these relations. Association rules have the form X → Y with support A% and confidence B%, where X and Y are transaction sets. Here X is a set of items found in transactions and Y is a set consisting of just one item, not found in X. Thus, if T = {t_1, ..., t_n} is the set of items found in all transactions, then X ⊂ T and Y = {t_i} ⊆ T − X. The support value A% represents the percentage of transactions that contain X, and the confidence B% represents the percentage of transactions that verify the rule. In order to extract the Association rules, we apply the Apriori algorithm, which is simple, fast and easy to comprehend [3]. The idea of this algorithm is to find sets of items that are frequent in the transactions [4]. The Association rule extraction procedure according to the Apriori algorithm is divided into two steps:
1. Discover the itemsets with support over a specified threshold. These itemsets are called large or frequent itemsets.
2. Extract the Association rules based on the frequent itemsets.

Definition 1. Frequent itemsets are the itemsets with support value A over the defined threshold min_support. The min_support is a parameter of the algorithm, which is set by the user.
This threshold value depends on each specific problem and the corresponding dataset, and so it is defined experimentally. If the threshold value is too high, some important rules may be ignored. On the other hand, if the threshold value is too low, many of the extracted rules may be of little importance.

Assume there is the following record in the table of transactions: A B A D C A B C. This means that the user accessed these urls in the given order. Subsets like {A B}, {A C}, {A B C} appear twice in the record, but subsets like {B A} and {D C} appear only once. The items are chronologically ordered, so the subsets are created by the union of an item with items to its right in each transaction.

The algorithm uses multiple passes over the dataset to discover the frequent itemsets with an increasing number of items. In the first pass the support of the individual items (1-itemsets) is calculated. In the following passes new candidate frequent itemsets are created by combining the ones found in the previous pass. The Apriori algorithm uses the property that if a k-itemset is frequent, all of its (k−1)-subsets are also frequent. The opposite does not hold: if all of the subsets are frequent, the set is not necessarily frequent, but it may be. Let Y be an itemset with support(Y) = s. For every itemset X we can say a priori that:
If X ⊆ Y, then support(X) ≥ s
If X ⊇ Y, then support(X) ≤ s
The key idea is that in the k-th pass the algorithm generates all candidate k-itemsets from the frequent (k−1)-itemsets found in the previous pass. After a pass over the original dataset we find which of these candidate k-itemsets are indeed frequent. There are no other frequent sets, because another consequence of the property mentioned above is that if a set X is not frequent, then none of its supersets can be frequent. The algorithm terminates when no new frequent sets are generated. The rule extraction is a quite straightforward process after the generation of the frequent sets.
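The candidate generation and pruning just described can be sketched in a few lines of Python. This is a minimal, unordered-itemset version, so it ignores the chronological-ordering refinement mentioned above; the function name and data layout are our own illustrative choices:

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Sketch of the Apriori frequent-itemset discovery described above.
    min_support is a fraction of the total number of transactions."""
    n = len(transactions)
    transactions = [frozenset(t) for t in transactions]
    # Pass 1: count individual items (1-itemsets).
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c / n for s, c in counts.items() if c / n >= min_support}
    result = dict(frequent)
    k = 2
    while frequent:
        # Join frequent (k-1)-itemsets into candidate k-itemsets, keeping
        # only candidates whose (k-1)-subsets are all frequent
        # (the anti-monotone property the text relies on).
        prev = list(frequent)
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent
                             for s in combinations(c, k - 1))}
        # One pass over the data to count the surviving candidates.
        frequent = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                frequent[c] = support
        result.update(frequent)
        k += 1
    return result
```

Rule extraction then follows directly: for each frequent itemset Y of size k ≥ 2, the confidence of a candidate rule is the ratio of the stored supports, as described in the next section.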
Rules are found in k-itemsets for k ≥ 2. Let Y = {I_1, ..., I_k} be a frequent itemset. The rule to be verified is I_1, ..., I_{k−1} → I_k. In order to calculate how powerful the rule is, the ratio support(Y)/support(X), where X = Y − {I_k}, is calculated. This ratio is called the confidence value of the rule. If the measured confidence is greater than the min_confidence
threshold, then this rule is considered important; otherwise it is ignored. The min_confidence threshold value depends on the specific problem and the corresponding data, and it is defined experimentally by the user. There is no need to examine rules of any form other than I_1, ..., I_{k−1} → I_k; only these rules will be valid. All others will not have enough support, because of the way the frequent sets are generated.

4 Fuzzy Cognitive Maps representing web user behavior

Fuzzy Cognitive Maps (FCMs) are a soft computing technique that follows an approach similar to human reasoning and the human decision-making process. FCMs have been used successfully for the modeling of complex systems by describing them through related concepts. An FCM consists of nodes (concepts) that illustrate the different aspects of the system's behavior. These nodes (concepts) interact with each other, showing the dynamics of the model. Fig. 1 illustrates a graphical representation of an FCM [5].

Fig. 1. Simple Fuzzy Cognitive Map representation (three concepts C1, C2, C3 connected by signed, weighted edges).

A Fuzzy Cognitive Map is a graph consisting of a collection of nodes, where each node has a value whose meaning depends on its representation; this value usually belongs to the interval [0, 1]. The FCM nodes are interconnected with weighted edges. These edges show the causal relations between nodes. Edges have weights w taking values in the interval [−1, 1]. These weights indicate the degree to which one concept influences another. A positive weight value indicates positive causality, while a negative weight indicates negative causality. The purpose of creating such a mapping is to obtain more complex conclusions than with a set of rules, so the knowledge extraction process is greatly amplified.
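To make the node-and-edge dynamics above concrete, here is a minimal sketch of one common FCM update formulation (not the FCM-web construction of the next section); the sigmoid squashing function and the example weights are our own illustrative assumptions, not values taken from the paper:

```python
import math

def fcm_step(values, weights):
    """One update step of a generic FCM: each concept's new value is a
    squashed sum of its current value plus the weighted influence of the
    other concepts. This is one common formulation; variants exist."""
    n = len(values)
    new = []
    for i in range(n):
        total = values[i] + sum(weights[j][i] * values[j]
                                for j in range(n) if j != i)
        new.append(1.0 / (1.0 + math.exp(-total)))  # sigmoid keeps values in (0, 1)
    return new

# Hypothetical 3-concept map: weights[j][i] is the edge weight from C_j to C_i.
vals = fcm_step([0.67, 0.44, 0.01],
                [[0.0, -0.05, 0.35],
                 [0.0, 0.0, 0.45],
                 [0.0, 0.85, 0.0]])
```

Iterating `fcm_step` until the values stabilize yields the steady-state conclusions of the map.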
Due to the dynamic properties of the FCM created by these rules, it can be used for the simulation of user behavior by changing some of the initial system parameters [6], [7].

4.1 Developing the FCM-web

The Fuzzy Cognitive Map consists of a set of nodes and directed edges between them. The nodes of the FCM-web represent the itemsets that were found by applying the Apriori algorithm to develop the association rules. For example, if an association rule url1, url2 → url3 is found, then two nodes are added to the FCM-web: (url1, url2) and (url3). Each node of the FCM-web has a value attached to it, which stands for the support that the corresponding itemset has in the original dataset. This support value was calculated when the corresponding association rule was extracted, and its value is in the interval [0, 1] by definition. The FCM-web consists of the nodes that are found in the association rules plus those that make up itemsets with more than one item. When the nodes of the FCM-web have been determined, the next step is the determination of the causal weighted interconnections among the concepts. Two kinds of edges of the FCM-web are defined:

The first kind of edge stands for the direct relation between two nodes, exactly as in the corresponding association rule. The weight of this edge is the confidence value of that rule. Although in the general definition of the FCM weighted edges have values in [−1, 1], in this case the interval is limited to [0, 1], because confidence takes values in that interval. We will refer to these edges as type 1 edges.

The second kind of edge is used for associating a composite node. We define as composite a node that represents an itemset consisting of more than one item (url). For this kind of edge, there is no easily derived weight value from the Association rule. We will refer to them as type 2 edges.
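The construction of the FCM-web skeleton from a rule set can be sketched as follows; the function name, the tuple-based node encoding and the example confidences below are hypothetical, chosen only to illustrate the two edge types:

```python
def build_fcm_web(rules):
    """Build the FCM-web skeleton from association rules, where each rule
    is (antecedent_items, consequent_item, confidence). Nodes are encoded
    as tuples of urls; a tuple of length > 1 is a composite node."""
    nodes = set()
    type1_edges = {}     # (antecedent_node, consequent_node) -> confidence
    type2_edges = set()  # (single_item_node, composite_node)
    for antecedent, consequent, confidence in rules:
        a = tuple(antecedent)
        c = (consequent,)
        nodes.update([a, c])
        type1_edges[(a, c)] = confidence  # type 1: weight is the rule confidence
        if len(a) > 1:
            # Composite node: link each constituent url with a type 2 edge,
            # since no weight can be read off the rule for these links.
            for item in a:
                nodes.add((item,))
                type2_edges.add(((item,), a))
    return nodes, type1_edges, type2_edges

# Hypothetical rules with made-up confidences:
nodes, e1, e2 = build_fcm_web([(['A'], 'B', 0.8), (['A', 'B'], 'C', 0.6)])
```

The type 2 edge weights are filled in later through the ratio L introduced in Section 5.2.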
Example 4.1. Assume that the web mining procedure has produced the following Association rules: A → B, A → C, AB → C, B → A, B → D, ABC → D, BDC → A. The nodes of the FCM-web for this case are easily derived from the Association rules; there are 7 nodes, which are the itemsets appearing at least once on either the right or the left side of an Association rule. For these nodes the type 1 edges are easily derived, and they appear in Fig. 2 as solid lines. The rule AB → C says that if AB appears in the dataset, then at the next step C will appear with a certain amount of confidence. This means that if the support value of AB changes, so will the support value of C. This kind of relationship between nodes is represented by type 1 edges.

Fig. 2. The FCM-web illustration with edges of type 1 and 2 (nodes A, B, C, D, AB, ABC, BDC).

If the Association rules had only one itemset in each part, this would be sufficient. But in fact rules are more complicated, and they have combined itemsets consisting of more than one item. For this case the type 1 edges, which are suitable when each part of an Association rule is a single-item itemset, are not sufficient. In the example AB → C there is no way of calculating the value of node (AB) when the values of (A) and (B) change. To counter this problem we introduce type 2 edges. In this way, any interaction between all nodes of the FCM-web is permitted, creating a powerful tool to model and simulate the behavior of web users. The FCM-web consisting of edges of type 1 and 2 is illustrated in Fig. 2.

A node of the FCM-web contains the following information:
i. The node id for node identification: id
ii. A pointer to the list of items (urls) of the node: *items
iii. A pointer to the set of type 1 edges (only for nodes with one item) that indicate which nodes connect to this node: *e1
iv. A pointer to the set of type 2 edges (only for nodes with more than one item) that indicate which one-item nodes build the current node: *e2
v. The number of items the node has: items_number
vi. The current value of the node: value
vii. The previous value of the node, before the last computation (if any): old_value
viii. The number of appearances in the transaction set of the itemset this node represents: apps
ix. The ratio among the items of a node with more than one item. This ratio is computed upon the creation of the FCM and remains constant: L
x. A pointer to the next node of the list: *next

5 FCM-web Usage and Simulation

When the FCM-web has been created, it can be used to describe web-user behavior, and it is possible to simulate user behavior by altering the initial values of the nodes. Initially, the nodes have certain support and appearance values. The goal of the simulation is to see how these values are affected when some of them are altered. When the support values of some nodes are changed, the FCM calculates new values for all nodes according to the previous values, the connections between nodes and the user input. Because the support values may be misleading for drawing correct conclusions, the number of expected appearances of the itemsets may be used instead. Different usage scenarios of the FCM examine what the result on the other nodes will be if the support value of one node (the number of visits to the corresponding url) changes. The FCM-web simulation consists of three phases. First, the user chooses some nodes and gives them new support values. In the second step, node values are adjusted to satisfy logical constraints, and in the third step the FCM-web gives the new expected values of all nodes.
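The node record described in Section 4.1 (fields i–x) could be rendered as a small data structure; the Python field names below are our own translation of the listed pointers and counters, with references standing in for the C-style pointers:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class FCMWebNode:
    """One node of the FCM-web, mirroring fields (i)-(x) of the text."""
    id: int                                   # (i) node identifier
    items: List[str]                          # (ii) the urls this node represents
    e1: list = field(default_factory=list)    # (iii) type 1 edges (one-item nodes only)
    e2: list = field(default_factory=list)    # (iv) type 2 edges (composite nodes only)
    value: float = 0.0                        # (vi) current support value
    old_value: float = 0.0                    # (vii) value before the last computation
    apps: int = 0                             # (viii) appearances in the transaction set
    L: float = 1.0                            # (ix) constant ratio for composite nodes
    next: Optional["FCMWebNode"] = None       # (x) next node of the list

    @property
    def items_number(self) -> int:
        # (v) number of items in the node
        return len(self.items)
```

A composite node such as (url1, url2) simply carries two entries in `items` and its precomputed ratio `L`.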
5.1 User Input (first simulation step)

It is assumed that the nodes chosen by the user are only those with one item (url). An increase or decrease of the support value of a node means that there are more or fewer visits to that url in the log file. For a real web server, in order to increase the measured support value of the url /home/products we could, for example, make the web link to the corresponding url more visible on the main page or add references to it. On the other hand, there is no point in directly changing the values of nodes with more than one item. Let's assume there is a node ABCD. A change in the support value of this itemset ABCD cannot be made directly. An increase in the support value of node ABCD would mean that there was a way to persuade users to access those urls in that order, which is not very realistic.

5.2 Value Adjustment (second simulation step)

When the values of some nodes are changed by the user of the FCM-web, an appropriate adjustment to the values of all nodes has to be made. This is due to the fact that the sum of the support values of all items in the dataset has to be 1. But in the FCM-web not all itemsets are present, so the sum of all the support values is not necessarily 1. When the user changes the support of one or more nodes, the support values of the rest must be adjusted accordingly. Physically, an increase in the support of a url means that new references to that url are added to the dataset (it contains more references to that url). A decrease means that references are removed from the dataset (it contains fewer references to that url). Old_value and new_value are the previous and the current value of the node of the FCM-web whose support is changed by the user. Prev_Transactions and Current_Transactions are the numbers of transactions calculated before and after the change of the support value of that node. Apps is the number of appearances that a node has in the dataset with transaction number Prev_Transactions.
When the value of a node is changed, in reality the number of appearances of the corresponding itemset in the dataset is changed. So the new support value of this node has to be calculated using (1), where x is the number of transactions added or removed in order to reach the new support value:

new_value = (apps + x) / (Prev_Transactions + x)   (1)

Therefore, the current total number of transactions will be:

Current_Transactions = Prev_Transactions + x   (2)

and the appearances of the node in the dataset:

apps = apps + x   (3)

For the rest of the nodes the new support values are:

value = apps / Current_Transactions   (4)

But in the FCM-web there are nodes representing itemsets with more than one item, which have to be adjusted too, because the values of the items that constitute the combined itemset have changed. In this case the type 2 edges are used and the ratio L is introduced. Consider the node representing the itemset (I_1, ..., I_n). The support value of this itemset, as well as the supports of the individual items, was calculated during the web mining stage. When a combined node is created, the ratio L is calculated as:

L = support(I_1, ..., I_n) / (support(I_1) * ... * support(I_n))   (5)

This ratio must remain constant, so that the relation between the combined itemset and its items that was calculated in the original dataset is preserved at all times. The previous support value of the itemset is not important; its value depends only on the current support values of its items. The new value of the combined node will be:

new_value = L * v_1 * ... * v_n   (6)

where v_i is the support value of the individual node i.

Example 5.1. Consider the items A, B, C, D in a dataset of 10 transactions. For each item the support and appearances are known from the web mining procedure and are given in brackets as Item (support, apps): A (0.2, 2), B (0.2, 2), C (0.5, 5), D (0.1, 1). Let's suppose the user changes the support value of A from 0.2 to 0.5.
Then, using (1), we calculate that x = 6 transactions must be inserted to obtain the new support value, so there are 6 + 2 = 8 appearances of A and a total of Current_Transactions = 16.
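The adjustment equations (1)–(4) and the numbers of Example 5.1 can be checked with a short script; the function name and interface are our own:

```python
def adjust_supports(changed_item, new_value, apps, prev_transactions):
    """Value adjustment of Section 5.2: solve eq. (1) for x, then recompute
    the supports of the remaining one-item nodes with eq. (4).
    `apps` maps each item to its number of appearances."""
    # Eq. (1) rearranged: x = (new_value * Prev_Transactions - apps) / (1 - new_value)
    x = (new_value * prev_transactions - apps[changed_item]) / (1 - new_value)
    current_transactions = prev_transactions + x      # eq. (2)
    apps = dict(apps)
    apps[changed_item] += x                           # eq. (3)
    supports = {item: a / current_transactions        # eq. (4)
                for item, a in apps.items()}
    return supports, current_transactions

# Reproducing Example 5.1: A goes from support 0.2 to 0.5 in 10 transactions.
supports, total = adjust_supports('A', 0.5,
                                  {'A': 2, 'B': 2, 'C': 5, 'D': 1}, 10)
# total is 16 transactions and the adjusted supports sum to 1.
```

Running this confirms x = 6, in agreement with the worked numbers of the example.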
Next, the new support values of the remaining one-item nodes are calculated: B = 2/16 = 0.125, C = 5/16 = 0.3125 and D = 1/16 = 0.0625. The sum of all supports is 0.5 + 0.125 + 0.3125 + 0.0625 = 1. In this way the support values are normalized whenever the user changes some concept values or new computations are performed on the FCM-web.

5.3 Calculation of new values for the FCM-web (third simulation step)

After the values of all nodes have been adjusted, in the third step new values for the FCM-web are calculated. In the FCM-web, the type 1 edges affect nodes with one item. New_value is the new value of the node, current_value is the value that the node currently has, conf is the confidence value of the edge between the nodes, and v is the support change of the nodes affecting the current node. The new value is calculated using:

new_value = (v * conf) + current_value   (7)

The computation steps are the following:
i) Initially, for every node with one item the new value is calculated from the effect of the type 1 edges of the other one-item nodes. A value adjustment is done as described before.
ii) For every node with more than one item the new value is calculated using the type 2 edges.
iii) Finally, for every node with one item the final value is calculated from the effect of the nodes with more than one item, using type 1 edges. Again the appropriate value adjustment is applied at the end, as in step i.

The pseudocode that demonstrates the use and operation of the system is the following:
1. The user chooses a node and changes its value
2. Value adjustment of the rest of the nodes for consistency
3. Calculation of new values for the nodes, with value adjustment after each step:
a. New values for the nodes with one item
b. New values for the nodes with more than one item
c. New values for the nodes with one item affected by those with more than one item
4.
Return to step 1

6 Conclusions

This research work presents a novel approach to developing a web user model that describes the behavior of users visiting a web server. The methodology utilizes a web mining method to extract the Association rules from the log files. For a better representation of the information and knowledge existing in Association rules, the use of Fuzzy Cognitive Maps is introduced. A novel augmented FCM-web, suitable for web mining knowledge, is introduced. Two kinds of nodes and two kinds of weighted interconnections among nodes are defined for the FCM-web. The use of the FCM-web for modeling web user behavior is described.

Acknowledgments

Funding for this research was provided by EPEAEK II: Archimedes - Research Support in TEI, Ministry of National Education & Religious Affairs, Greece.

References

[1] R. Cooley, B. Mobasher, J. Srivastava, Data preparation for mining world wide web browsing patterns, Knowledge and Information Systems, Vol. 1, 1999, pp. 5-32.
[2] J. Srivastava, R. Cooley, M. Deshpande, P.-N. Tan, Web Usage Mining: Discovery and Applications of Usage Patterns from Web Data, Technical Report, Dept. of Computer Science and Engineering, University of Minnesota, 1999.
[3] R. Agrawal, R. Srikant, Fast algorithms for mining association rules, Proc. of the International Conference on Very Large Databases, 1994, pp. 487-499.
[4] R. Agrawal, T. Imielinski, A. Swami, Mining association rules between sets of items in large databases, Proc. of the ACM SIGMOD Intl. Conf. on Management of Data, 1993.
[5] C. Stylios, P. Groumpos, V. Georgopoulos, A Fuzzy Cognitive Maps Approach to Process Control Systems, J. of Advanced Computational Intelligence, Vol. 3, 1999, pp. 409-417.
[6] K. C. Lee, J. S. Kim, N. H. Chung, S. J. Kwon, Fuzzy Cognitive Map Approach to web mining inference amplification, Expert Systems with Applications, Vol. 22, 2002, pp. 197-211.
[7] G. Meghabghab, Mining user's web searching skills: fuzzy cognitive state map vs.
Markovian modeling, Journal of Computational Cognition, Vol. 1, 2003, pp. 51-92.