PatternRank: A Software-Pattern Search System Based on Mutual Reference Importance


Atsuto Kubo, Hiroyuki Nakayama, Hironori Washizaki, Yoshiaki Fukazawa
Waseda University, Department of Computer Science and Engineering
Okubo, Shinjuku, Tokyo, Japan
{a.kubo,h-nakayama}@fuka.info.waseda.ac.jp {fukazawa,washizaki}@waseda.jp

Abstract

There are currently a large number of digitized documents on software patterns (hereafter "patterns") published on the Web. However, no search systems specialize in pattern documents, so in order to find a desired pattern, users must resort to manual searches using general-purpose search systems. With a general-purpose search system, the search result also includes documents that are not pattern documents, and if a large number of results are returned, searching through them for the desired pattern can be very inefficient. We propose a pattern search system as a solution to this problem. The proposed method operates on a set of pattern documents gathered in a repository, attaches an importance value to each of them based on the relationships between them, and uses this value together with sorting and keyword searching. We conducted search experiments using the proposed system, applied to 131 pattern documents published on the Web, in order to evaluate the effectiveness of the system. The experimental results confirmed that patterns can be studied and investigated effectively using the proposed system.

1. Introduction

The term Software Patterns (hereafter "patterns") refers to solutions or guidelines for certain types of problems that occur repeatedly in various situations in software development, taking into account the various constraints that must be considered when solving the problem [15, 20]. Patterns can be used to solve frequently occurring problems efficiently, by reducing the amount of analysis and design work required in software development.
Studying patterns can also provide a deeper understanding of the types of problems handled by each pattern and of the underlying software-development techniques. For example, since the Abstract Factory pattern is used in the design of Graphical User Interfaces (GUIs), studying the pattern provides insight into design policies used in GUI application frameworks, as well as awareness of one of the problems arising in the implementation of GUIs using object-oriented techniques. It can also provide direction on issues like the use of abstract classes in object-oriented design.

Since the release of Design Patterns by Gamma et al. [3], the number of available patterns has increased through the activities of the pattern community, and there are now a large number of patterns published in the literature or in electronic-document format on the Web [9]. As this number has increased, so has the need for a search system that can efficiently find, from among a large number of pattern documents, a reference pattern conforming to the user's needs and intentions. A search system specialized for patterns will improve the efficiency of both pattern use in software development and the study of patterns.

Our proposed method is built on a repository that contains only pattern documents. As an example, the Portland Pattern Repository [1] is an experimental Web site that gathers pattern documents and allows searches to be performed on them. Supposing that we are developing a system and would like a pattern for designing windows in our GUI, we could enter the word "window" as a query into the Portland Pattern Repository. Of the 32,334 documents in the repository, this query yields 844 documents, which are presented in order of document name (as of March 2007).
From the perspective of pattern usefulness this name order is essentially random, and the results also include documents that are not patterns (for example, articles about patterns), so the user may have to read up to 844 documents in order to find the required pattern.

In this paper we propose a method which automatically extracts relationships between patterns from the pattern documents and computes an importance value for each pattern based on these inter-pattern relationships. We call this method the PatternRank method. Computation of an importance value in the PatternRank method is similar to the importance value computed for Web pages in the Page Rank method [11] or for software components in the Component Rank method [7]. The PatternRank method is somewhat different, however, in that it must take into consideration information particular to patterns, such as the fact that the same pattern may be expressed differently in different documents, or may belong to several different catalogs. We also propose an application of the PatternRank method: a search system that applies the pattern importance values from the PatternRank method to search results and presents them in order of importance. The proposed system is able to compute the importance values against the set of patterns returned by a search query.

2. Important patterns and the need for a search system

In this section we explain the format that is generally used to describe patterns, and discuss what sorts of important patterns are needed by users. In the discussion below, when we write "relationship", we are referring to any sort of relationship that may exist among multiple elements, not limited to simple pattern relationships. We write "inter-pattern relationships" to indicate specifically a relation among multiple patterns. The term "Related" is used only to indicate the Related Patterns item in a pattern description.

2.1 Pattern characteristics

Generally, patterns are described in terms of a collection of items like pattern name, assumed conditions and requirements, problems, forces (constraints that must be considered), solutions, and related patterns [20]. A description of a common design pattern called the Singleton pattern [3] is shown in Figure 1 as an example of a pattern description.
The Pattern Name item gives the name of the pattern as Singleton, the Problem item describes the problem that the pattern solves, and the Solution item describes the solution provided by the Singleton pattern. The Related Patterns item shows that the Singleton pattern references the Abstract Factory pattern [3].

Pattern Name: Singleton
...
Related Patterns: The Singleton design pattern can be used with the Abstract Factory design pattern to guarantee at most one...

Figure 1. Excerpted pattern description of Singleton pattern

As shown in Figure 1, the Singleton pattern is described in the GoF design pattern catalog. A pattern catalog is a collection of patterns, organized and classified according to their characteristics and their mutual relationships. Patterns are not practical individually, but form pattern systems together with other patterns. By working in cooperation with these related patterns, they form solutions to large problems by gathering together the small solutions represented by each individual pattern [14]. For this reason, patterns in the same catalog are frequently used together.

2.2 Important patterns

Problems occurring in software development can be solved efficiently by using patterns. Patterns are also useful for presenting the types of problems and solutions that occur frequently in software development and for studying software development techniques like programming and object-oriented design [12]. Several other pattern search methods have been proposed [1, 8, 15, 17]. Although some managed to improve precision by searching over a repository, they do not apply an ordering to the search result, so search-result effectiveness drops if there are many candidate patterns conforming or related to the user's search query. As a result, it is necessary to automatically apply an ordering to the patterns in the repository according to some index.
It is desirable to have an ordering for the pattern collection in which the more-important patterns come before less-important patterns. From the perspective of pattern use, patterns with the following characteristics are more important:

- Easy to use in software development
- Used often in software development
- Often searched for on the Web
- Referenced often in other pattern documents

Sorting pattern collections based on how frequently they are used in software development would be extremely effective from a pattern-reuse perspective. Patterns that have achieved a high rank through use in past software development have a high probability of being used in future software development as well. However, data on past pattern use must be gathered for each past software development project, so this is extremely difficult to implement in practice.

Sorting a pattern collection based on how frequently patterns are searched for on the Web could be effective from a pattern-reuse perspective. Pattern documents on the Web are likely to be searched for frequently to solve problems occurring in software development, so patterns used often on the Web are likely to be used in the future for software development. However, patterns unrelated to software development that are searched for frequently could achieve a high rank in spite of not being useful for software development. This approach would also require gathering data from various Web sites publishing patterns, making it very difficult to implement.

Sorting pattern documents based on the frequency with which they are referenced within pattern documents could also be useful from a pattern-reuse point of view. Patterns that are referenced frequently by other patterns would be general-purpose, basic patterns, and would achieve high rank. These patterns provide general solutions to many problems, and so highly-ranked patterns are more likely to be used in future software development as well. This approach also has the benefit of being relatively easy to implement compared to the other methods. However, since the ordering is based on the viewpoint of the pattern creator rather than the intentions or requirements of the user, it may not reflect the user's needs well.

Considering usefulness and practicality, the PatternRank method uses the frequency of references in other pattern documents as a part of the pattern-importance calculation. A pattern-search system with the ability to sort the pattern set based on the frequency with which each pattern is referenced by other pattern documents will be a useful support tool for using patterns effectively.

When studying patterns, it is important to consider the educational effectiveness of patterns, that is, which pattern among many should be studied first for the best and most efficient understanding.
Studying a collection of patterns in order of importance from the perspective of education would help reduce the study time required and provide a deeper understanding compared to studying patterns in random order. Students will be able to study a pattern more effectively if it refers to patterns they already know. So, when studying patterns in order to learn the development methods behind them, it may be more effective to study the more frequently referenced patterns first, allowing later patterns to be studied more efficiently. For example, if the references between patterns are as shown in Figure 2, the reference from pattern $p_3$ to $p_1$ indicates that $p_1$ is used as part of the construction of $p_3$, so $p_1$ is likely given as a related pattern in the document for pattern $p_3$. If we are studying $p_3$, and $p_1$ has been learned previously, it will be possible to understand the explanation of $p_3$ in a shorter amount of time. However, if $p_2$ is studied first, the benefit of learning $p_1$ is not available. Thus, in cases like Figure 2, pattern $p_1$ is more important than $p_2$ in terms of learning about pattern $p_3$.

Figure 2. Interpattern relationships

3. Pattern Search based on mutual reference importance

In this section, we describe the PatternRank method, which automatically computes a pattern importance value, based on inter-pattern relationships, for each pattern selected from a collection of pattern documents. Inter-pattern relationships are classified into reference relationships, which indicate that one pattern makes reference to another; same-catalog relationships, which indicate that the patterns belong to the same pattern catalog; and project-use relationships, which indicate that the patterns were used together in the same software project. We propose a search system which presents the set of patterns from a user's search query in an order based on the importance values derived from the PatternRank method.
With the proposed system, application of the PatternRank method allows the user to use the more-important patterns effectively even when the search result contains a very large number of candidates.

3.1 Computing importance value based on inter-pattern relationships

The PatternRank method is a scheme which computes an importance value for elements based on relationships between these elements, drawing on the Page Rank method [11], which computes importance values for Web pages, and the Component Rank method [7], which does so for software components. The Page Rank method handles hyperlink references between pages and uses the rule that Web pages referenced by important Web pages are also important. The PatternRank method specializes the Page Rank method for use on patterns. Pattern documents in the PatternRank method correspond to Web pages in the Page Rank method, and references to other patterns correspond to hyperlinks in the Page Rank method. Also, as described in section 2.2, the PatternRank method defines importance according to the following rules:

Rule 1: The more patterns that reference a pattern, the more important it is.
Rule 2: The more important the patterns referencing a pattern are, the more important the pattern itself is.

Through rule 1, an ordering appropriate for application or for studying is given to a group of patterns using the importance value for each pattern: the higher the importance value of a pattern, the more likely it is a general-use pattern and a base for other patterns, and the more likely it is used together with other patterns. For example, the Factory Method pattern in the GoF design-pattern catalog references the Template Method pattern in the implementation of its solution. Here, from both the application and the learning perspectives, if the Template Method pattern is learned first, the Factory Method pattern will be easier to understand. Hence, the Template Method pattern can be considered to be more important.

Rule 2 allows the importance of the referencing patterns to be reflected in the importance of the referenced pattern. As an example, if rule 2 is not used and importance is calculated based only on the simple number of references, then the difference between a pattern, $p_1$, referenced only by unimportant patterns and a pattern, $p_2$, referenced by the same number of patterns of much higher importance would not be expressed. Applying rule 2 here correctly shows that pattern $p_2$ is more important than $p_1$.

An example of computing importance using the PatternRank method is shown in Figure 3. Pattern $p_1$ references pattern $p_2$ and pattern $p_4$, while pattern $p_3$ references $p_4$. Pattern $p_1$ has importance value 0.5, while $p_3$ has importance 0.1. In the PatternRank method, first a weighting for each reference edge is computed from the importance of the referencing pattern and the number of reference edges.
For example, the weights of reference edges $e_{12}$ and $e_{14}$ from the referencing pattern $p_1$ are both 0.25, because there are two edges and 0.5 / 2 = 0.25. Similarly, the weighting for reference edge $e_{34}$ from $p_3$ is 0.1. The proportion of a pattern's importance that is distributed to each of its reference edges as a weighting is called the distribution ratio. Then the importance of each pattern is computed as the total of the weightings of its incoming reference edges. For example, the importance for pattern $p_4$ is 0.25 + 0.1 = 0.35, and for $p_2$ it is 0.25.

Figure 3. A step of importance propagation

Figure 4. Convergence process of importance

The propagation of importance values through the PatternRank computation is shown in Figure 4. The importance values of all patterns are initially set to total one, with this importance allocated evenly to all patterns, so in Figure 4, each pattern is initially assigned an importance of 0.25. Then, the reference edge weightings are calculated by dividing the importance of the referencing pattern by the number of reference edges. The State pattern references two other patterns, so the weights for the two edges are both 0.25 / 2 = 0.125. Then the pattern importance values are re-calculated from the reference edges by totaling the weightings of the incoming reference edges. In Figure 4, the Fixed Sized Buffer pattern has two incoming edges, so its importance is calculated as [...]. This process is repeated, and the importance value for each pattern converges on a fixed value. In Figure 4, the State pattern converges to [...], and Fixed Sized Buffer to [...].

In the example in Figure 4, all of the patterns reference others, but if there are patterns that do not reference other patterns, the importance weighting from those patterns does not propagate. Weighting does propagate to these patterns from those that reference them, though, so the total weighting decreases by the weighting of these patterns with each iteration, preventing the computation from converging. We discuss how this problem is handled in detail in the section on patterns with no references.
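The propagation step and its repetition can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names, the `tolerance` and `max_iterations` parameters, and the three-pattern graph used for the iteration are assumptions made for the example. The text does not give the current importance of $p_2$ and $p_4$ in Figure 3, so the sketch uses 0.0 as a placeholder (it does not affect the step, because neither pattern has outgoing edges in the example).

```python
def propagate_once(importance, references):
    """Redistribute importance along reference edges once.

    importance: dict mapping pattern name -> current importance value
    references: dict mapping pattern name -> list of referenced pattern names
    """
    incoming = {pattern: 0.0 for pattern in importance}
    for source, targets in references.items():
        if not targets:
            continue  # nothing propagates from a pattern without references
        # Each outgoing edge gets an equal share of the source's importance.
        edge_weight = importance[source] / len(targets)
        for target in targets:
            incoming[target] += edge_weight
    return incoming


def iterate(references, tolerance=1e-10, max_iterations=1000):
    """Repeat the propagation step, starting from an even allocation."""
    patterns = list(references)
    importance = {p: 1.0 / len(patterns) for p in patterns}
    for _ in range(max_iterations):
        updated = propagate_once(importance, references)
        change = max(abs(updated[p] - importance[p]) for p in patterns)
        importance = updated
        if change < tolerance:
            break
    return importance


# Figure 3: p1 (importance 0.5) references p2 and p4; p3 (0.1) references p4.
# The 0.0 values for p2 and p4 are placeholders, not values from the paper.
figure3_importance = {"p1": 0.5, "p2": 0.0, "p3": 0.1, "p4": 0.0}
figure3_references = {"p1": ["p2", "p4"], "p2": [], "p3": ["p4"], "p4": []}
figure3_step = propagate_once(figure3_importance, figure3_references)
print(round(figure3_step["p2"], 6), round(figure3_step["p4"], 6))

# Hypothetical graph in which every pattern references at least one other.
example_references = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
print({p: round(w, 6) for p, w in iterate(example_references).items()})
```

For the Figure 3 values, the step gives $p_2$ an importance of 0.25 and $p_4$ an importance of 0.35, as in the text. Patterns without outgoing references are simply skipped here, which is exactly the situation in which the total weight shrinks from one iteration to the next.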

3.1.1 Pattern-Specific importance correction

With the Page Rank and Component Rank methods, importance is computed based only on hyperlinks or dependency relationships between components. However, if pattern importance is computed based only on inter-pattern reference relationships, the following two problems arise.

- If the same pattern document is published on different Web sites, it will be treated as a different pattern.
- In cases where patterns in the same catalog are used together frequently even though there is no explicit reference relationship between them, they will be treated as though there is no relationship between them.

For the former, we take the approach of unifying pattern documents that can be considered the same before performing the importance computation. We discuss details of how this is done in the section on patterns from different sources that are the same. For the latter, concerning patterns in the same catalog, we classify inter-pattern relationships handled by the PatternRank method into two types as described below.

Pattern-Reference Relationship

A relationship between patterns arising from one pattern referencing another, as described in the pattern document by the author. For two patterns that have a pattern-reference relationship, the pattern making the reference is called the referrer and the one being referred to is called the referee. When one pattern references another pattern, it could indicate that the solutions the patterns provide are similar in software structure, that one pattern is a derivative of the other (handling a more specialized problem), or that one pattern uses the other to provide its solution. In this paper, if a pattern $p_1$ makes reference to another pattern $p_2$, we say that a pattern-reference relationship exists from $p_1$ to $p_2$. Note that we do not limit this to items mentioned in the Related Patterns section of the pattern format, but also consider any pattern names appearing in the entire text of the document to indicate a pattern-reference relationship.
Same-Catalog Relationship

If two patterns, $p_1$ and $p_2$, belong to the same pattern catalog, we say they have a same-catalog relationship. Pattern catalogs can be seen as collections of patterns that have mutual relationships among them. Even if patterns in the same catalog do not have an explicit relationship with each other, they are often used together. For example, there is no explicit relationship between the Composite and Proxy patterns from the GoF design-pattern catalog [3], but they are often used together. The PatternRank method treats patterns in the same catalog as being related for this reason.

Computation model for the PatternRank Method

Below, we provide a mathematical description of the pattern-importance calculation described above. We express a set of patterns as a directed graph, $G = (P, E)$, consisting of a set, $P$, of vertices, $p$, each representing a pattern, and a set, $E$, of directed edges, $e$, each representing a reference relationship between patterns. The importance value for each pattern is expressed as the weight, $w(p)$, of the corresponding vertex, $p$. By definition we set the total weight of all vertices in the graph to 1, and $0 < w(p) \le 1$ for all vertices at all times.

We express the weighting of edge $e_{ij}$ from vertex $p_i$ to vertex $p_j$ as:

$$w'(e_{ij}) = d_{ij} \times w(p_i)$$

The distribution ratios, $d_{ij}$, satisfy $\sum_{j} d_{ij} = 1$ and $0 \le d_{ij} \le 1$.

The weight of each vertex, $p_i$, is set equal to the sum of the weights of all edges $e_{ki}$ in the set $IN(p_i)$ of edges terminating at $p_i$. This can be expressed as:

$$w(p_i) = \sum_{e_{ki} \in IN(p_i)} w'(e_{ki})$$

Combining this with the definition of distribution ratios gives the weighting for each vertex as:

$$w(p_i) = \sum_{e_{ki} \in IN(p_i)} d_{ki} \times w(p_k)$$

Next, we express the weights of all vertices as a vector, $W$:

$$W = \begin{pmatrix} w(p_1) \\ w(p_2) \\ \vdots \\ w(p_n) \end{pmatrix}$$

And the matrix, $D$, of all distribution ratios as:

$$D = \begin{pmatrix} d_{11} & d_{12} & \cdots & d_{1n} \\ d_{21} & d_{22} & \cdots & d_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ d_{n1} & d_{n2} & \cdots & d_{nn} \end{pmatrix}$$

The computation of the weightings for the entire graph can be expressed in terms of $W$ and $D$ as:

$$W_{t+1} = D^{\mathsf{T}} W_t$$

$D^{\mathsf{T}}$ is the transpose of matrix $D$, and the subscript $t$ indicates the iteration number of the computation. The above calculation is the same as computing an eigenvector of a matrix in linear algebra, and can be carried out iteratively. An iterative method is used in which the initial vector, $x^{(0)}$, is set to some reasonable value, $x^{(t+1)} = A y^{(t)}$, and $y^{(t)} = x^{(t)} / c^{(t)}$. As $t \to \infty$, $x$ converges to the eigenvector with the largest eigenvalue, and $c$ converges to that largest eigenvalue [21]. Here, $A$ is an $N$-dimensional square matrix. From the Perron-Frobenius theorem, the absolute value of the maximum eigenvalue of a transition probability matrix is 1, so $y^{(t)} = x^{(t)}$ and $x^{(t+1)} = A x^{(t)}$.

In the PatternRank method, the importance values for each pattern are given by the values of each element of vector $W$ when it has converged to a fixed value as $t \to \infty$. Since an unlimited number of iterations is not possible, the computation is terminated when

$$\frac{|w_t - w_{t-1}|}{|w_{t-1}|} \le s$$

for a particular threshold value, $s$. Here $w_t$ are the pattern vertex weights for the $t$-th iteration of the calculation. Appropriate values for the threshold value $s$ are derived experimentally.

The distribution ratios, $d_{ij}$, are computed taking into account same-catalog relationships and independent patterns that lack pattern-reference relationships. First, the distribution ratios based only on pattern-reference relationships, $d^{S}_{ij}$, are given by:

$$d^{S}_{ij} = 1 / d_{sum}$$

Here $d_{sum}$ is the number of reference edges originating from the pattern vertex $p_i$. Then, an overall expression for distribution ratios accounting for same-catalog relationships, $d^{C}_{ij}$, is given by:

$$d^{C}_{ij} = \begin{cases} d^{c}_{ij}, & \text{where } p_i \text{ and } p_j \text{ belong to the same pattern catalog} \\ d^{\bar{c}}_{ij}, & \text{otherwise} \end{cases}$$

$$d^{c}_{ij} = d^{S}_{ij} + d^{c'}_{ij}$$

$$d^{c'}_{ij} = c \times \sum_{k \notin C} d^{S}_{ik} / (c_{sum} - 1)$$

$$d^{\bar{c}}_{ij} = (1 - c) \times d^{S}_{ij}$$

Then overall distribution ratios considering matching patterns, $d^{M}_{ij}$, are given by:

$$d^{M}_{ij} = \begin{cases} d^{mo}_{ij}, & \text{where } p_i \text{ and } p_j \text{ are referenced by the same pattern} \\ d^{mi}_{ij}, & \text{where } p_i \text{ and } p_j \text{ refer to the same pattern} \\ d^{\bar{m}}_{ij}, & \text{otherwise} \end{cases}$$

$$d^{mo}_{ij} = \sum_{k \in \{m\}} d^{C}_{kj} / m_{sum}$$

$$d^{mi}_{ij} = \sum_{k \in \{m\}} d^{C}_{ik}$$

$$d^{\bar{m}}_{ij} = d^{C}_{ij}$$

Distribution ratios, $d_{ij}$, taking into consideration patterns that are not referenced by other patterns, and patterns that do not reference other patterns, are given by:

$$d_{ij} = \begin{cases} d^{r}_{ij}, & \text{where } p_i \text{ has any reference to another pattern} \\ d^{\bar{r}}_{ij}, & \text{otherwise} \end{cases}$$

$$d^{r}_{ij} = (1 - r) \times d^{M}_{ij} + d'_{ij}$$

$$d^{\bar{r}}_{ij} = (1 - r) \times d''_{ij} + d'_{ij}$$

$$d''_{ij} = 1 / n$$

$$d'_{ij} = r / n$$

The terms in each equation are discussed further in the following sections.

Pattern-catalog distribution ratios

Even if there is no explicit relationship between patterns in pattern documents that belong to the same pattern catalog, these patterns are often used together. While we can conceive of such a relationship even when it is not stated explicitly, calculations for methods like Page Rank and Component Rank, which are based on relationships between elements, are not able to reflect relationships like this in the importance value. The PatternRank method approaches this problem by assigning pseudo-edges, $e^{c'}$, between patterns that belong to the same pattern catalog, and giving these pseudo-edges a weighting that is weaker than that of regular reference edges. The weightings assigned to these pseudo-edges are created by subtracting a fixed proportion of the weighting from the weightings of reference edges going from patterns inside the catalog to patterns outside the catalog (Figure 5). This proportion is defined as the pattern catalog distribution ratio. The overall distribution ratios for same-catalog relationships between patterns in a catalog are $d^{C}$, the reference-relation distribution ratios between patterns with same-catalog relationships are $d^{c}$, and the distribution ratios for references to patterns outside the catalog are $d^{\bar{c}}$.
Also, for distribution ratios in the distribution ratio matrix in the equation, $d^{S}$, based on pattern-reference relationships only, we define those on which pattern catalog distribution ratio computation was done as the pattern catalog distribution matrix. The distribution ratios taking same-catalog relationships into account are shown below.
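As a rough illustration of the pseudo-edge scheme, the following Python sketch derives catalog-adjusted ratios from reference-only ratios: a fraction $c$ of the ratio going to patterns outside the catalog is removed and shared equally among the other patterns of the same catalog. It is a sketch under stated assumptions, not the authors' code. The function names, the graph, and the catalog assignment are hypothetical. The sketch also assumes that each pattern belongs to exactly one catalog, that no pseudo-edge is added from a pattern to itself, and that a catalog containing a single pattern keeps its reference-only ratios unchanged.

```python
def reference_ratios(references, patterns):
    """Reference-only ratios: 1 / (number of outgoing reference edges)."""
    ratios = {i: {j: 0.0 for j in patterns} for i in patterns}
    for i in patterns:
        targets = references.get(i, [])
        for j in targets:
            ratios[i][j] = 1.0 / len(targets)
    return ratios


def catalog_ratios(ref_ratios, catalog_of, c):
    """Apply the pattern catalog distribution ratio c to reference-only ratios."""
    patterns = list(ref_ratios)
    result = {i: {} for i in patterns}
    for i in patterns:
        c_sum = sum(1 for p in patterns if catalog_of[p] == catalog_of[i])
        # Total ratio that p_i sends to patterns outside its own catalog.
        outside = sum(ref_ratios[i][k] for k in patterns
                      if catalog_of[k] != catalog_of[i])
        for j in patterns:
            same_catalog = catalog_of[j] == catalog_of[i]
            if c_sum == 1 or (same_catalog and j == i):
                # Assumption: no pseudo-edge for singleton catalogs or self-loops.
                result[i][j] = ref_ratios[i][j]
            elif same_catalog:
                # Reference ratio plus an equal share of the deducted proportion.
                result[i][j] = ref_ratios[i][j] + c * outside / (c_sum - 1)
            else:
                result[i][j] = (1 - c) * ref_ratios[i][j]
    return result


# Hypothetical data: p1, p2, p3 share a catalog; p4 is in another one.
patterns = ["p1", "p2", "p3", "p4"]
catalog_of = {"p1": "C1", "p2": "C1", "p3": "C1", "p4": "C2"}
references = {"p1": ["p4"], "p2": ["p1"], "p3": ["p1"], "p4": ["p1"]}

d_s = reference_ratios(references, patterns)
d_c = catalog_ratios(d_s, catalog_of, c=0.25)
print(d_c["p1"])
```

In this hypothetical graph, $p_1$ references only a pattern outside its catalog, so with $c = 0.25$ each of the two pseudo-edges from $p_1$ receives a ratio of 0.125 and the outside reference keeps 0.75, leaving the ratios from $p_1$ summing to 1.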

Figure 5. Edges and pseudo-edges

Table 1. Calculated importance without pattern catalog distribution ratios

Rank  Pattern  Catalog  Importance
1     p_2      [...]    [...]
2     p_5      [...]    [...]
3     p_1      [...]    [...]
4     p_6      [...]    [...]
5     p_7      [...]    [...]
6     p_3      [...]    [...]
7     p_4      [...]    [...]

The overall distribution ratios, $d^{C}_{ij}$, are:

$$d^{C}_{ij} = \begin{cases} d^{c}_{ij}, & \text{where } p_i \text{ and } p_j \text{ belong to the same pattern catalog} \\ d^{\bar{c}}_{ij}, & \text{otherwise} \end{cases}$$

The distribution ratio, $d^{c}_{ij}$, for when $p_i$ and $p_j$ belong to the same pattern catalog is:

$$d^{c}_{ij} = d^{S}_{ij} + d^{c'}_{ij}$$

The distribution ratio, $d^{c'}_{ij}$, for pseudo-edges $e^{c'}$ is:

$$d^{c'}_{ij} = c \times \sum_{k \notin C} d^{S}_{ik} / (c_{sum} - 1)$$

The distribution ratio, $d^{\bar{c}}_{ij}$, for points $p_i$ and $p_j$ with no same-catalog relationship is:

$$d^{\bar{c}}_{ij} = (1 - c) \times d^{S}_{ij}$$

$d^{S}_{ij}$ is the distribution ratio calculated based only on pattern-reference relationships, $d^{c'}_{ij}$ is the distribution ratio for pseudo-edge $e^{c'}$, $c$ is the pattern catalog distribution ratio, $c_{sum}$ is the total number of patterns in the pattern catalog, and $\sum_{k \notin C} d^{S}_{ik}$ is the total of distribution ratios to patterns that are not in the same catalog. Since the distribution ratios for reference relationships between patterns that belong to the same catalog are shared as part of the total weight, the condition $\sum_{j} d_{ij} = 1$ is still satisfied after the distribution ratios are allocated.

As an example of computing distribution ratios with same-catalog relationships, consider the weighting $w(e^{c}_{14})$ from pattern $p_1$ to pattern $p_4$ in Figure 5. If the pattern catalog distribution ratio is 0.25, the weight of the pseudo-edge, $w(e^{c'}_{14})$, follows from the reference-edge equation $w(e^{c}_{ij}) = w(p_i) \times d^{S}_{ij} + w(e^{c'}_{ij})$. Using the equations $w(e^{c'}_{14}) = w(p_1) \times d^{c'}_{14}$ and $d^{c'}_{14} = [\ldots] / (3 - 1) = 0.125$, the weight of the pseudo-edge $e^{c'}_{14}$ is $w(e^{c'}_{14}) = [\ldots]$. Considering the weight $w(e^{c}_{43})$ of the reference edge from pattern $p_4$ to pattern $p_3$ in the same way, we have the equation $w(e^{c}_{43}) = w(p_4) \times d^{S}_{43} + w(e^{c'}_{43})$. From $w(p_4) \times d^{S}_{43} = [\ldots] = 0.05$ and $d^{c'}_{43} = [\ldots] / (3 - 1) = [\ldots]$, we have $w(e^{c'}_{43}) = w(p_4) \times d^{c'}_{43} = [\ldots]$, so the reference edge weight is $w(e^{c}_{43}) = [\ldots]$.

Table 2. Calculated importance with pattern catalog distribution ratios

Rank  Pattern  Catalog  Importance
1     p_5      [...]    [...]
2     p_2      [...]    [...]
3     p_7      [...]    [...]
4     p_6      [...]    [...]
5     p_1      [...]    [...]
6     p_3      [...]    [...]
7     p_4      [...]    [...]

By applying the distribution ratios that take same-catalog relationships into account to the pattern importance computation, the importance of other patterns in pattern catalogs containing many important patterns is raised overall. For example, if the importance computation is performed until convergence without using pattern catalog distribution ratios for Figure 5, the importance values for each pattern would be as shown in Table 1. Table 2 shows the corresponding importance values when using pattern catalog distribution ratios (with pattern catalog distribution ratio $c = 0.15$, correction distribution ratio $r = 0.15$, and threshold value $s = 10^{-8}$). Table 1 shows that pattern catalog $C_2$ has more patterns with high importance than does pattern catalog $C_1$. Also, comparing Tables 1 and 2, use of the pattern catalog distribution ratio raised the rank of patterns $p_6$ and $p_7$ in pattern catalog $C_2$.

Consideration of patterns from different sources that are the same

For more well-known patterns, there are cases where several different pattern documents written by different authors have been published for the same pattern. In the PatternRank method, the importance calculation is done after pattern documents written by different authors for the same pattern have been merged (Figure 6). The overall distribution ratio considering patterns that are the same is $d^{M}$, the ratio when the referenced patterns are the same is $d^{mi}$, and the ratio when the referencing patterns are the same is $d^{mo}$. When neither the referencing nor the referenced patterns are the same, the distribution ratio is $d^{\bar{m}}$. Expressions defining these are given below.

Figure 6. Concatenation of pattern nodes

Overall distribution ratio considering patterns that are the same (matching), $d^{M}_{ij}$:

$$d^{M}_{ij} = \begin{cases} d^{mo}_{ij}, & \text{where } p_i \text{ and } p_j \text{ are referenced by the same pattern} \\ d^{mi}_{ij}, & \text{where } p_i \text{ and } p_j \text{ refer to the same pattern} \\ d^{\bar{m}}_{ij}, & \text{otherwise} \end{cases}$$

Reference edge distribution ratios, $d^{mo}_{ij}$, when multiple referencing patterns, $p_i$, match:

$$d^{mo}_{ij} = \sum_{k \in \{m\}} d^{C}_{kj} / m_{sum}$$

Reference edge distribution ratios, $d^{mi}_{ij}$, when multiple referenced patterns, $p_j$, match:

$$d^{mi}_{ij} = \sum_{k \in \{m\}} d^{C}_{ik}$$

Reference edge distribution ratio, $d^{\bar{m}}_{ij}$, when neither referencing nor referenced patterns match:

$$d^{\bar{m}}_{ij} = d^{C}_{ij}$$

$\sum_{k \in \{m\}} d^{C}_{kj}$ is the total of all reference-edge distribution ratios when multiple referencing patterns, $p_i$, match, $\sum_{k \in \{m\}} d^{C}_{ik}$ is the total of all reference-edge distribution ratios when multiple referenced patterns, $p_j$, match, and $m_{sum}$ is the number of patterns that match.

Consider pattern $p_3$ in Figure 6 as an example of computing the distribution ratios taking matching patterns into account. Pattern $p_1$ references the pattern document containing pattern $p_3$, and pattern $p_2$ references a different document containing $p_3$. In the PatternRank method, first the patterns that are the same and their associated reference edges are merged. The new $p_3$ has two incoming edges with weights of 0.5 and 0.1, and the new weighting of $p_3$ is the total of these two, or 0.6. The method for determining whether two patterns are the same is described in detail in Section 4.
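The merging of matching pattern documents can be sketched as an operation on a table of distribution ratios. The Python example below is one possible reading of the expressions above, not the authors' implementation: ratios on edges into a merged group are summed, and ratios on edges out of a merged group are summed and divided by the number of merged documents. The function name `merge_matching`, the `group_of` mapping (which stands in for the matching procedure of Section 4), the document names, and the outgoing references of the two $p_3$ documents are all assumptions for illustration.

```python
def merge_matching(ratios, group_of):
    """Merge documents that describe the same pattern.

    ratios:   dict of dicts, ratios[i][j] = distribution ratio from i to j
    group_of: dict mapping each document name -> name of the merged pattern
    """
    groups = {}
    for doc in ratios:
        groups.setdefault(group_of[doc], []).append(doc)

    merged = {}
    for source, source_docs in groups.items():
        merged[source] = {}
        for target, target_docs in groups.items():
            # Sum over all edges between the two groups ...
            total = sum(ratios[k][l] for k in source_docs for l in target_docs)
            # ... and average over the merged referencing documents.
            merged[source][target] = total / len(source_docs)
    return merged


# Two documents, p3_a and p3_b, describe the same pattern p3 (as in Figure 6).
# p1 references p3_a and p2 references p3_b; the outgoing references of the
# two p3 documents are hypothetical.
docs = ["p1", "p2", "p3_a", "p3_b"]
ratios = {i: {j: 0.0 for j in docs} for i in docs}
ratios["p1"]["p3_a"] = 1.0
ratios["p2"]["p3_b"] = 1.0
ratios["p3_a"]["p1"] = 1.0
ratios["p3_b"]["p2"] = 1.0
group_of = {"p1": "p1", "p2": "p2", "p3_a": "p3", "p3_b": "p3"}

merged = merge_matching(ratios, group_of)

# With importance 0.5 for p1 and 0.1 for p2, the merged p3 receives both edges.
importance = {"p1": 0.5, "p2": 0.1}
incoming_p3 = sum(importance[p] * merged[p]["p3"] for p in importance)
print(round(incoming_p3, 6))
```

With importance values of 0.5 for $p_1$ and 0.1 for $p_2$, the merged $p_3$ receives a total incoming weight of 0.6, as in the Figure 6 example. Averaging over the merged referencing documents keeps the ratios leaving each merged pattern summing to 1.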
Patterns that are not referenced

For patterns that are not referenced by other patterns, the total weight of incoming reference edges is zero, so their importance value goes to zero. For example, since the weighting for a pattern like pattern $p_1$ is computed from the total weighting of all incoming reference edges, its importance value will become zero. If there are patterns with an importance value of zero, the weighting for any reference edges originating from those patterns will also be zero.

Figure 7. Importance propagation including zero importance

For the pattern reference relationships shown in the example in Figure 7, the totals of the reference-edge weights for patterns $p_1$ and $p_2$ are both 0.1, even though $p_2$ has two references from patterns with importance zero. This contradicts the first rule of the PatternRank method, that the more references a pattern has, the higher its importance. To account for this problem, the PatternRank method adds pseudo-edges, e, between all patterns and gives these pseudo-edges a small weighting. The weights for these pseudo-edges are obtained by deducting a fixed proportion from the total weight of all of the pattern documents. Figure 8 shows how, by adding pseudo-edges to the whole pattern set, references are added to pattern $p_2$, which previously had none. In this way, there are no longer any unreferenced patterns, and thus no patterns with an importance of zero. The proportion deducted from the total weighting is called the correction distribution ratio. We define the final overall distribution ratios, taking into consideration patterns that are not referenced by other patterns and patterns that do not reference other patterns, as $d$. Expressions for distribution ratios taking unreferenced patterns into consideration are given below.

Figure 8. Fixing patterns without incoming reference

Figure 9. Fixing patterns without outgoing reference

Distribution ratios, $d$, taking into consideration patterns that are not referenced by other patterns, and patterns that do not reference other patterns:

$$
d =
\begin{cases}
d_r, & \text{where } p_i \text{ has any reference to another pattern} \\
d_{\bar{r}}, & \text{otherwise}
\end{cases}
$$

The distribution ratio, $d_r$, when $p_i$ references $p_j$ is:

$$
d_r = (1 - r)\, d_M + d_e
$$

The distribution ratio, $d_e$, for pseudo-edge e is:

$$
d_e = \frac{r}{n}
$$

Here $n$ is the total number of patterns in the set of pattern documents for the computation, $d_e$ is the distribution ratio for pseudo-edge e, and $r$ is the correction distribution ratio, satisfying $0 \le r \le 1$. In this way, patterns that would otherwise have an importance value of zero are given a non-zero importance value, but if the correction distribution ratio, $r$, is made small enough, these weights converge to values that are small enough not to affect the relative rank of the patterns.

Patterns with no references

If there are patterns that do not reference other patterns, the distribution ratios on reference edges from these patterns to other patterns go to zero, so that the $\sum_i d = 1$ constraint no longer holds, and the overall weighting computation described in Section 3.2 will no longer converge. For example, if a pattern, $p_1$, does not reference other patterns, it gets the total weighting of all edges referencing $p_1$, but the distribution ratios for any edges originating at $p_1$ are zero, so with each iteration the total weighting decreases by the amount of all edges ending at $p_1$. The approach taken by the PatternRank method for this problem is to add pseudo-edges, e, from patterns that reference no other patterns to all other patterns. In Figure 9, pattern $p_2$ has no references to other patterns, so pseudo-edges, e, are added from pattern $p_2$ to all other patterns in the pattern set.
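The pseudo-edge correction can be sketched as a transformation of a distribution matrix. This is a minimal sketch, not the authors' implementation: the function name and the example matrix are invented, each row is assumed to receive n pseudo-edges (including one to itself) so that it matches the ratio r/n, and the uniform 1/n row for patterns with no references is an assumption chosen so that every row sums to 1, not a formula taken from the paper.

```python
def corrected_distribution(d_m, r):
    """Apply the pseudo-edge correction to a distribution matrix.

    d_m[i][j] is the share of p_i's weight passed to p_j before correction;
    a row of zeros means p_i references no other pattern.
    """
    n = len(d_m)
    d_e = r / n  # pseudo-edge distribution ratio, r / n
    corrected = []
    for row in d_m:
        if any(value > 0 for value in row):
            # p_i references other patterns: (1 - r) * d_M + d_e
            corrected.append([(1 - r) * value + d_e for value in row])
        else:
            # Assumption: a pattern with no references spreads its weight
            # evenly over all n patterns so that the row still sums to 1.
            corrected.append([1.0 / n] * n)
    return corrected


d_m = [
    [0.0, 0.5, 0.5],  # p1 references p2 and p3
    [0.0, 0.0, 1.0],  # p2 references p3
    [0.0, 0.0, 0.0],  # p3 references nothing
]
d = corrected_distribution(d_m, r=0.15)
for row in d:
    print([round(v, 3) for v in row], round(sum(row), 6))
```

With this correction every entry is positive, so no pattern is left without incoming weight, and every row sums to 1, so no weight is lost between iterations.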
By adding pseudo-edges from patterns that reference no other patterns, there are no longer any patterns that do not reference other patterns, making it possible to avoid the problem of a decrease in the overall weight. The distribution ratio for reference edges on patterns that reference other patterns is defined as $d_r$, and for patterns that do not reference other patterns, as $d_{\bar{r}}$. Expressions for distribution ratios taking patterns with no references into consideration are given below.

Distribution ratio for pattern, $p_i$, which references other patterns:

$$
d_r = (1 - r)\, d_M + d_e
$$

Distribution ratio for pattern, $p_i$, which does not reference other patterns:

$$
d_{\bar{r}} = (1 - r)\, d_e + d_e
$$

where $d_e$ is the distribution ratio of e. These expressions impose a relatively strong relationship between patterns that would otherwise not be related, but we avoid affecting any particular pattern by giving all pseudo-edges equivalent weighting while representing the state where there are no references to other patterns. These computations, taking into consideration patterns that are not referenced or that do not reference others, are applied to the distribution matrix discussed in Section 3.1. We call the matrix with these corrected distribution ratios applied the corrected distribution ratio matrix.

Software pattern importance computation process

The distribution ratios, $d$, in the distribution ratio matrix are obtained by first computing the distribution ratios for pattern reference relationships and then applying pattern catalog distribution ratios, same-pattern, and corrected distribution ratio processing.

Search system using the PatternRank method

The process for using the proposed system, which applies the PatternRank method, is shown below (Figure 10).

Figure 10. System Overview

1. The user selects multiple pattern catalogs or all pattern documents in the repository as the search space.
2. The system analyzes all pattern documents in the repository (the set of all pattern documents gathered) and extracts the pattern names, the names of the pattern catalogs they belong to, and the names of related patterns, and obtains the pattern relationships from the extracted data. If the same pattern appears with different names in the pattern documents, the system treats it as a pattern that can be referenced by any of the names that appear.
3. The user enters an arbitrary search query string into the system, appropriate to the pattern required.
4. The system performs the importance calculation on the collection of pattern documents, calculating the distribution ratio matrix, the pattern catalog distribution ratio matrix, and the corrected distribution ratio matrix from the extracted pattern relationships. The result is an importance value for each pattern.
5. The system finds the pattern documents that contain the keyword string entered by the user. This is a simple full-text search.
6. The system sorts the search results according to the importance values and presents them to the user. Patterns that have the same importance value are displayed in alphabetical order of the pattern name.

Using the system, users obtain a set of pattern documents that contain the search query, sorted in order of importance, allowing them to efficiently find patterns that meet their objectives or provide useful references. For example, in Figure 10, a user needs to solve a problem related to generating an object. The user enters the search query, "factory", to find a pattern that will provide a reference for solving this problem, and obtains a result like that shown in Figure 11. The proposed system presents 29 pattern documents, like the Abstract Factory, that contain the query string, in order of importance, as the search result.

Figure 11. System Overview

The first result, Abstract Factory, has several sources listed after "from", indicating that the pattern is published in several different pattern documents by several different authors. Also, the pattern name is a link to the Web site with the pattern, and the alternate sources listed under "from" are also links to the Web sites providing them. The proposed system can also present all of the patterns in the repository in order of the computed importance if no search query is entered. This provides an effective way for a user to examine and study all patterns in the repository.

4. Preliminary Experiment and Demo

The proposed system was implemented, and experiments were conducted to verify parameter values and evaluate the usefulness of the system.

Preliminary Experiments for Parameter Verification

The parameter verification involved preliminary experiments to determine appropriate values for the various parameters used in the importance calculation of the PatternRank method. For Experiments 1 to 3, we prepared answer set 1 with four patterns and answer set 2 with 12 patterns.

Experiment 1: Determine an appropriate value for $s$ in the importance-calculation convergence condition, $\left| \frac{w_t - w_{t-1}}{w_{t-1}} \right| \le s$. Importance values for all patterns in each of the answer sets, 1 and 2, as well as the entire set of 131 pattern documents in the repository, gathered from the pattern catalogs from four Web sites [2, 5, 6, 18], were

computed using values of $s$ starting from 0.1 and progressing through 0.01, 0.001, and so on. The value of the correction distribution ratio, $r$, was set to 0.15, the pattern catalog distribution ratio, $c$, to 0.15, and the calculation was done with no search query. In the experiment, the importance values were observed to stop changing when the value of $s$ reached $10^{-5}$ for answer set 1, and $10^{-7}$ for answer set 2 and the set of 131 pattern documents. From the experiment, we conclude that the $s$ value should be set to $10^{-8}$ or less for the importance value calculation to converge.

Experiment 2: Determining an appropriate value for the correction distribution ratio, $r$. Importance values were calculated for the patterns in answer sets 1 and 2, setting $r$ to values from 0.0 to 1.0. The stopping threshold, $s$, was set to $10^{-8}$, the pattern catalog distribution ratio, $c$, was set to 0.15, and the calculation was done with no search query. Large changes were observed in the resulting pattern importance values when the value of $r$ was set to greater than 0.2. It is desirable that the effect of $r$ on importance values be as small as possible, so from the experiment we conclude that the value of $r$ must be in the range $0.0 < r \le 0.2$.

Experiment 3: Determining an appropriate value for the pattern catalog distribution ratio, $c$. Importance values were calculated for the patterns in answer sets 1 and 2, setting the pattern catalog distribution ratio, $c$, to values from 0.0 to 1.0. The stopping threshold, $s$, was set to $10^{-8}$, the correction distribution ratio, $r$, was set to 0.15, and the calculation was done with no search query. The experiment results show that as the pattern catalog distribution ratio is set higher, the importance of the three patterns in Catalog 1 decreases, while that of the other patterns increases.
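The role of the stopping threshold $s$ from Experiment 1 can be illustrated with a short iteration sketch. It assumes a PageRank-style update in which each pattern's weight is redistributed along the rows of a corrected distribution ratio matrix; the update rule, the function name, the uniform starting weights, and the example matrix are assumptions made for illustration, since the paper's own computation is defined in its Section 3.2.

```python
def compute_importance(d, s=1e-8, max_iterations=10_000):
    """Iterate w_t = w_{t-1} * d until the largest relative change is <= s.

    d[i][j] is the share of pattern i's weight passed to pattern j;
    every entry is assumed positive and every row is assumed to sum to 1.
    """
    n = len(d)
    w = [1.0 / n] * n  # assumed starting point: equal weight for all patterns
    for _ in range(max_iterations):
        new_w = [sum(w[i] * d[i][j] for i in range(n)) for j in range(n)]
        # Relative change per pattern, |w_t - w_{t-1}| / w_{t-1}
        change = max(abs(a - b) / b for a, b in zip(new_w, w))
        w = new_w
        if change <= s:
            break
    return w


# Hypothetical corrected matrix for three patterns (rows sum to 1).
d = [
    [0.05, 0.475, 0.475],
    [0.05, 0.05, 0.90],
    [1 / 3, 1 / 3, 1 / 3],
]
w = compute_importance(d, s=1e-8)
print([round(value, 4) for value in w])
```

A smaller `s` requires more iterations before the loop stops, which is the trade-off that Experiment 1 examines.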
The results of Experiment 3 also made clear that, as the value of $c$ changes, the pattern importance values change together according to which pattern catalog they belong to. The experiments confirmed that it is possible to change the importance of patterns according to catalog by changing the pattern catalog distribution ratio value. However, they did not indicate a clear appropriate value for $c$; $c$ can be used to specify the importance of same-catalog pattern relationships.

Demonstration

In demonstrations 1 and 2, an existing Web site with a set of patterns was modified to perform searches using the proposed system. Pattern documents conforming to the search query were studied manually beforehand and used as an answer set. We also measured the time from when the search query is entered to when the result is displayed for each demonstration (the program was run ten times and the average time in milliseconds was taken). The demonstration environment was a PC with a 3.2 GHz Pentium 4 and 1.0 GB of RAM, running Microsoft Windows Home Edition.

For the proposed method, we treat patterns as the same if the content of the pattern documents is the same. However, because it is difficult to determine automatically whether the content of two documents is the same, for the purposes of these experiments we assumed that if the pattern names were the same, the patterns were also the same. Adopting this approach has two problems: (1) patterns that are the same but have different names will be treated as different patterns, and (2) patterns that are different but have the same name will be treated as the same pattern. In practice, however, most patterns are given a specific name that is particular to the problem area to which the pattern applies, so there are very few cases where patterns have the same name but different content.
Among the 131 pattern documents used in demonstrations 1 and 2, there were 23 groups of identical patterns provided on multiple Web sites, and in all cases where the name was the same, the content was also the same. In light of this, treating patterns as the same when their names are the same yields an easy and practical implementation that can be expected to be reasonably correct, given the characteristics of pattern names. This approach was used in demonstrations 1 and 2.

Demo 1: Assuming a scenario where a system developer requires patterns that can be used in designing GUI windows, we entered the search query "window" into the proposed system, against the set of 131 pattern documents in the repository, gathered from pattern catalogs on four Web sites [2, 5, 6, 18]. The pattern document set contains four pattern documents with the search query ("window"), and also two documents that do not contain the search query but can be considered useful for GUI window design (Command pattern [2, 5] and Extensibility pattern [18]). The importance calculation was done using a stopping threshold, $s$, of $10^{-8}$, a correction distribution ratio, $r$, of 0.15, and a pattern catalog distribution ratio, $c$,

of The result is shown in Table 3. The execution time was 156 ms. Table 3 shows that patterns typically used in GUI window design, such as the Decorator pattern, are ranked highly by the method. For example, the Decorator pattern can be used when developing visual GUI components like scroll bars and frames. A few patterns related to network transport windows, such as the Receive Protocol Handler pattern, also appeared in the results, but these received a lower rank because they had few references. We think the Abstract Factory pattern received a lower rank because most of the patterns referencing it do not match the "window" search query. From the demonstration results, we can say that the proposed system can present the multiple-pattern result of a search query, with no semantic information, in a somewhat meaningful order. Without the ordering of results provided by the PatternRank method, users could end up starting their investigations with patterns like Transmit Protocol Handler or Dll Hell, which have low importance. The proposed system is promising as a way to improve software development efficiency by supporting the use of design patterns.

Demo 2: In this demonstration, we assumed a scenario where techniques for embedded software to use memory efficiently are being studied, and a search is done on the proposed system using "memory" as the query. The search query "memory" was entered against the set of 131 pattern documents in the repository, gathered from pattern catalogs on four Web sites [2, 5, 6, 18]. The pattern document set contains nine pattern documents containing the search query ("memory"), and also one document that does not contain the search query but can be considered useful for improving the efficiency of memory use in embedded software (Resource Manager pattern [6]).
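The search step used in both demonstrations (a plain full-text match, followed by sorting on importance with ties broken alphabetically by pattern name) can be sketched as follows. The document texts, importance values, and function name are made up for illustration, and the case-insensitive match is an assumption rather than a documented behavior of the system.

```python
def search_patterns(documents, importance, query):
    """Return names of patterns whose document contains the query, ranked.

    documents: dict mapping pattern name to document text.
    importance: dict mapping pattern name to its importance value.
    """
    needle = query.lower()
    hits = [name for name, text in documents.items() if needle in text.lower()]
    # Higher importance first; equal importance falls back to alphabetical order.
    return sorted(hits, key=lambda name: (-importance[name], name))


# Made-up repository contents and importance values.
documents = {
    "Proxy": "A virtual proxy delays object creation to save memory.",
    "Flyweight": "Shares state between objects to reduce memory use.",
    "Fixed Allocation": "Pre-allocates memory at start-up.",
    "Resource Manager": "Coordinates access to pooled resources.",
}
importance = {
    "Proxy": 0.30,
    "Flyweight": 0.30,
    "Fixed Allocation": 0.10,
    "Resource Manager": 0.25,
}
print(search_patterns(documents, importance, "memory"))
print(search_patterns(documents, importance, ""))
```

Because the match is a simple substring test, a document that lacks the keyword is filtered out regardless of its importance, which is why documents such as the Resource Manager pattern are counted separately above. An empty query matches every document, which corresponds to listing the whole repository in order of importance.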
For Demo 2, the importance calculation was done using a stopping threshold, $s$, of $10^{-8}$, a correction distribution ratio, $r$, of 0.15, and a pattern catalog distribution ratio, $c$, of The result is shown in Table 4. The execution time was 141 ms. In Table 4, several design patterns related to using memory efficiently, such as the Lazy Evaluation, Proxy, and Flyweight patterns, appear with high rank. For example, Virtual Proxy within the Proxy pattern is able to reduce memory use by not creating objects until they are needed. The Flyweight pattern also saves memory by sharing it between objects of the same type that are used often. The Proxy pattern is referenced by many other patterns, such as the Adapter and Decorator patterns, so it appears with high rank. We think the reason for the lower rank of the Proxy pattern is that many of the patterns that reference it are not in the set of patterns matching the "memory" search query. From the experiment results, the proposed system can present multiple patterns resulting from a user's search query in a somewhat meaningful order. Without the ordering of results provided by the PatternRank method, users could end up learning about less important patterns first. The proposed system is promising as a way to improve the efficiency of learning development methods by supporting the study of design patterns.

5. Related Work

Markus et al. have proposed a pattern search system that is limited to the field of security design [15]. By limiting the domain of the system, additional search conditions specialized to the domain could be added.
However, our system is not limited to any particular domain and can handle any pattern.

Kinashi et al. have proposed a tool that accepts a requirements analysis description and presents applicable design patterns to the user based on it [8]. With this tool, it is necessary to manually prepare requirements analysis description fragments and matching rules ahead of time for the design patterns being handled. In contrast, our proposed system can handle any pattern, and beyond gathering the pattern documents, no additional manual work is required.

The archetypal pattern repository on the Web is the Portland Pattern Repository [1]. It has a simple keyword-search function allowing entry of search queries, but the results are listed in order of pattern name, so the order is essentially random from the perspective of the usefulness of the patterns.

The PatternRank method makes use of relationships between patterns to calculate an importance value for each element, drawing upon the concepts used in the PageRank method [11], which calculates the importance of Web pages, and the Component Rank method [7], which does so for software components. The PageRank method uses the principle that a Web page referenced by an important Web page is also important. The Component Rank method uses the principle that components that are used often, or are used by important components, are also important [7]. The Page


More information
