Visualization of Pareto Data through Rank-By-Feature Framework


Dan Carlsen, Hao Jiang, Bo Yu (IST 597B Term Paper, Fall 2006)
Correspondence: Dan Carlsen, Hao Jiang, Bo Yu

Abstract

Identifying performance trade-offs between various designs, given a set of independent variables that define the design, is of the utmost importance in understanding complex systems. The solutions to such a multi-objective problem are known as Pareto optimal (Coello, 1999) or non-dominated and have the property that no other solution can be found that performs better for all objectives. Visualizing the Pareto-frontier in the performance space (objective function space) with more than three objectives has been a great challenge to the optimization community. Our work decomposes these higher order Pareto sets into viewable (comprehensible) graphs. The overall goal of this project is to support decision makers who have different interests while they explore the Pareto data in detail. To achieve this goal, we place our work in the Rank-By-Feature framework by Seo & Shneiderman (2005) and add new ranking criteria suited to Pareto data. These criteria include discontinuities, linearity, and shapes. To better support users of the system, we add a focus + context technique with multiple focuses to the Rank-By-Feature framework.

Keywords: Pareto Optimum, Rank-by-feature, Discontinuities, Linearity, Shapes, Focus + Context

I. Introduction

Background

Identifying performance trade-offs between various designs, given a set of independent variables that define the design, is of the utmost importance in understanding complex systems. Consider, for example, the illustration shown below. Here the goal is to design a system that maximizes the life of a satellite while minimizing the cost to build it. Figure 1 represents a two-objective trade-off, and the optimal solutions are A, B and C. These solutions to the multi-objective problem are known as Pareto optimal (Coello, 1999) or non-dominated and have the property that no other solution can be found that performs better for all objectives (A, B, and C are not dominated by each other or by solutions D or E).

Figure 1. Pareto Solutions

In two dimensions, the set of optimal designs that defines this trade-off is known geometrically as a Pareto-frontier. In three dimensions, the set is a Pareto-surface. In higher dimensions, visualizing the interaction between these non-dominated designs becomes a challenge.

The Problem

Visualizing the Pareto-frontier in the performance space (objective function space) with more than three objectives has been a great challenge to the optimization community. Although several suggestions have been made, they remain unwieldy or require extensive training. What is lacking from current

approaches is a methodology for extracting geometric features of higher order (greater than 3 dimensions) Pareto optimal data sets while maintaining a global context of all trade-offs considered in the complex system.

In the real world, for example in the automobile industry (Fujita et al, 1998), optimization problems usually involve many objectives and may have many different solutions. After the Pareto optimum has been computed by one of various algorithms, the result can be a collection of solutions that are all non-dominated points. This means we cannot say that one solution in the pool is better than any other, because all of them are optimal. However, for many reasons, different stakeholders, or the same stakeholders in different situations, may have different interests in certain objectives and care about the relations among some objectives.

Consider this scenario. We have a Pareto data set with 30 objectives (dimensions) for each solution. Because this is a Pareto data set, every point in it is optimal; nevertheless, we would like to pick certain points as our final choice based on certain criteria. For example, in the automobile design case, the designer may want to see the relation between horsepower and the other objectives and find out which objectives are negatively related to horsepower. In this situation, the designer cannot obtain that information from the Pareto frontier itself, because the Pareto data set only shows the final optima. This problem exists in current Pareto data set visualization solutions such as HSDC (Agrawal et al, 2006). In the present work, we explore a mechanism that maintains the global context while providing flexible means for exploring detailed information and relationships among dimensions (objectives).

Related work

Currently, visualization in the performance space is limited to three objectives. Using colors, shapes, glyphs and other visual channels, it is possible to integrate more than three dimensions. These methods result in a busy display with complicated legends, in which decision makers must expend considerable energy and time to make sense of the chaos. Parallel coordinates (Inselberg, 1990) is one of the leading approaches used in industry to assist in the selection of optimal designs, but the approach is not

intuitive for a large number of dimensions. Cloud visualization (Eddy & Lewis, 2002) is also used to visualize Pareto data. Agrawal and his colleagues (2005, 2006) introduced a Pareto data visualization technique called Hyper-Space Diagonal Counting (HSDC), in which multiple objectives or dimensions are agglomerated onto a two- or three-coordinate plot. In the three-coordinate plot, one dimension is used to show the density of dots in a bin. All of the solutions above focus on overall information visualization but are not convenient for exploring detailed information, especially for uncovering relationships and trends among objectives or dimensions.

Our approach

For this research, we would like to extend the work of Jinwook Seo and Ben Shneiderman and their Rank-by-Feature framework to aid in understanding non-dominated sets in multiple dimensions. Our strategy is to first decompose the high-dimensional data set into two-dimensional data pairs and then provide means to explore any two-dimensional relation.

Figure 2 (panels: Case 1, Case 2)

We will obtain examples of higher order Pareto data sets to use for feature identification and extraction. We will explore potential ranking criteria and develop (where needed) algorithms to identify and rank the given feature for any two-objective pair. Possible ranking criteria, related to the geometric properties of each two-objective Pareto-frontier, include: the degree to which the solutions conflict with each other; the shape of the Pareto-frontier (normalized); and the discontinuities exhibited by the frontier.

The problem of identifying geometric primitives is complicated further by the fact that the optimization direction (maximum or minimum) for each objective affects the ranking of the features and must be accounted for in a global context. For example, consider Case 1 in Figure 2, where two objectives are both minimized. The Pareto data set is a convex curve in the performance space. Now suppose that both

objectives are to be maximized (Case 2). The same data set may now consist of non-dominated solutions that form a concave curve or that exhibit discontinuous behavior. We must account for the optimization direction while identifying geometric properties so that they are understood in the global context of all 2D pairs of the n-dimensional Pareto set.

Specifically, in the present work we build on the rank-by-feature framework and define ranking criteria that help stakeholders explore a Pareto data set according to their particular interests. We have three categories of ranking criteria beyond the original ones presented in Seo & Shneiderman's (2005) work: discontinuity, linearity and curve shape. We introduce each category in detail in the following sections. Besides the new criteria, we modify the original rank-by-feature framework to improve usability and visualization. For example, we apply a focus + context technique so that the user has a thumbnail view of each two-dimensional pair in the global view window, and a multiple-focus technique so that the user can place several focuses in the global view.

Because we build on the rank-by-feature framework, we first introduce the framework by Seo & Shneiderman (2005) succinctly. We then discuss its applicability and suitability as a basis for our work. After that we focus on the details of the criteria and the concrete procedures for generating ranking information under each criterion. Before the conclusion we introduce several enhancements for visualization and usability. Finally, we close with a conclusion section and some potential future work.

Generation of Pareto Data

In order to simplify our project, one assumption we had to make was that we already had the Pareto data. In addition, we assumed that the data was already decomposed into 2-dimensional plots so that we could begin applying our ranking criteria. To generate the Pareto data examined for our project we used a freely downloadable version of NSGA-II, the Non-dominated Sorting Genetic Algorithm II. "Non-dominated" means that the solutions it generates for a given problem are Pareto solutions (in 2 dimensions, no other solution is better with respect to both objective functions). The problems which

generated our plots are discussed in "A Fast and Elitist Multi-Objective Genetic Algorithm: NSGA-II". In total, 11 plots were generated by running this algorithm. The solutions generated allowed us to examine concave, convex, and discontinuous plots, as well as plots that exhibited both concave and convex sections and plots that contained nearly horizontally or vertically linear regions.

II. Application of Rank-By-Feature Framework

Introduction to Rank-By-Feature Framework

Dealing with multidimensionality has challenged researchers in many disciplines for many years. In 1985, the statistician John Tukey proposed an approach called scagnostics (Tukey, 1985), based on the belief that displaying scatter plots of two of the many dimensions in a matrix was a comprehensible way to look at data. He noted, however, that large data sets often have too many such 2D projections to examine, so he proposed a few criteria for ranking scatterplots. His ideas were not implemented for many years. The rank-by-feature framework (Seo & Shneiderman, 2005), developed by Jinwook Seo and Ben Shneiderman at the University of Maryland in the Hierarchical Clustering Explorer (HCE) software tool, implements Tukey's vision in an open-ended manner that allows easy addition of new criteria.

Figure 3. Rank-by-feature framework interface for scatterplots (2D) (after Seo & Shneiderman, 2005)

The rank-by-feature framework is designed for interactive feature detection in multidimensional data sets using axis-parallel projections in low dimensions (1D or 2D). It is believed that, by combining information visualization techniques (overview, coordination, and dynamic query) with ranking, summaries and statistical methods, users can systematically examine the most important 1D and 2D

axis-parallel projections and develop a deeper understanding of the whole data set. The Graphics, Ranking, and Interaction for Discovery (GRID) principles can be summarized as: (1) study 1D, study 2D, then find features; (2) ranking guides insight, statistics confirm (Seo & Shneiderman, 2005).

Based on the GRID principles, the rank-by-feature framework has a multiple-component interface, as shown in Figure 3. In the control panel (A), users can select a ranking criterion and rank low-dimensional projections (1D or 2D) of the multidimensional data set according to the strength of the selected feature in the projection. The score overview (B) is an m-by-m grid view in which all dimensions are aligned in the rows and columns. Each cell of the score overview represents a scatterplot whose horizontal and vertical axes are the dimensions at the corresponding column and row, respectively. Each cell is color-coded by its score for the selected ranking criterion. The ordering list (C) shows the projections sorted by rank, with scores color-coded in the background. A click on a cell in the score overview or on an item in the ordering list shows the corresponding scatterplot in the scatterplot browser (D).

The most valuable virtue of the rank-by-feature framework is the open-ended manner in which new ranking criteria can be incorporated. Since the rank-by-feature framework can be considered a telescope for high-dimensional data (Shneiderman, 2006), the ranking criteria can serve as different light filters that give users different perspectives on the same data set. Although some common criteria, such as correlation coefficient and uniformity, are helpful in most conditions, it is necessary to develop new criteria for specific problem domains. The rank-by-feature framework allows novel statistical tests or new data mining algorithms to be integrated easily as plug-ins. Several criteria are implemented in the original framework: correlation coefficient, least square error for curvilinear regression, quadracity, the number of potential outliers, the number of items in the region of interest, and uniformity of scatterplots.

Suitability to Pareto Data

There are a few important reasons why Pareto data is suitable for a Rank-By-Feature Framework. One very significant reason is that multi-objective design/decision problems result in a vast number of data points (even if only Pareto points are considered). This immense amount

of data lends itself to visualization techniques that help one discover interesting relationships and facts. A Rank-By-Feature Framework makes discovering relationships in large amounts of data relatively easy. Another very important reason is that higher order Pareto sets are not intuitive. In 2 dimensions, the Pareto set is a frontier (a curve); in 3 dimensions it is a surface; but what does it mean to have a Pareto set in 4 or more dimensions? This question cannot be easily answered. Other techniques could possibly be used to visualize this higher order data without simplifying it, but those techniques themselves are not intuitive and require extensive training before any benefit can be reaped from them. Even once users understand them, they would find it quite difficult to explain what they see to another person. With these multi-objective design problems, any decisions that are made need to be explained to others in the group.

The main reasons for using Rank-By-Feature for Pareto data are as follows. It decomposes the higher order Pareto sets into viewable (comprehensible) graphs: 2-dimensional plots are easy for anyone to understand. It provides an overview of all objective function relationships. By ranking all of the plots, it gives the designer or decision maker quantifiable criteria for making a selection. Finally, pictures (2D plots in this case) make storytelling easier, so that when users need to explain their choice to management or another stakeholder, they have a plot that shows how they made their decision.

Application of Rank-By-Feature

Two things need to be considered in order to apply the Rank-By-Feature Framework to the Pareto data. First, we need to create useful criteria to be evaluated (criteria that aid the decision process). Second, we need to decide which attribute of each criterion to rank numerically. With this in mind, we created three new criteria to evaluate the 2D Pareto plots: discontinuities, shape (i.e., concavity or convexity), and linearity (nearly horizontally or vertically linear regions in the plot).
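As a rough illustration of how a criterion plugs into this scheme, the sketch below scores every two-objective pair of a small data set and sorts the pairs, in the spirit of the score overview and ordering list. The function names, the toy data, and the use of the correlation coefficient as the example criterion are assumptions made for illustration; they are not taken from HCE or from our implementation.

```python
from itertools import combinations

import numpy as np


def pearson_score(x, y):
    """Example criterion (assumed): correlation coefficient of one 2D projection."""
    return float(np.corrcoef(x, y)[0, 1])


def rank_pairs(data, names, criterion):
    """Score every 2D projection; return (score, name_i, name_j), highest score first."""
    scored = []
    for i, j in combinations(range(data.shape[1]), 2):
        scored.append((criterion(data[:, i], data[:, j]), names[i], names[j]))
    return sorted(scored, key=lambda item: item[0], reverse=True)


# Toy data with three objectives (illustrative values, not NSGA-II output).
data = np.array([[1.0, 4.0, 1.0],
                 [2.0, 3.0, 4.0],
                 [3.0, 2.0, 9.0],
                 [4.0, 1.0, 16.0]])
ranking = rank_pairs(data, ["f1", "f2", "f3"], pearson_score)
for score, name_i, name_j in ranking:
    print(f"{name_i} vs {name_j}: {score:+.3f}")
```

Any function that maps one 2D projection to a number could be passed as `criterion`, which is how the three criteria introduced in the next section would fit in.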

III. Our Ranking Criteria

A. Discontinuities

In order to find discontinuities in the data we formed a simple algorithm. The discontinuities are found in the following manner:

1. Find the linear distance (ld) between data points using a sum of squares (Pythagorean Theorem): ld² = Δx² + Δy².
2. Average ld over all data points.
3. Compare each individual ld to a multiple of that average. If the ld between two data points is greater than the multiple of the average, a discontinuity is said to exist at that point.

The user determines the multiple to be used, likely based on the expected spread of the scatter plots. The 2-dimensional Pareto plots are ranked by the number of discontinuities found in the plot.

For example, the plot for fon in Figure 4.1 would exhibit no discontinuities unless a very small number (such as 1) were used for the multiple, because the average ld between points is only [...]. The plot for zdt3 in Figure 4.2 would show discontinuities even for larger values of the multiple, since the ld between the points is [...]. Discontinuities would be returned for the large gaps in the plot. The plot of zdt3 would therefore score higher when counting discontinuities.

Such discontinuity information would be quite useful for decision makers. Which designs are not feasible? Does other information indicate they would be a good choice? The decision maker could then investigate an earlier design step to find out why they are not feasible. Such information would not simply give an overall optimal solution, but would help the designer figure out why certain designs are currently not feasible.
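The steps above can be sketched in a few lines. This is a minimal sketch under one stated assumption: distances are measured between consecutive points after sorting by the x-axis objective, since the text does not say which pairs of points are compared. The function name and the sample frontier are hypothetical.

```python
import numpy as np


def count_discontinuities(x, y, multiple):
    """Count gaps whose length exceeds `multiple` times the average gap length."""
    order = np.argsort(x)                      # assumption: walk the frontier in x order
    xs = np.asarray(x, dtype=float)[order]
    ys = np.asarray(y, dtype=float)[order]
    ld = np.hypot(np.diff(xs), np.diff(ys))    # ld = sqrt(dx^2 + dy^2) per neighboring pair
    return int(np.sum(ld > multiple * ld.mean()))


# Hypothetical frontier with one large gap between x = 0.3 and x = 1.0.
x = [0.0, 0.1, 0.2, 0.3, 1.0, 1.1, 1.2]
y = [1.2, 1.1, 1.0, 0.9, 0.2, 0.1, 0.0]
gaps_at_2 = count_discontinuities(x, y, multiple=2)
gaps_at_4 = count_discontinuities(x, y, multiple=4)
print(gaps_at_2, gaps_at_4)
```

With this sample the single large gap is reported for a multiple of 2 but not for a multiple of 4, which shows how strongly the count depends on the user-chosen multiple.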

Figure 4.1: Plot of Pareto Data for fon
Figure 4.2: Plot of Pareto Data for zdt3

B. Linearity

In order to find the nearly vertically or horizontally linear regions of the plot we applied another simple algorithm. We say nearly vertically or horizontally linear because a true horizontal or vertical line could not exist in the Pareto plot. For example, if a line were completely horizontal and the goal was to get both objectives as close to zero as possible, then the point furthest to the left of the horizontal line would dominate all other points on that line with respect to the objective on the x-axis. These regions are found as follows:

1. Find the slope (s) between data points: s = Δy/Δx.
2. Average s over all data points.
3. Compare each s to a multiple of that average, as well as to the average divided by another user-chosen number. If the s between two data points is greater than the multiple of the average, the region is said to be nearly vertically linear. If the s between two data points is less than the average divided by the chosen number, it is said to be nearly horizontally linear.

Just as in the discontinuity algorithm, the user would determine the multiple to be used as well as the

number used to divide. The plots are then ranked according to the percentage of points found to be nearly horizontally or vertically linear.

For example, zdt5 in Figure 5.3 would show both vertically (depending on the multiple) and horizontally linear regions. The plot of pol in Figure 5.1 would show only a small vertical portion, but most of it would be considered nearly horizontal. Both pol and zdt5 would have a high score for linearity, while zdt2 in Figure 5.2 would not.

Knowledge of linear regions is very important. A highly linear region indicates that design choices vary greatly with respect to one objective function while varying very little with respect to the other. Much improvement can be made with respect to one objective function while losing very little with respect to the other.

Figure 5.1: Plot of Pareto Data for pol
Figure 5.2: Plot of Pareto Data for zdt2
Figure 5.3: Plot of Pareto Data for zdt5
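The linearity steps can be sketched as follows. Several choices here are assumptions rather than part of the description above: slopes are taken between consecutive points sorted by x, the absolute value of the slope is used so that decreasing frontiers are handled, x values are assumed to be distinct, and the score is the fraction of segments (not points) flagged. The function name and sample data are hypothetical.

```python
import numpy as np


def linearity_score(x, y, multiple, divisor):
    """Return (flagged fraction, nearly vertical count, nearly horizontal count)."""
    order = np.argsort(x)
    xs = np.asarray(x, dtype=float)[order]
    ys = np.asarray(y, dtype=float)[order]
    s = np.abs(np.diff(ys) / np.diff(xs))      # |dy/dx| per segment; assumes distinct x
    avg = s.mean()
    vertical = s > multiple * avg              # much steeper than average
    horizontal = s < avg / divisor             # much flatter than average
    flagged = float(np.mean(vertical | horizontal))
    return flagged, int(vertical.sum()), int(horizontal.sum())


# Hypothetical L-shaped frontier: a steep drop followed by a long flat tail.
x = [0.0, 0.01, 0.02, 0.03, 1.0, 2.0, 3.0, 4.0]
y = [10.0, 7.0, 4.0, 1.0, 0.9, 0.8, 0.7, 0.6]
score, n_vertical, n_horizontal = linearity_score(x, y, multiple=2, divisor=10)
print(score, n_vertical, n_horizontal)
```

A straight diagonal frontier would score 0 under the same settings, because every segment slope equals the average.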

C. Shapes

Along with the discontinuity and linearity criteria, we now introduce the third category of new criteria: the shapes of two-dimensional data in the Pareto data set and their trends. To give users the means to explore the trends of any two-dimensional data, we first need the approximate shapes of the data set. We use a curve-fitting technique (Denison, Mallick & Smith, 1998) to obtain the approximate shape and the corresponding polynomials that represent the two-dimensional data. After that we can rank each curve by calculating with the coefficients of its equation. In this category we have eight criteria: linear increase and decrease, concave increase and decrease, convex increase and decrease, and U-shape and reversed U-shape. Because U-shape and reversed U-shape curves have both increasing and decreasing branches, we simply take the curvature of each curve as the rank criterion: the bigger the curvature, the higher the rank.

We think these ranking criteria can aid the decision-making process by enabling users to dig into the details of each two-dimensional pair, and in particular by providing the shape of the curve that represents the tendency of change. With such information, different users can easily find the objectives they are interested in and capture the relationships among the objectives they care about. For example, in the automobile design case, there are about 30 objectives, and one of them represents the longevity of cylinders. If a user (a designer, buyer or other potential decision maker) has a special need regarding this dimension, he or she can explore all of the data pairs that have cylinder longevity as one dimension. This does not tell the user directly which solution is perfect for him or her, but it can aid the decision by providing means to explore detailed information among objectives.
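A minimal sketch of the fitting step is given below. It uses ordinary least-squares polynomial fitting via `numpy.polyfit` and reports R² as a goodness-of-fit value; both are assumptions standing in for the curve-fitting technique cited above, and the function name and sample curve are hypothetical.

```python
import numpy as np


def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit; returns (coefficients, R^2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    coeffs = np.polyfit(x, y, degree)          # highest power first
    residual = y - np.polyval(coeffs, x)
    ss_res = float(np.sum(residual ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    return coeffs, 1.0 - ss_res / ss_tot


# Hypothetical curved frontier, fitted with degree 1 and degree 2.
x = np.linspace(0.0, 1.0, 21)
y = 1.0 - np.sqrt(x)
linear_coeffs, linear_r2 = fit_polynomial(x, y, 1)
quad_coeffs, quad_r2 = fit_polynomial(x, y, 2)
print(f"linear R^2 = {linear_r2:.3f}, quadratic R^2 = {quad_r2:.3f}")
```

Fitting the same data at two degrees makes the fidelity trade-off visible: the higher-degree fit follows the data more closely, yet neither recovers the curve that actually generated the points.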
Two issues need to be clarified: 1) curve fitting is not supposed to recover the original equations that generated the curves, and in fact curve fitting usually loses fidelity compared with the original data distribution, even when the fit is the best available; 2) related to the first issue, the loss of fidelity is amplified by the fact that people can choose different fitting degrees. For example, for the same data set, one can use a quadratic or a third-degree polynomial to fit the curve. These problems also exist in our present work. However, because our goal is to provide estimated shapes and tendencies for users, and although higher accuracy or fidelity is important to us, the loss will not greatly impair users' exploration of information in

terms of approximate trends and shape. Furthermore, taking the fitting coefficient of each fit into account, so as to prefer better fits, can be a future feature enhancement.

Linear Fitting

Linear fitting aims to capture the linear tendency of two-dimensional data from the Pareto data set. A linear relation is the simplest relationship among two-dimensional data: if the value on one dimension (usually the X-axis of the two-dimensional coordinate plane) varies, the value on the other dimension (usually the Y-axis) changes, and the rate of change always remains the same. In our categories, linear change has two criteria, linear increase and linear decrease. For these two criteria, we consider the rate of change and the direction of change as the main rank criteria. Mathematically, the rank for linear increase and the rank for linear decrease can each be derived from the other, so if the rank for linear increase is given, users can obtain the rank for linear decrease. However, we still provide both criteria explicitly in our final design, because 1) if we provided only one of them, users would still need to work out the other ranking themselves, which consumes extra and unnecessary cognitive effort; 2) semantically, the two criteria are different.

Figure 6: Example of linear criteria (lines y = 3x, y = 2x, and y = -2x)

Rank information for linear increase and decrease is relatively easy to obtain. Now, we present the

main procedures through which we generate ranking information for the linear criteria. As mentioned before, we use the curve-fitting technique to obtain a linear polynomial and a corresponding fitting coefficient (L_fitting_coefficient) for each data pair; the resulting equations take the form y = ax + b. We use L_fitting_coefficient to communicate how good the fit is. After we obtain the equation of each curve (a line in this case), we calculate its slope (L_slop = a), which is used in the final ranking computation. If linear increase is specified as the rank criterion, the bigger the L_slop, the higher the rank; if linear decrease is selected, the smaller (more negative) the L_slop, the higher the rank. Figure 6 shows an example of the linear criteria. In the linear case, both the actual increase and decrease situations are presented in the score overview panel with different colors and gray-scale coding. That means users can find linear decrease relations in the score overview panel even when they select linear increase as the rank criterion.

Quadratic fitting

Figure 7: Example of quadratic fitting

Quadratic fitting aims to reveal relations among data that are not linearly distributed, in other words, data whose distribution has an obvious curve. Curved or quadratic relations are more complex than linear relations because they contain much more information. For example, concave decrease curves have

a high rate of decrease at the beginning of the X-axis range (Figure 2, Case 1), while convex decrease curves have a high rate of decrease toward the end of the X-axis range (Figure 2, Case 2). Such information may be important to decision makers, because there is a threshold that distinguishes the two situations. Besides the linear criteria, we have six other criteria related to quadratic curves. However, we put U-shape and reversed U-shape into a separate category, because they have both increasing and decreasing branches. For now, we focus on concave increase and decrease and convex increase and decrease. Figure 7 shows an example.

To get the information for ranking, we first use the curve-fitting technique to generate a quadratic polynomial for each data set and a corresponding fitting coefficient (Q_fitting_coefficient) for each fit. We use Q_fitting_coefficient to communicate how good the fit is. Because we use quadratic fitting, all of the resulting polynomials have the form y = a*x*x + b*x + c. From each polynomial we obtain its largest curvature (Q_curvature), which is used to calculate the final ranking. To determine the shape of the real data in the Pareto data set, we need to know where the real data lie on the generated curve. We therefore compute the x-coordinate of the vertex (the axis of symmetry) of each curve, which we call Q_Normal (Q_Normal = -b/(2a)). With this information we can determine the position and shape of the real data. We call the position information Q_position.
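The quantities just introduced can be sketched as follows. The sketch assumes a least-squares fit with `numpy.polyfit`, assumes a is not zero, and takes Q_curvature to be |2a|, the curvature of a parabola at its vertex, where the curvature is largest. The function name and dictionary keys are hypothetical; the -1/0/1 encoding and the shape labels follow the rules stated in the text that follows.

```python
import numpy as np

# Shape labels keyed by (Q_position, a > 0), using this paper's naming convention.
SHAPES = {
    (0, True): "U-shape",
    (0, False): "reversed U-shape",
    (-1, True): "concave increase",
    (-1, False): "convex decrease",
    (1, True): "concave decrease",
    (1, False): "convex increase",
}


def describe_quadratic(x, y):
    """Fit y = a*x*x + b*x + c and derive Q_Normal, Q_curvature, Q_position, shape."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    a, b, _c = np.polyfit(x, y, 2)             # assumes the fitted a is non-zero
    q_normal = -b / (2.0 * a)                  # x-coordinate of the vertex
    q_curvature = abs(2.0 * a)                 # curvature at the vertex (assumed meaning)
    if q_normal < x.min():
        q_position = -1                        # vertex lies left of the data
    elif q_normal > x.max():
        q_position = 1                         # vertex lies right of the data
    else:
        q_position = 0                         # vertex lies inside the data range
    return {"Q_Normal": float(q_normal), "Q_curvature": float(q_curvature),
            "Q_position": q_position, "shape": SHAPES[(q_position, bool(a > 0))]}


x = np.linspace(1.0, 3.0, 9)
right_branch = describe_quadratic(x, x ** 2)           # vertex at x = 0
arch = describe_quadratic(x, -(x - 2.0) ** 2)          # vertex at x = 2
left_branch = describe_quadratic(x, (x - 5.0) ** 2)    # vertex at x = 5
print(right_branch["shape"], arch["shape"], left_branch["shape"])
```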
For each curve:

- If Q_Normal lies between the minimal and maximal X-axis values of the data, we give Q_position the value 0; the curve is U-shape (a > 0) or reversed U-shape (a < 0).
- If Q_Normal is less than the minimal X-axis value of the data, we give Q_position the value -1; the curve is concave increase when a > 0, or convex decrease when a < 0.
- If Q_Normal is bigger than the maximal X-axis value of the data, we give Q_position the value 1; the curve is concave decrease when a > 0, or convex increase when a < 0.

After getting all the information above, we can compute the rank of each curve based on the criterion the user specifies. If concave increase is selected as the rank criterion, first, we drop the curves with Q_position = 0 and the curves with a < 0, because these curves are convex or U-shape; second, we use the equation R = Q_curvature × (a / |a|) × Q_position: the bigger the R, the higher the rank. If concave decrease is selected as the rank criterion, first, we drop the curves with Q_position = 0 and the curves with a < 0, because these curves are convex or

U-shape; second, we use the equation R = Q_curvature × (a / |a|) × Q_position: the bigger the -R, the higher the rank. If convex increase is selected as the rank criterion, first, we drop the curves with Q_position = 0 and the curves with a > 0, because these curves are concave or U-shape; second, we use the equation R = Q_curvature × (a / |a|) × Q_position: the bigger the R, the higher the rank. If convex decrease is selected as the rank criterion, first, we drop the curves with Q_position = 0 and the curves with a > 0, because these curves are concave or U-shape; second, we use the equation R = Q_curvature × (a / |a|) × Q_position: the bigger the -R, the higher the rank.

Through the steps above, we can generate the final ranking of each curve for the concave increase or decrease and convex increase or decrease criteria. Beyond that, we can generate the U-shape and reversed U-shape ranks from the same information. When U-shape is the criterion, we pick only the curves with Q_position = 0, because the others are only concave or convex; we then compute the rank of each curve using the equation R = Q_curvature × (a / |a|): the bigger the R, the higher the rank. When reversed U-shape is selected, we again pick only the curves with Q_position = 0; we then compute the rank of each curve using the equation R = Q_curvature × (a / |a|): the bigger the -R, the higher the rank.

IV. Interface Improvement

While meaningful ranking criteria give users the chance to systematically examine the most important low-dimensional projections of the Pareto data sets, appropriate information visualization techniques can help users explore the ranking results more effectively and therefore maximize the benefit of ranking. Based on our analysis of the current visual interface in the rank-by-feature framework, we believe that further improvements could be made to better support users' understanding of the data.

Analysis of current Rank-By-Feature interface

The original rank-by-feature interface used in the HCE toolkit has four parts (see Figure 3): control panel (A), score overview (B), ordering list (C), and scatter-plot browser (D). The control panel allows users to dynamically choose different ranking criteria and change the views in both the score overview and the ordering list. The score overview provides a color-coded overview of the ranking scores in a two-dimensional matrix. The ordering list is a linear list with more detailed ranking information. The

17 scatter plot browser shows the actual scatter plot for a specific low-dimension (1D or 2D) projection. Views in these three visual components are linked together. When the user changes the focus in any of them, the other two components will change correspondingly. Generally, three visualization techniques can be identified in this interface: A. Overview + Details Generally, the rank-by-feature interface follows the visual information seeking mantra: overview first, zoom and filter, then details-on-demand. The score overview (B) provides an overview of the entire collection of data; the ranking-criterion and color-coding can help people filter out uninteresting items; the scatter-plot explorer (D) shows the detailed scatter-plot when a cell in the overview matrix is selected. The overview + details technique helps users explore large sets of data by keeping a view of whole data available, while pursuing detailed analysis of a part of it. B. Coordination The rank-by-feature interface also provides an enhanced level of interactivity by combining displays and allowing highlights to be broadcast from one to the other. The overview matrix and the score list show different views of the same ranking result. When users choose a focus in any of them, the focus in the other one will be changed accordingly. By using this technique, the system reduces the time for users to coordinate in different components. C. Dynamic query Besides, the framework provides dynamic feedbacks by allowing users to control the contents of the display. Users can quickly change the ranking criteria and the views in overview matrix and the score list will be updated by the system. In this way, users can compare the difference between different criteria, so that they can find the most helpful ones. 
Although these visualization techniques are applied in the current framework and give users an interactive interface, we believe that some problems have not been addressed and could be improved by integrating other techniques:

(1) The details-on-demand component can show only one scatter plot at a time. However, when users study multi-dimensional Pareto data sets, what they really want is to compare characteristics in multiple dimensions at the same time. For the rank-by-feature framework, that means they want to compare several scatter plots in the same display, which is infeasible in the current interface.

(2) When the number of dimensions grows very large, the score overview becomes very crowded and the cells become very small. As a result, it is difficult for users to select the item they are interested in.

(3) The overview and details-on-demand components in this interface are separated into different windows. However, when information is broken into two displays, visual search and working memory consequences degrade performance (Larkin and Simon, 1987). Users need to shift their focus from the overview window to the detail window, which increases the cognitive load of using the system.

Integration with Focus + Context

To address these problems in the current framework, we propose to integrate a Focus + Context strategy into the visual interface. The working hypothesis of this strategy is that it may be possible to create better cost structures of information by displaying more peripheral information at reduced detail in combination with the information in focus, dynamically varying the detail in parts of the display as the user's attention changes. Our method can be described as follows:

(1) The background of each cell in the overview matrix is colored according to our ranking criteria. The colors serve as cues that help users navigate the matrix and find possible points of interest. This is the same as in the original interface.

(2) When the mouse cursor moves onto a cell, that cell becomes the focus: we enlarge it and show the scatter plot of its two dimensions directly inside it. The cells around it are also enlarged, but at a smaller scale. These neighboring cells help users locate the exact cell they are interested in when the number of cells becomes very large (see Figure 8(A)).

(3) When the user clicks on one of these cells, it stays enlarged even after the mouse cursor moves away. Clicking the cell again restores it to its normal size. In this way, the user can specify multiple focuses and compare the details of several different combinations of dimensions at the same time (see Figure 8(B)).

One problem that must be addressed with multiple focuses is that, if the focuses are far apart and therefore cannot all be displayed in the same view at normal scale, the cells between the focus cells are compressed to a scale smaller than the normal size.

Figure 8: Context + Focus in Score Overview, panels (A) and (B)

The interface can benefit from this technique in three ways:

(1) By showing the details-on-demand directly in the overview window, users avoid switching focus from one window to another, which reduces the cognitive load of coordinating different views.

(2) The focus + context strategy allows the system to display a large volume of data in a single view without undermining the user's understanding.

(3) Users can compare multiple details at the same time by using multiple focuses. For our application to Pareto data sets, this is very important because users usually need to investigate the data in more than two dimensions.
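The cell-sizing behavior described above can be sketched as a simple weighted layout along one axis of the matrix. This is only an illustrative sketch, since the proposal is not yet implemented: the names `cell_sizes` and `FocusSet`, and the weights 6, 3, and 1 for focus, neighboring, and ordinary cells, are assumptions chosen for the example.

```python
def cell_sizes(n_cells, total, foci, focus_w=6.0, near_w=3.0, base_w=1.0):
    """Split `total` pixels among `n_cells` cells along one axis of the matrix.

    Focus cells get the largest weight, their immediate neighbours a smaller
    one, and all other cells the base weight. Weights are then normalised so
    the sizes always sum to `total`.
    """
    weights = [base_w] * n_cells
    for f in foci:
        for nb in (f - 1, f + 1):
            if 0 <= nb < n_cells:
                weights[nb] = max(weights[nb], near_w)
    for f in foci:
        weights[f] = focus_w  # a focus always wins over "neighbour" status
    scale = total / sum(weights)
    return [w * scale for w in weights]


class FocusSet:
    """Sticky focuses toggled by clicks, plus one transient hover focus."""

    def __init__(self):
        self.pinned = set()
        self.hover = None

    def click(self, cell):
        # Clicking an enlarged cell restores it; clicking a normal one pins it.
        self.pinned ^= {cell}

    def active(self):
        hovered = {self.hover} if self.hover is not None else set()
        return self.pinned | hovered


# Two pinned focuses far apart on a 20-cell axis that is 400 px wide.
sizes = cell_sizes(20, 400.0, {2, 17})
```

With no focus, each of the 20 cells would be 20 px wide. With the two focuses above, the ordinary cells between them shrink to roughly half that, which is the compression effect noted in the text. For the two-dimensional score overview, one plausible design is to apply the same function independently to row heights and column widths.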

V. Conclusion

The overall goal of this project was to find a way to support decision makers with different interests as they explore Pareto data in detail with respect to different objectives. To achieve this goal, we placed our work in the Rank-By-Feature framework of Seo & Shneiderman (2005), with new ranking criteria suited to Pareto data: discontinuities, linearity, and shapes. To better support users of the system, we added a focus + context technique with multiple focuses to the Rank-By-Feature framework. We found that our method is a tool to aid decision makers: it does not necessarily identify the best design, but it provides a meaningful way to explore Pareto data and adds validation for design decisions.

Much future work remains. Our ranking criteria are not the only useful ones; others could likely be developed by experts in various knowledge domains. We also plan to implement our work in the Rank-By-Feature software, and we would like to have our methods evaluated by actual decision makers to assess their usefulness.

Reference

1. Coello, C. A. C. (1999). A comprehensive survey of evolutionary-based multiobjective optimization. Knowledge and Information Systems, 1(3).
2. Seo, Jinwook and Shneiderman, Ben (2005). A Rank-by-Feature Framework for Interactive Exploration of Multidimensional Data. Information Visualization, Vol. 4, No. 2, Summer 2005.
3. Deb, K. (2001). Multi-Objective Optimization using Evolutionary Algorithms. John Wiley & Sons, New York, 2001.
4. Horn, J., Nafpliotis, N. and Goldberg, D. E. (1994). A niched Pareto genetic algorithm for multiobjective optimization. In Proceedings of the First IEEE Conference on Evolutionary Computation, IEEE World Congress on Computational Intelligence, Volume 1.
5. Abbass, Hussein A., Sarker, Ruhul and Newton, Charles (2001). PDE: A Pareto-frontier Differential Evolution Approach for Multi-objective Optimization Problems. Proceedings of the 2001 Congress on Evolutionary Computation CEC.
6. Agrawal, G., Bloebaum, C. L. and Lewis, K. (2005). Intuitive Design Selection Using Visualized n-dimensional Pareto Frontier. 46th AIAA/ASME/ASCE/AHS/ASC Structures,
7. Agrawal, G., Lewis, K. E. and Bloebaum, C. L. (2006). Intuitive Visualization of Hyperspace Pareto Frontier. 44th AIAA Aerospace Sciences Meeting and Exhibit; Reno, NV; USA; 9-12 Jan.
8. Deb, K., Pratap, A., Agarwal, S. and Meyarivan, T. (2002). A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II. IEEE Trans. on Evolutionary Computation, 6 (2002).
9. Deb, Kalyanmoy and Saxena, Dhish Kumar (2005). On Finding Pareto-Optimal Solutions Through Dimensionality Reduction for Certain Large-Dimensional Multi-Objective Optimization Problems. KanGAL Report.
10. Seo, Jinwook and Shneiderman, B. (2006). Knowledge discovery in high-dimensional data: case studies and a user survey for the rank-by-feature framework. IEEE Transactions on Visualization and Computer Graphics, Volume 12, Issue 3, May-June 2006.
11. Shneiderman, B. (2006). A Telescope for High-Dimensional Data. Computing in Science & Engineering, vol. 8, no. 2, 2006.
12. Pareto efficiency.

Contributors

1 Introduction: Hao
  1.1 Background
  1.2 The Visualization Problem
  1.3 Related Work
  1.4 Our Approach
  1.5 Structure of This Paper
2 Application of Rank-By-Feature Framework: Bo & Dan
  2.1 Introduction of Rank-By-Feature Framework: Bo
  2.2 Why is it suitable for our problem?: Dan
  2.3 What should we do to apply it in our context?: Dan
3 Ranking Criteria: Hao & Dan
  3.1 Current criteria provided by the original work: Hao
  3.2 Our new criteria
    Discontinuities: Dan
    Shapes: Hao
    Linearity: Dan
4 Interface Improvement: Bo
  4.1 Analysis of current Rank-By-Feature interface
  4.2 Integration with Focus + Context
5 Conclusion: All
6 Reference: All


More information

A Rank-by-Feature Framework for Unsupervised Multidimensional Data Exploration Using Low Dimensional Projections

A Rank-by-Feature Framework for Unsupervised Multidimensional Data Exploration Using Low Dimensional Projections A Rank-by-Feature Framework for Unsupervised Multidimensional Data Exploration Using Low Dimensional Projections Jinwook Seo* and Ben Shneiderman Department of Computer Science & Human-Computer Interaction

More information

3 Nonlinear Regression

3 Nonlinear Regression CSC 4 / CSC D / CSC C 3 Sometimes linear models are not sufficient to capture the real-world phenomena, and thus nonlinear models are necessary. In regression, all such models will have the same basic

More information

Desicion Making in Multi-Objective Optimization for Industrial Applications - Data Mining and Visualization of Pareto Data

Desicion Making in Multi-Objective Optimization for Industrial Applications - Data Mining and Visualization of Pareto Data Desicion Making in Multi-Objective Optimization for Industrial Applications - Data Mining and Visualization of Pareto Data Katharina Witowski 1, Martin Liebscher 1, Tushar Goel 2 1 DYNAmore GmbH,Stuttgart,

More information

Experimental Study on Bound Handling Techniques for Multi-Objective Particle Swarm Optimization

Experimental Study on Bound Handling Techniques for Multi-Objective Particle Swarm Optimization Experimental Study on Bound Handling Techniques for Multi-Objective Particle Swarm Optimization adfa, p. 1, 2011. Springer-Verlag Berlin Heidelberg 2011 Devang Agarwal and Deepak Sharma Department of Mechanical

More information

Exploratory Data Analysis EDA

Exploratory Data Analysis EDA Exploratory Data Analysis EDA Luc Anselin http://spatial.uchicago.edu 1 from EDA to ESDA dynamic graphics primer on multivariate EDA interpretation and limitations 2 From EDA to ESDA 3 Exploratory Data

More information

A Novel Approach to Planar Mechanism Synthesis Using HEEDS

A Novel Approach to Planar Mechanism Synthesis Using HEEDS AB-2033 Rev. 04.10 A Novel Approach to Planar Mechanism Synthesis Using HEEDS John Oliva and Erik Goodman Michigan State University Introduction The problem of mechanism synthesis (or design) is deceptively

More information

Approximation Model Guided Selection for Evolutionary Multiobjective Optimization

Approximation Model Guided Selection for Evolutionary Multiobjective Optimization Approximation Model Guided Selection for Evolutionary Multiobjective Optimization Aimin Zhou 1, Qingfu Zhang 2, and Guixu Zhang 1 1 Each China Normal University, Shanghai, China 2 University of Essex,

More information

A Multiobjective Memetic Algorithm Based on Particle Swarm Optimization

A Multiobjective Memetic Algorithm Based on Particle Swarm Optimization A Multiobjective Memetic Algorithm Based on Particle Swarm Optimization Dr. Liu Dasheng James Cook University, Singapore / 48 Outline of Talk. Particle Swam Optimization 2. Multiobjective Particle Swarm

More information

Visualizing Multi-Dimensional Functions in Economics

Visualizing Multi-Dimensional Functions in Economics Visualizing Multi-Dimensional Functions in Economics William L. Goffe Dept. of Economics and International Business University of Southern Mississippi Hattiesburg, MS 3946 Bill.Goffe@usm.edu June, 1999

More information

Visualization of Pareto Front Points when Solving Multi-objective Optimization Problems

Visualization of Pareto Front Points when Solving Multi-objective Optimization Problems ISSN 9 4X, ISSN 884X (online) INFORMATION TECHNOLOGY AND CONTROL,, Vol.4, No.4 Visualization of Pareto Front Points when Solving Multi-objective Optimization Problems Olga Kurasova,, Tomas Petkus, Ernestas

More information

Evgeny Maksakov Advantages and disadvantages: Advantages and disadvantages: Advantages and disadvantages: Advantages and disadvantages:

Evgeny Maksakov Advantages and disadvantages: Advantages and disadvantages: Advantages and disadvantages: Advantages and disadvantages: Today Problems with visualizing high dimensional data Problem Overview Direct Visualization Approaches High dimensionality Visual cluttering Clarity of representation Visualization is time consuming Dimensional

More information

Advanced Operations Research Prof. G. Srinivasan Department of Management Studies Indian Institute of Technology, Madras

Advanced Operations Research Prof. G. Srinivasan Department of Management Studies Indian Institute of Technology, Madras Advanced Operations Research Prof. G. Srinivasan Department of Management Studies Indian Institute of Technology, Madras Lecture 18 All-Integer Dual Algorithm We continue the discussion on the all integer

More information

NCGA : Neighborhood Cultivation Genetic Algorithm for Multi-Objective Optimization Problems

NCGA : Neighborhood Cultivation Genetic Algorithm for Multi-Objective Optimization Problems : Neighborhood Cultivation Genetic Algorithm for Multi-Objective Optimization Problems Shinya Watanabe Graduate School of Engineering, Doshisha University 1-3 Tatara Miyakodani,Kyo-tanabe, Kyoto, 10-031,

More information

Chapter 1. Using the Cluster Analysis. Background Information

Chapter 1. Using the Cluster Analysis. Background Information Chapter 1 Using the Cluster Analysis Background Information Cluster analysis is the name of a multivariate technique used to identify similar characteristics in a group of observations. In cluster analysis,

More information

Computer Experiments: Space Filling Design and Gaussian Process Modeling

Computer Experiments: Space Filling Design and Gaussian Process Modeling Computer Experiments: Space Filling Design and Gaussian Process Modeling Best Practice Authored by: Cory Natoli Sarah Burke, Ph.D. 30 March 2018 The goal of the STAT COE is to assist in developing rigorous,

More information

Creating a Basic Chart in Excel 2007

Creating a Basic Chart in Excel 2007 Creating a Basic Chart in Excel 2007 A chart is a pictorial representation of the data you enter in a worksheet. Often, a chart can be a more descriptive way of representing your data. As a result, those

More information

6 TOOLS FOR A COMPLETE MARKETING WORKFLOW

6 TOOLS FOR A COMPLETE MARKETING WORKFLOW 6 S FOR A COMPLETE MARKETING WORKFLOW 01 6 S FOR A COMPLETE MARKETING WORKFLOW FROM ALEXA DIFFICULTY DIFFICULTY MATRIX OVERLAP 6 S FOR A COMPLETE MARKETING WORKFLOW 02 INTRODUCTION Marketers use countless

More information

Graphical Methods in Linear Programming

Graphical Methods in Linear Programming Appendix 2 Graphical Methods in Linear Programming We can use graphical methods to solve linear optimization problems involving two variables. When there are two variables in the problem, we can refer

More information

CS 229 Final Project - Using machine learning to enhance a collaborative filtering recommendation system for Yelp

CS 229 Final Project - Using machine learning to enhance a collaborative filtering recommendation system for Yelp CS 229 Final Project - Using machine learning to enhance a collaborative filtering recommendation system for Yelp Chris Guthrie Abstract In this paper I present my investigation of machine learning as

More information

Introduction to WHO s DHIS2 Data Quality Tool

Introduction to WHO s DHIS2 Data Quality Tool Introduction to WHO s DHIS2 Data Quality Tool 1. Log onto the DHIS2 instance: https://who.dhis2.net/dq Username: demo Password: UGANDA 2016 2. Click on the menu icon in the upper right of the screen (

More information

Reference Point-Based Particle Swarm Optimization Using a Steady-State Approach

Reference Point-Based Particle Swarm Optimization Using a Steady-State Approach Reference Point-Based Particle Swarm Optimization Using a Steady-State Approach Richard Allmendinger,XiaodongLi 2,andJürgen Branke University of Karlsruhe, Institute AIFB, Karlsruhe, Germany 2 RMIT University,

More information

Improved Pruning of Non-Dominated Solutions Based on Crowding Distance for Bi-Objective Optimization Problems

Improved Pruning of Non-Dominated Solutions Based on Crowding Distance for Bi-Objective Optimization Problems Improved Pruning of Non-Dominated Solutions Based on Crowding Distance for Bi-Objective Optimization Problems Saku Kukkonen and Kalyanmoy Deb Kanpur Genetic Algorithms Laboratory (KanGAL) Indian Institute

More information

Lecture notes on the simplex method September We will present an algorithm to solve linear programs of the form. maximize.

Lecture notes on the simplex method September We will present an algorithm to solve linear programs of the form. maximize. Cornell University, Fall 2017 CS 6820: Algorithms Lecture notes on the simplex method September 2017 1 The Simplex Method We will present an algorithm to solve linear programs of the form maximize subject

More information

addition + =5+C2 adds 5 to the value in cell C2 multiplication * =F6*0.12 multiplies the value in cell F6 by 0.12

addition + =5+C2 adds 5 to the value in cell C2 multiplication * =F6*0.12 multiplies the value in cell F6 by 0.12 BIOL 001 Excel Quick Reference Guide (Office 2010) For your lab report and some of your assignments, you will need to use Excel to analyze your data and/or generate graphs. This guide highlights specific

More information

Lamarckian Repair and Darwinian Repair in EMO Algorithms for Multiobjective 0/1 Knapsack Problems

Lamarckian Repair and Darwinian Repair in EMO Algorithms for Multiobjective 0/1 Knapsack Problems Repair and Repair in EMO Algorithms for Multiobjective 0/ Knapsack Problems Shiori Kaige, Kaname Narukawa, and Hisao Ishibuchi Department of Industrial Engineering, Osaka Prefecture University, - Gakuen-cho,

More information

Application of Clustering Techniques to Energy Data to Enhance Analysts Productivity

Application of Clustering Techniques to Energy Data to Enhance Analysts Productivity Application of Clustering Techniques to Energy Data to Enhance Analysts Productivity Wendy Foslien, Honeywell Labs Valerie Guralnik, Honeywell Labs Steve Harp, Honeywell Labs William Koran, Honeywell Atrium

More information

GRAPHING BAYOUSIDE CLASSROOM DATA

GRAPHING BAYOUSIDE CLASSROOM DATA LUMCON S BAYOUSIDE CLASSROOM GRAPHING BAYOUSIDE CLASSROOM DATA Focus/Overview This activity allows students to answer questions about their environment using data collected during water sampling. Learning

More information

D&B Market Insight Release Notes. July 2016

D&B Market Insight Release Notes. July 2016 D&B Market Insight Release Notes July 2016 Table of Contents User Experience and Performance 3 Mapping.. 4 Visualizations.... 5 User Experience and Performance Speed Improvements Improvements have been

More information

Lithological and surface geometry joint inversions using multi-objective global optimization methods

Lithological and surface geometry joint inversions using multi-objective global optimization methods Lithological and surface geometry joint inversions using multi-objective global optimization methods Peter G. Lelièvre 1, Rodrigo Bijani and Colin G. Farquharson 1 plelievre@mun.ca http://www.esd.mun.ca/~peter/home.html

More information