I ve Seen Enough : Incrementally Improving Visualizations to Support Rapid Decision Making

Size: px
Start display at page:

Download "I ve Seen Enough : Incrementally Improving Visualizations to Support Rapid Decision Making"

Transcription

1 I ve Seen Enough : Increentally Iproving Visualizations to Support Rapid Decision Making Sajjadur Rahan Marya Aliakbarpour 2 Ha Kyung Kong Eric Blais 3 Karrie Karahalios,4 Aditya Paraeswaran Ronitt Rubinfield 2 University of Illinois (UIUC) 4 Adobe Research 2 MIT 3 University of Waterloo {srahan7,hkong6,kkarahal,adityagp}@illinois.edu {aryaa@,ronitt@csail}.it.edu eblais@uwaterloo.ca ABSTRACT Data visualization is an effective echanis for identifying trends, insights, and anoalies in data. On large datasets, however, generating visualizations can take a long tie, delaying the extraction of insights, hapering decision aking, and reducing exploration tie. One solution is to use online sapling-based schees to generate visualizations faster while iproving the displayed estiates increentally, eventually converging to the exact visualization coputed on the entire data. However, the interediate visualizations are approxiate, and often fluctuate drastically, leading to potentially incorrect decisions. We propose sapling-based increental visualization algoriths that reveal the salient features of the visualization quickly with a 46 speedup relative to baselines while iniizing error, thus enabling rapid and errorfree decision aking. We deonstrate that these algoriths are optial in ters of saple coplexity, in that given the level of interactivity, they generate approxiations that take as few saples as possible. We have developed the algoriths in the context of an increental visualization tool, titled INCVISAGE, for trendline and heatap visualizations. We evaluate the usability of INCVIS- AGE via user studies and deonstrate that users are able to ake effective decisions with increentally iproving visualizations, especially copared to vanilla online-sapling based schees.. INTRODUCTION Visualization is eerging as the ost coon echanis for exploring and extracting value fro datasets by novice and expert analysts alike. This is evidenced by the huge popularity of data visualization tools such as Microsoft s PowerBI and Tableau, both of which have 00s of illions of dollars in revenue just this year [5, 3]. And yet data visualization on increasingly large datasets, reains cubersoe: when datasets are large, generating visualizations can take hours, ipeding interaction, preventing exploration, and delaying the extraction of insights [36]. One approach to generating visualizations faster is to saple a sall nuber of datapoints fro the dataset online; by using sapling, we can view visualizations that increentally iprove over tie and eventually converge to the visualization coputed on the entire data. However, such interediate visualizations are approxiate, and often fluctuate drastically, leading to incorrect insights and conclusions. The key question we wish to address in this paper is the following: can we develop a sapling-based increental visualization algorith that reveals the features of the eventual visualization quickly, but does so in a anner that is guaranteed to be correct? Illustrative Exaple. We describe the goals of our sapling algoriths via a pictorial exaple. In Figure, we depict, in the first row, the variation of present sapling algoriths as tie progresses and ore saples are taken: at t, t 2, t 4, t 7, and when all of the data has been sapled. This is, for exaple, what visualizing the results of an online-aggregation-like [9] sapling algorith ight provide. If a user sees the visualization at any of the interediate tie points, they ay ake incorrect decisions. For exaple, at tie t, the user ay reach an incorrect conclusion that the values at the start and the end are lower than ost of the trend, while in fact, the opposite is true this anoaly is due to the skewed saples that were drawn to reach t. The visualization continues to vary at t 2, t 4, and t 7, with values fluctuating randoly based on the saples that were drawn. Indeed, a user ay end up having to wait until the values stabilize, and even then ay not be able to fully trust the results. One approach to aeliorate this issue would be to use confidence intervals to guide users in deciding when to draw conclusions however, prior work has deonstrated that users are not able to interpret confidence intervals correctly [4]. At the sae tie, the users are subject to the sae vagaries of the saples that were drawn. Figure : INCVISAGE exaple. Another approach, titled INCVISAGE, that we espouse in this paper and depict in the second row is the following: at each tie point t i, reveal one additional segent for a i-segent trendline, by splitting one of the segents for the trendline at t i, when the tool is confident enough to do so. Thus, INCVISAGE is very conservative at t and just provides a ean value for the entire range, then at t 2, it splits the single segent into two segents, indicating that the trend increases towards the end. Overall, by t 7, the tool has indicated any of the iportant features of the trend: it starts off high, has a bup in the iddle, and then increases towards the end. This approach reveals features of the eventual visualization in the order of proinence, allowing users to gain early insights and draw conclusions early. This approach is reiniscent of interlaced pixelbased iage generation in browsers [38], where the iage slowly goes fro being blurry to sharp over the course of the rendering, displaying the ost salient features of the visualization before the less iportant features. Siilar ideas have also been developed in other doains such as signal processing [47] and geo aps [3]. So far, we ve described trendlines; the INCVISAGE approach can be applied to heatap visualizations as well depicted in row 4 for the corresponding online-aggregation-like approach shown in row 3 as is typical in heataps, the higher the value, the darker the

2 color. Once again row 3 the status quo fluctuates treendously, not letting analysts draw eaningful insights early and confidently. On the other hand, row 4 the INCVISAGE approach repeatedly subdivides a block into four blocks when it is confident enough to do so, ephasizing early, that the right hand top corner has a higher value, while the values right below it are soewhat lower. In fact, the interediate visualizations ay be preferable because users can get the big picture view without being influenced by noise. Open Questions. Naturally, developing INCVISAGE brings a whole host of open questions, that span the spectru fro usability to algorithic process. First, it is not clear at what rate we should be displaying the results of the increental visualization algorith. When can we be sure that we know enough to show the ith increent, given that the (i )th increent has been shown already? How should the ith increent differ fro the (i )th increent? How do we prioritize sapling to ensure that we get to the ith increent as soon as possible, but with guarantees? Can we show that our algorith is in a sense optial, in that it ais to take as few saples as possible to display the ith increent with guarantees? And at the sae tie, how do we ensure that our algorith is lightweight enough that coputation doesn t becoe a bottleneck? How do we place the control in the user s hands in order to control the level of interactivity needed? The other open questions involved are related to how users interpret increental visualizations. Can users understand and ake sense of the guarantees provided? Can they use these guarantees to ake well-infored decisions and terinate early without waiting for the entire visualization to be generated? Contributions. In this paper, we address all of these open questions. Our priary contribution in the paper is the notion of increentally iproving visualizations that surface iportant features as they are deterined with high confidence bringing a concept that is coonly used in other settings, e.g., rasterization and signal processing, to visualizations. Given a user specified interactivity threshold (described later), we develop increental visualizations algoriths for INCVISAGE that iniizes error. We introduce the concept of iproveent potential to help us pick the right iproveent per increent. We find, soewhat surprisingly, that a rearkably siple algorith works best under a sub-gaussian assuption [42] about the data, which is reasonable to assue in real-world datasets (as we show in this paper). We further deonstrate that these algoriths are optial in that they generate approxiations within soe error bound given the interactivity threshold. When users don t provide their desired interactivity threshold, we can pick appropriate paraeters that help the best navigate the tradeoff between error and latency. We additionally perfor user studies to evaluate the usability of an increental visualization interface, and evaluate whether users are able to ake effective decisions with increentally iproving visualizations. We found that they are able to effectively deterine when to stop the visualization and ake a decision, trading off latency and error, especially when copared to traditional online sapling schees. Prior Work. Our work is copleentary to other work in the increental visualization space. In particular, sapleaction [6] and online aggregation [9] both perfor online sapling to depict aggregate values, along with confidence-interval style estiates to depict the uncertainty in the current aggregates. In both cases, however, these approaches prevent users fro getting early insights since they need to wait for the values to stabilize. As we will discuss later, our approach can be used in tande with online aggregationbased approaches. IFOCUS [30], PFunk-H [], and ExploreSaple [50] are other approxiate visualization algoriths targeted at generating visualizations rapidly while preserving perceptual insights. IFOCUS ephasizes the preservation of pairwise ordering of bars in a bar chart, as opposed to the actual values; PFunk-H uses perceptual functions fro graphical perception research to terinate visualization generation early; ExploreSaple approxiates scatterplots, ensuring that overall distributions and outliers are preserved. An early paper by Hellerstein et al. [20] proposes a siilar technique of progressive rendering for scatterplot visualizations by using index statistics to depict estiates of density before the data records are actually fetched. Lastly, M4 [27] uses rasterization to reduce the diensionality of a tie series without ipacting the resulting visualization. None of these ethods ephasize revealing features of visualizations increentally. Outline. The outline of the reainder of this paper is as follows: in Section 2 we forally define the increental visualization proble. Section 3 outlines our increental visualization algorith while Section 4 details the syste architecture of INCVISAGE. In Section 5 we suarize the experiental results and the key takeaways. Then we present the user study design and outcoes in Section 6 (for usability) and 7 (for coparison to traditional online sapling schees). Section 8 describes other related work on data visualization and analytics. 2. PROBLEM FORMULATION In this section, we first describe the data odel, and the visualization types we focus on. We then forally define the proble. 2. Visualizations and Queries Data and Query Setting. We operate on a dataset consisting of a single large relational table R. Our approach also generalizes to ultiple tables obeying a star or a snowflake scheata but we focus on the single table case for ease of presentation. As in a traditional OLAP setting, we assue that there are diension attributes and easure attributes diension attributes are typically used as group-by attributes in aggregate queries, while easure attributes are the ones typically being aggregated. For exaple, in a product sales scenario, day of the year would be a typical diension attribute, while the sales would be a typical easure attribute. INCVISAGE supports two kinds of visualizations: trendlines and heataps. These two types of visualizations can be translated to aggregate queries Q T and Q H respectively: Q T = SELECT X a, AVG(Y) FROM R GROUP BY X a ORDER BY X a, and Q H = SELECT X a, X b, AVG(Y) FROM R GROUP BY X a, X b ORDER BY X a, X b Here, X a and X b are diension attributes in R, while Y is a easure attribute being aggregated. The trendline and heatap visualizations are depicted in Figure. For trendlines (query Q T ), the attribute X a is depicted along the x-axis while the aggregate AVG(Y) is depicted along the y-axis. On the other hand, for heataps (query Q H) the two attributes X a and X b are depicted along the x-axis and y-axis, respectively. The aggregate AVG(Y) is depicted by the color intensity for each block (i.e., each X a, X b cobination) of the heatap. Give a continuous color scale, the higher the value of AVG(Y), the higher the color intensity. The coplete set of queries (including WHERE clauses and other aggregates) that are supported by INCVISAGE can be found in Section 3.5 we focus on the siple setting for now. Note that we are iplicitly focusing on X a and X b that are ordinal, i.e., have an order associated with the so that it akes sense to visualize the as a trendline or a heatap. As we will deonstrate subsequently, this order is crucial in letting us approxiate portions of the visualization that share siilar behavior. Sapling Engine. We assue that we have a sapling engine that can efficiently retrieve rando tuples fro R corresponding to different values of the diension attribute(s) X a and/or X b (along with optional predicates fro a WHERE). Focusing on Q T for

3 now, given a certain value of X a = a i, our engine provides us a rando tuple that satisfies X a = a i. Then, by looking up the value of Y corresponding to this tuple, we can get an estiate for the average of Y for X a = a i. Our sapling engine is drawn fro Ki et al. [30] and aintains an in-eory bitap on the diension attributes, allowing us to quickly identify tuples atching arbitrary conditions [3]. Bitaps are highly copressible and effective for read-only workloads [32, 49, 48], and have been applied recently to sapling for approxiate generation of bar charts [30]. Thus, given our sapling engine, we can retrieve saples of Y given X a = a i (or X a = a i X b = b i for heat aps). We call the ultiset of values of Y corresponding to X a = a i across all tuples to be a group. This allows us to say that we are sapling fro a group, where iplicitly we ean we are sapling the corresponding tuples and retrieving the Y value. Next, we describe our proble of increentally generating visualizations ore forally. We focus on trendlines (i.e., Q T ); the corresponding definitions and techniques for heataps are described in our extended technical report [2]. 2.2 Increental Visualizations Fro this point forward, we describe the concepts in the context of our visualizations in row 2 of Figure. Segents and k-segent Approxiations. We denote the cardinality of our group-by diension attribute X a as, i.e., X a =. In Figure this value is 36. At all tie points over the course of visualization generation, we display one value of AVG(Y) corresponding to each group x i X a, i... thus, the user is always shown a coplete trendline visualization. To approxiate our trendlines, we use the notion of segents that encopass ultiple groups. We define a segent as follows: Definition. A segent S corresponds to a pair (I, η), where η is a value, while I is an interval I [, ] spanning a consecutive sequence of groups x i X a. For exaple, the segent S ([2, 4], 0.7) corresponds to the interval of x i corresponding to x 2, x 3, x 4, and has a value of 0.7. Then, a k-segent approxiation of a visualization coprises k disjoint segents that span the entire range of x i, i =.... Forally: Definition 2. A k-segent approxiation is a tuple L k = ( ) S,..., S k such that the segents S,..., S k partition the interval [, ] into k disjoint sub-intervals. In Figure, at t 2, we display a 2-segent approxiation, with segents ([, 30], 0.4) and ([3, 36], 0.8); and at t 7, we display a 7- segent approxiation, coprising ([, 3], 0.8), ([4, 4], 0.4),..., and ([35, 36], 0.7). When the nuber of segents is unspecified, we siply refer to this as a segent approxiation. Increentally Iproving Visualizations. We are now ready to describe our notion of increentally iproving visualizations. Definition 3. An increentally iproving visualization is defined to be a sequence of segent approxiations, (L,..., L ), where the ith ite L i, i > in the sequence is a i-segent approxiation, fored by selecting one of the segents in the (i )- segent approxiation L i, and dividing that segent into two. Thus, the segent approxiations that coprise an increentally iproving visualization have a very special relationship to each other: each one is a refineent of the previous, revealing one new feature of the visualization and is fored by dividing one of the segents S in the i-segent approxiation into two new segents to give an (i + )-segent approxiation: we call this process splitting a segent. The group within S L i iediately following which the split occurs is referred to as a split group. Any group in the interval I S except for the last group can be chosen as a split group. As an exaple, in Figure, the entire second row corresponds to an increentally iproving visualization, where the 2-segent approxiation is generated by taking the segent in the -segent approxiation corresponding to ([, 36], 0.5), and splitting it at group 30 to give ([, 30], 0.4) and ([3, 36], 0.8). Therefore, the split group is 30. The reason why we enforce two neighboring segent approxiations to be related in this way is to ensure that there is continuity in the way the visualization is generated, aking it a sooth user experience. If, on the other hand, each subsequent segent approxiation had no relationship to the previous one, it could be a very jarring experience for the user with the visualizations varying drastically, aking it hard for the to be confident in their decision aking. Underlying Data Model and Output Model. To characterize the perforance of an increentally iproving visualization, we need a odel for the underlying data. We represent the underlying data as a sequence of distributions D,..., D with eans µ,..., µ where, µ i = AVG(Y) for x i X a. To generate our increentally iproving visualization and its constituent segent approxiations, we draw saples fro distributions D,..., D. Our goal is to approxiate the ean values (µ,..., µ ) while taking as few saples as possible. The output of a k-segent approxiation L k can be represented alternately as a sequence of values (ν,..., ν ) such that ν i is equal to the value corresponding to the segent that encopasses x i, i.e., xi S j ν i = η j, where S j(i, η j) L k. By coparing (ν,..., ν ) with (µ,..., µ ), we can evaluate the error corresponding to a k-segent approxiation, as we describe later. Increentally Iproving Visualization Generation Algorith. Given our data odel, an increentally iproving visualization generation algorith proceeds in iterations: at the ith iteration, this algorith takes soe saples fro the distributions D,..., D, and then at the end of the iteration, it outputs the i-segent approxiation L i. We denote the nuber of saples taken in iteration i as N i. When L is output, the algorith terinates. 2.3 Characterizing Perforance There are ultiple ways to characterize the perforance of increentally iproving visualization generation algoriths. Saple Coplexity, Interactivity and Wall-Clock Tie. The first and ost straightforward way to evaluate perforance is by easuring the saples taken in each iteration k, N k, i.e., the saple coplexity. Since the tie taken to acquire the saples is proportional to the nuber of saples in our sapling engine (as shown in [30]), this is a proxy for the tie taken in each iteration. Recent work has identified 500s as a rule of thub for interactivity in exploratory visual data analysis [36], beyond which analysts end up getting frustrated, and as a result explore fewer hypotheses. To enforce this rule of thub, we can ensure that our algoriths take only as any saples per iteration as is feasible within 500s a tie budget. We also introduce a new etric called interactivity that quantifies the overall user experience: λ = k= N k ( k + ) where N k is the nuber of saples taken at iteration k and k is the nuber of iterations where N k > 0. The larger the λ, the saller the interactivity: this easure essentially captures the average waiting tie across iterations where saples are taken. We explore the easure in detail in Section 3.4 and 5.5. A ore direct way to evaluate perforance is to easure the wall-clock tie for each iteration. Error Per Iteration. Since our increentally iproving visualization algoriths trade off taking fewer saples to return results faster, it can also end up returning segent approxiations that are incorrect. We define the l 2 squared error of a k-segent approxiation L k with output sequence (ν,..., ν ) for the distributions k

4 D,..., D with eans µ,..., µ as err(l k ) = (µ i ν i) 2 () We note that there are several reasons a given k-segent approxiation ay be erroneous with respect to the underlying ean values (µ,..., µ ): () We are representing the data using k- segents as opposed to -segents. (2) We use increental refineent: each segent approxiation builds on the previous, possibly erroneous estiates. (3) The estiates for each group and each segent ay be biased due to sapling. These types of error are all unavoidable the first two reasons enable a visualization that increentally iproves over tie, while the last one occurs whenever we perfor sapling: () While a k-segent approxiation does not capture the data exactly, it provides a good approxiation preserving visual features, such as the overall ajor trends and the regions with a large value. Moreover, coputing an accurate k-segent approxiation requires fewer saples and therefore faster than an accurate -segent approxiation. (2) Increental refineent allows users to have a fluid experience, without the visualization changing copletely between neighboring approxiations. At the sae tie, is not uch worse in error than visualizations that change copletely between approxiations, as we will see later. 2.4 Proble Stateent The goal of our increentally iproving visualization generation algorith is, at each iteration k, generate a k-segent approxiation that is not just close to the best possible refineent at that point, but also takes the least nuber of saples to generate such an approxiation. Further, since the decisions ade by our algorith can vary depending on the saples retrieved, we allow the user to specify a failure probability δ, which we expect to be close to zero, such that the algorith satisfies the closeness guarantee with probability at least δ. Proble. Given a query Q T, and the paraeters δ, ɛ, design an increentally iproving visualization generation algorith that, at each iteration k, returns a k-segent approxiation while taking as few saples as possible, such that with probability δ, the error of the k-segent approxiation L k returned at iteration k does not exceed the error of the best k-segent approxiation fored by splitting a segent of L k by no ore than ɛ. 3. VISUALIZATION ALGORITHMS In this section, we gradually build up our solution to Proble. We start with the ideal case when we know the eans of all of the distributions up-front, and then work towards deriving an error guarantee for a single iteration. Then, we propose our increentally iproving visualization generation algorith ISplit assuing the sae guarantee across all iterations. We further discuss how we can tune the guarantee across iterations in Section 3.4. We consider extensions to other query classes in Section 3.5. We can derive siilar algoriths and guarantees for heataps [2]. All the proofs can be found in the technical report [2]. 3. Case : The Ideal Scenario We first consider the ideal case where the eans µ,..., µ of the distributions D,..., D are known. Then, our goal reduces to obtaining the best k segent approxiation of the distributions at iteration k, while respecting the refineent restriction. Say the increentally iproving visualization generation algorith has obtained a k-segent approxiation L k at the end of iteration k. Then, at iteration (k + ), the task is to identify a segent S i L k such that splitting S i into two new segents T and U iniizes the i= error of the corresponding (k + )-segent approxiation L k+. We describe the approach, followed by its justification. Approach. At each iteration, we split the segent S i L k into T U T and U that axiizes the quantity S i (µt µu )2. Intuitively, this quantity defined below as the iproveent potential of a refineent picks segents that are large, and within the, splits where we get roughly equal sized T and U, with large differences between µ T and µ U. Justification. The l 2 squared error of a segent S i (I i, η i), where I i = [p, q] and p q, for the distributions D p,..., D q with eans µ p,..., µ q is err(s i ) = (q p + ) q j=p (µ j η i ) 2 = S i j S i (µ j η i ) 2 Here, S i is the nuber of groups (distributions) encopassed by segent S i. When the eans of the distributions are known, err(s i) will be iniized if we represent the value of segent S i as the ean of the groups encopassed by S i, i.e., setting η i = µ Si = j S i µ j/ S i iniizes err(s i). Therefore, in the ideal scenario, the error of the segent S i is err(s i ) = S i (µ j η i ) 2 = S j S i i j S i µ 2 j µ2 S i (2) Then, using Equation, we can express the l 2 squared error of the k-segent approxiation L k as follows: err(l k ) = k (µ i ν i ) 2 S i = err(s i) i= Now, L k+ is obtained by splitting a segent S i L k into two segents T and U. Then, the error of L k+ is: err(l k+ ) = err(l k ) S i err(s i) + T U err(t ) + err(u) T U = err(l k ) S i (µ T µ U ) 2. We use the above expression to define the notion of iproveent potential. The iproveent potential of a segent S i L k is the iniization of the error of L k+ obtained by splitting S i into T and U. Thus, the iproveent potential of segent S i relative to T and U is T U (S i, T, U) = S i (µt µu )2 (3) For any segent S i = (I i, η i), every group in the interval I i except the last one can be chosen to be the split group (see Section 2.2). The split group axiizing the iproveent potential of S i, iniizes the error of L k+. The axiu iproveent potential of a segent is expressed as follows: T U (S i) = ax (S i, T, U) = ax (µt µu )2 T,U S i T,U S i S i Lastly, we denote the iproveent potential of a given L k+ by φ(l k+, S i, T, U), where φ(l k+, S i, T, U) = (S i, T, U). Therefore, the axiu iproveent potential of L k+, φ (L k+ ) = ax Si L k (S i). When the eans of the distributions are known, at iteration (k + ), the optial algorith siply selects the refineent corresponding to φ (L k+ ), which is the segent approxiation with the axiu iproveent potential. 3.2 Case 2: The Online-Sapling Scenario We now consider the case where the eans µ,..., µ are unknown. Siilar to the previous case, we want to identify a segent S i L k such that splitting S i into T and U results in the axiu iproveent potential. We will first describe our approach for a given iteration assuing saples have been taken, then we will describe our approach for selecting saples. i=

5 3.2. Selecting the Refineent Given Saples Approach. Unfortunately, since the eans are unknown, we cannot easure the exact iproveent potential, so we iniize the epirical iproveent potential based on saples seen so far. Analogous to the previous section, we siply pick the refineent that T U axiizes S i ( µt µu )2, where the µs are the epirical estiates of the eans. Justification. At iteration (k + ), we first draw saples fro the distributions D,..., D to obtain the estiated eans µ,..., µ. For each S i L k, we set its value to η i = µ Si = j S i µ j/ S i, which we call the estiated ean of S i. For any refineent L k+ of L k, we then let the estiated iproveent potential of L k+ be φ(l k+, S i, T, U) = T µ 2 T + U µ 2 U S µ 2 T U S = i S i ( µ T µ U ) 2 For siplicity we denote φ(l k+, S i, T, U) and φ(l k+, S i, T, U) as φ(l k+ ) and φ(l k+ ), respectively. Our goal is to get a guarantee for err(l k+ ) is relative to err(l k ). Instead of a guarantee on the actual error err, for which we would need to know the eans of the distributions, our guarantee is instead on a new quantity, err, which we define to be the epirical error. Given S i = (I i, η i) and Equation 2, err is defined as follows: err (S i) = S i j S i µ 2 j ηi 2. Although err (S i) is not equal to err(s i) when η i µ S, err (S i) converges to err(s i) as η i gets closer to µ S (i.e., as ore saples are taken). Siilarly, the error of k-segent approxiation err (L k ) converges to err(l k ). We show experientally (see Section 5) that optiizing for err (L k+ ) gives us a good solution of err(l k+ ) itself. To derive our guarantee on err, we need one ore piece of terinology. At iteration (k+), we define T (I, η), where I = [p, q] and p q to be a boundary segent if either p or q is a split group in L k. In other words, at the iteration (k + ), all the segents in L k and all the segents that ay appear in L k+ after splitting a segent are called boundary segents. Next, we show that if we estiate the boundary segents accurately, then we can find a split which is very close to the best possible split. Theore. If for every boundary segent T of the k-segent approxiation L k, we obtain an estiate µ T of the ean µ T that satisfies 2 µ T µ T 2 ɛ, then the refineent 6 T L k+ of L k that axiizes the estiated value φ(l k+ ) will have error that exceeds the error of the best refineent L k+ of L k by at ost err (L k+ ) err (L k+) ɛ Deterining the Saple Coplexity To achieve the guarantee for Theore, we need to retrieve a certain nuber of saples fro each of the distributions. Approach. Perhaps soewhat surprisingly, we find that we need to draw a constant C = 288 a σ 2 ɛ 2 ln ( 4 δ ) fro each distribution D i to satisfy the requireents of Theore. (We will define these paraeters subsequently.) Thus, our sapling approach is rearkably siple and is essentially unifor sapling plus, as we show in the next subsection, other approaches cannot provide significantly better guarantees. What is not siple, however, is showing that taking C saples allows us to satisfy the requireents for Theore. Another benefit of unifor sapling is that we can switch between showing our segent approxiations or showing the actual running estiates of each of the groups (as in online aggregation [9]) for the latter purpose, unifor sapling is trivially optial. Justification. To justify our choice, we assue that the data is generated fro a sub-gaussian distribution [42]. Sub-Gaussian distributions for a general class of distributions with a strong tail decay property, with its tails decaying at least as rapidly as the tails of a Gaussian distribution. This class encopasses Gaussian distributions as well as distributions where extree outliers are rare this is certainly true when the values derive fro real observations that are bounded. We validate this in our experients. Therefore, we represent the distributions D,..., D by sub-gaussian distributions with ean µ i and sub-gaussian paraeter σ. Given this generative assuption, we can deterine the nuber of saples required to obtain an estiate with a desired accuracy using Hoeffding s inequality [2]. Given Hoeffding s inequality along with the union bound, we can derive an upper bound on the nuber of saples we need to estiate the ean µ i of each x i. Lea. For a fixed δ > 0 and a k-segent approxiation of the distributions D,..., D represented by independent rando saples x,..., x with sub-gaussian paraeter σ 2 and ean 288 a σ 2 ɛ 2 µ i [0, a] if we draw C = ln ( ) 4 saples uniforly δ fro each x i, then with probability at least δ, 2 µ T µ T 2 ɛ for every boundary segent T of L 6 T k Deriving a Lower bound We can derive a lower bound for the saple coplexity of any algorith for Proble : Theore 2. Say we have a dataset D of groups with eans (µ, µ 2,..., µ ). Assue there exists an algorith A that finds a k-segent approxiation which is ɛ-close to the optial k-segent approxiation with probability 2/3. For sufficiently sall ɛ, A draws Ω( k/ɛ 2 ) saples. The theore states that any algorith that outputs a k-segent approxiation of which is ɛ-close to a dataset has to draw O( k/ɛ 2 ) saples fro the dataset. 3.3 The ISplit Algorith We now present our increentally iproving visualization generation algorith ISplit. Given the paraeters ɛ and δ, ISplit aintains the sae guarantee of error (ɛ) in generating the segent approxiations in each iteration. Theore and Lea suffice to show that the ISplit algorith is a greedy approxiator, that, at each iteration identifies a segent approxiation that is at least ɛ close to the best segent approxiation for that iteration. Data: X a, Y, δ, ɛ Start with the -segent approxiator L = (L ). 2 for k = 2,..., do 3 L k = (S,..., S k ). 4 for each S i L k do 5 for each group q S p do 6 Draw C saples uniforly. Copute ean µ q end 7 Copute µ Si = S i q S µ q i end T U 8 Find (T, U) = argax i; T,U Si S i ( µ T µ U ) 2. 9 Update L k+ = S,..., S i, T, U, S i+,..., S k. end Algorith : ISplit The paraeter ɛ is often difficult for end-users to specify. Therefore, in INCVISAGE, we allow users to instead explicitly specify an expected tie budget per iteration, B as explained in Section 2.3, the nuber of saples taken deterines the tie taken per iteration, so we can reverse engineer the nuber of saples fro B. So, given the sapling rate f of the sapling engine (see Section 2.) and B, the saple coplexity per iteration per group is siply C = B f/. Using Lea, we can copute the corresponding ɛ. Another way of setting B is to use standard rules of thub for interactivity (see Section 2.3).

6 3.4 Tuning ɛ Across Iterations So far, we have considered only the case where the algorith k= T k k, where k takes the sae nuber of saples C per group across iterations and does not reuse the saples across different iterations. If we instead reuse the saples drawn in previous iterations, the error at iteration k is ɛ a σ2 k = ln ( ) 4 kc δ, where k C is the total nuber of saples drawn so far. Therefore, the decrease in the error up to iteration k, ɛ k, fro the error up to iteration (k ), ɛ k, is (k )/k, where ɛ k = (k )/k ɛ k. This has the following effect: later iterations both take the sae aount of tie as previous ones, and produce only inor updates of the visualization. Sapling Approaches. This observation indicates that we can obtain higher interactivity with only a sall effect on the accuracy of the approxiations by considering variants of the algorith that decrease the nuber of saples across iterations instead of drawing the sae nuber of saples in each iteration. We consider three natural approaches for deterining the nuber of saples we draw at each iteration: linear decrease (i.e., reduce the nuber of saples by β at each iteration), geoetric decrease (i.e., divide the nuber of saples by α at each iteration), and all-upfront (i.e., non-interactive) sapling. To copare these different approaches to the constant (unifor) sapling approach and aongst each other, we first copute the total saple coplexity, interactivity, and error guarantees of each of the as a function of their other paraeters. Letting T k denote the total nuber of saples drawn in the first k iterations, the interactivity of a sapling approach defined in Section 2.3 can be written as: λ = is the nuber of iterations we take non-zero saples. The error guarantees we consider are, as above, the average error over all iterations. This error guarantee is err = ɛ 2 k = i= i= 288 a σ 2 2 T k ln ( 4 δ ) = 288 a σ 2 2 ln ( ) 4 δ i= We provide evidence that the estiated err and λ are siilar to err and λ on real datasets in Section We are now ready to derive the expressions for both error and interactivity for the sapling approaches entioned earlier. Expressions for Error and Interactivity. The analytical expressions of both interactivity and error for all of these four approaches are shown in Table and derived in the technical report [2]. In particular, for succinctness, we have left the forulae for Error for geoetric and linear decrease as is without siplification; on substituting T k into the expression for Error, we obtain fairly coplicated forulae with dozens of ters, including ultiple uses of the q-digaa function Ψ [25] for the geoetric decrease case, the Euler-Mascheroni constant [44] for the linear decrease case, and the Haronic nuber H for the constant sapling case: these expressions are provided in the technical report [2]. Coparing the Sapling Approaches. We first exaine the allupfront sapling approach. We find that the all-upfront sapling approach is strictly worse than constant sapling (which is a special case of linear and geoetric decrease). Theore 3. If for a setting of paraeters, a constant sapling approach and an all-upfront sapling approach have the sae interactivity, then the error of constant sapling is strictly less than the error of all-upfront sapling. Our experiental results in Section 5.5 suggests that the geoetric decrease approach with the optial choice of paraeter α has better error than not only the all-upfront approach but the linear decrease and constant sapling approaches as well. This reains an open question, but when we fix the initial saple coplexity N (which is proportional to the bound on the axiu tie taken per iteration as provided by the user), we can show that geoetric decrease with the right choice of paraeter α does have the optial T k interactivity aong the three interactive approaches. Theore 4. Given N, the interactivity of geoetric decrease with paraeter α = (N ) /( ) is strictly better than the interactivity of any linear decrease approach and constant sapling. The proof is established by first showing that for both linear and geoetric decrease, the optial interactivity is obtained for an explicit choice of decrease paraeter that depends only on the initial nuber of saples and the total nuber of iterations. Lea 2. In geoetric sapling with a fixed N, α = (N ) /( ) has the optial interactivity and it has saller error than all of the geoetric sapling approaches with α > α. Lea 3. In linear sapling with a fixed N, β = (N )/( ) has the optial interactivity and it has strictly saller error than all of the linear sapling approaches with β > β. Optial α and Knee Region. As shown in Lea 2, given N, we can copute the optial decrease paraeter α, such that any value of α > α results in higher error and lesser interactivity (higher λ). This behavior results into the eergence of a knee region in the error-interactivity curve which is confired in our experiental results (see Section 5.5). Essentially, starting fro α = any value α α has saller error than any value α > α. Therefore, for any given N there is an optial region [, α ]. For exaple, for N = 5000, 25000, 5000, and 0000, the optial range of α is [,.024], [,.028], [,.03], and [,.032], respectively. By varying α along the optial region one can either optiize for error or interactivity. For exaple, starting with α = as α α we trade accuracy for interactivity. 3.5 Extensions The increentally iproving visualization generation algorith described previously for siple queries with the AVG aggregation function can also be extended to support aggregations such as COUNT and SUM, as well as additional predicates (via a WHERE clause) details and proofs can be found in the technical report [2] The COUNT Aggregate Function Given that we know the total nuber of tuples in the database, estiating the COUNT aggregate function essentially corresponds to the proble of estiating the fraction of tuples τ i that belong to each group i. We focus on the setting below when we only use our bitap index. We note that approxiation techniques for COUNT queries have also been studied previously [8, 24, 26], for the case when no indexes are available. Using our sapling engine, we can estiate the fractions τ i by scanning the bitap index for each group. When we retrieve another saple fro group i, we can also estiate the nuber of tuples we needed to skip over to reach the tuple that belongs to group i the indices allow us to estiate this inforation directly, providing us an estiate for τ i, i.e., τ i. Theore 5. With an expected total nuber of saples C count = + γ 2 ln(2/δ)/2, the τ i, i can be estiated to within a factor γ, i.e., τ i τ i γ holds with probability at least δ. Essentially, the algorith takes as any saples fro each group until the index of the saple is γ 2 ln(2/δ)/2. We can show that overall, the expected nuber of saples is bounded above by C count. Since this nuber is sall, we don t even need to increentally estiate or visualize COUNT The SUM Aggregate Function. There are two settings we consider for the SUM aggregate function: when the group sizes (i.e., the nuber of tuples in each group) are known, and when they are unknown. Known Group Sizes. A siple variant of Algorith can also be used to approxiate SUM in this case. Let n i be the size of

7 Approach decrease paraeter T k Error Interactivity Linear Decrease β N k k (k ) β 2 A ( ) k= T N k N k 2 + β [ 6 2k 2 3k + 3k + 3 ] α Geoetric Decrease α N k α k A [ (α ) k= N (α ) α +α ] T k (α ) 2 α Constant Sapling α =, or β = 0 kn A H N N (+) 2 A All-upfront T k= = N and T k> = 0 N N 288 a σ2 Table : Expressions for error and interactivity for different sapling approaches. Here, A = 2 group i and κ = ax j n j. As in the original setting, the algorith coputes the estiate µ i of the average of the group. Then s i = n i µ i is an estiate on the su of each group i, naely s i, that is used in place of the average in the rest of the algorith. We have: Theore 6. Assue we have C i = 288a2 σ 2 n 2 i ɛ 2 κ 2 ln 4 δ saples fro group i. Then, the refineent L k+ of L k that axiizes the estiated iproveent potential φ + (L k+ ) will have error that exceeds the error of the best refineent L k+ of L k by at ost err + (L k+ ) err + (L k+) ɛκ 2. Note that here we have ɛκ 2 instead of ɛ: while at first glance this ay see like an aplification of the error, it is not so: first, the scales are different and visualizing the SUM is like visualizing AVG ultiplied by κ an error by one pixel for the forer would equal κ ties the error by one pixel for the latter but are visually equivalent; second, our error function uses a squared l 2- like distance, explaining the κ 2. Unknown Group Sizes. For this case, we siply eploy the known group size case, once the COUNT estiation is used to first approxiate the τ i with γ = ɛ/24a. We have s i = τ i µ iκ t, where κ t denotes the total nuber of ites: i= ni. Theore 7. Assue we have C i = 52 a2 σ 2 τ 2 i ɛ 2 ln 4 δ saples fro group i. Then, the refineent L k+ of L k that axiizes the estiated φ + (L k+ ) will have error that exceeds the error of the best refineent L k+ of L k by at ost err + (L k+ ) err + (L k+) ɛκ 2 t. Q T = SELECT X a, AVG(Y) FROM R GROUP BY X a ORDER BY X a WHERE Pred 4. INCVISAGE SYSTEM ARCHITECTURE The INCVISAGE client is a web-based front-end that captures user input and renders visualizations produced by the INCVISAGE back-end. The INCVISAGE back-end is coposed of three coponents: (A) a query parser, which is responsible for parsing the input query Q T or Q H (see Section 2.), (B) a view processor, which executes ISplit (see Section 3), and (C) a sapling engine which returns the saples requested by the view processor at each iteration. As discussed in Section 2., INCVISAGE uses a bitap-based sapling engine to retrieve a rando saple of records atching a set of ad-hoc conditions [30]. At the end of each iteration, the view processor sends the visualizations generated to the front-end encoded in json. To reduce the aount of json data transferred, only the changes in the visualization are counicated to the front-end. The front-end is responsible for capturing user input and rendering visualizations generated by the INCVISAGE back-end. The visualizations (i.e., the segent approxiations) generated by the back-end increentally iprove over tie, but the users can pause and replay the visualizations on deand. The details of the frontend and screenshots, along with an architecture diagra can be found the technical report [2] 5. PERFORMANCE EVALUATION ln ( ) 4 δ We evaluate our algoriths on three priary etrics: the error of the visualizations, the degree of correlation with the best algorith in choosing the split groups, and the runtie perforance. We also discuss how the choice of the initial saples N and sapling factor α ipacts the error and interactivity. 5. Experiental Setup Algoriths Tested. Each of the increentally iproving visualization generation algoriths that we evaluate perfors unifor sapling, and take either B (tie budget) and f (sapling rate of the sapling engine) as input, or ɛ (desired error threshold) and δ (the probability of correctness) as input, and in either case coputes the required N and α. The algoriths are as follows: ISplit: At each iteration k, the algorith draws N k saples uniforly fro each group, where N k is the total nuber of saples requested at iteration k and is the total nuber of groups. ISplit uses the concept of iproveent potential (see Section 3) to split an existing segent into two new segents. RSplit: At each iteration k, the algorith takes the sae saples as ISplit but then selects a segent, and a split group within that, all at rando to split. ISplit-Best: The algorith siulates the ideal case where the ean of all the groups are known upfront (see Section 3.). The visualizations generated have the lowest possible error at any given iteration (i.e. for any k-segent approxiation) aong approaches that perfor refineent of previous approxiation. DPSplit: We describe the details of this algorith in our technical report [2], but at a high level, at each iteration k, this algorith takes the sae saples as ISplit, but instead of perforing refineent, DPSplit recoputes the best possible k-segent approxiation using dynaic prograing. We include this algorith to easure the ipact of the iterative refineent constraint. DPSplit-Best: This algorith siulates the case where the eans of all the groups are known upfront, and the sae approach for producing k-segent approxiations used by DPSplit is used. The visualizations generated have the lowest possible error aong the algoriths entioned above since they have advance knowledge of the eans, and do not need to obey iterative refineent. Nae Description #Rows Size (GB) E U Sensor Intel Sensor dataset []. 2.2M 0.73 FL US Flight dataset [6]. 74M 7.2 T 20 NYC taxi trip data [4] 70M 6.3 T2 202 NYC taxi trip data [4] 66M 4.5 T3 203 NYC taxi trip data [4] 66M 4.3 WG Weather data of US [7] 45M 27 Table 2: Datasets Used for the Experients and User Studies. E = Experients and U = User Studies (Section 6 and 7). Datasets and Queries. The datasets used in the perforance evaluation experients and the user studies (Section 6 and Section 7) are shown in Table 2 and are arked by ticks ( ) to indicate where they were used. For the US flight dataset (FL) we show the results for the attribute Arrival Delay. Since all of the three years exhibit siilar results for the NYC taxi dataset, we only present the results for the year 20 (T) for the attribute Trip Tie. For the weather dataset, we show results for the attribute Teperature. The results for the other datasets are included in our technical report [2]. To verify our sub-gaussian assuption, we generated a Q-Q plot [46] for each of the selected attributes to copare the

8 distributions with Gaussian distributions. The FL, T, T2, and T3 datasets all exhibit right-skewed Gaussian distributions while WG exhibits a truncated Gaussian distribution [2]. Unless stated explicitly, we use the sae query in all the experients calculate the average of an attribute at each day of the year. Metrics. We use the following etrics to evaluate the algoriths: Error: We easure the error of the visualizations generated at each iteration k via err(l k ) (see Section 2.3). The average error across iterations is coputed as: ẽrr(l k ) = k= err(l k). Tie: We also evaluate the wall-clock execution tie. Correlation: We use Spearan s ranked correlation coefficient to easure the degree of correlation between two algoriths in choosing the order of groups as split groups. Interactivity: We use a new etric called interactivity (defined in Section 2.3) to select appropriate paraeters for ISplit. Interactivity is essentially the average waiting tie for generating the segent approxiations. We explore the easure in Section 5.5. Ipleentation Setup. We evaluate the runtie perforance of all our algoriths on a bitap-based sapling engine [3]. In addition, we ipleent a SCAN algorith which perfors a sequential scan of the entire dataset. This is the approach a traditional database syste would use. Since both ISplit and SCAN are ipleented on the sae engine, we can ake direct coparisons of execution ties between the two algoriths. All our experients are single threaded and are conducted on a HP-Z230-SFF workstation with an Intel Xeon E3-240 CPU and 6GB eory running Ubuntu 2.04 LTS. We set the disk read block size to 256KB. Unless explicitly stated we assue the tie budget B = 500s and use the paraeter values of N = 25000, α =.02 (with a geoetric decrease), and δ = All experients were averaged over 30 trials. 5.2 Coparing Execution Tie In this section, we copare the Wall Clock tie of ISplit, DPSplit and SCAN for the datasets entioned in Table 2. Suary: ISplit is several orders of agnitude faster than SCAN in revealing increental visualizations. The copletion tie of DPSplit exceeds the copletion tie of even SCAN. Moreover, when generating the early segent approxiations, DPSplit always exhibits higher latency copared to ISplit. Wall Clock Tie (sec) SCAN Copletion-DPSplit Iteration = 00-DPSplit Iteration = 50-DPSplit Iteration = 0-DPSplit Copletion-ISplit Iteration = 00-ISplit Iteration = 50-ISplit Iteration = 0-ISplit FL (70) T (70) WG (45) Datasets Figure 2: Coparison of Wall Clock Tie. We depict the wall-clock tie in Figure 2 for three different datasets for ISplit and DPSplit at iteration 0, 50, and 00, and at copletion, and for SCAN. First note that as the size of the dataset increases, the tie for SCAN drastically increases since the aount of data that needs to be scanned increases. On the other hand, the tie for copletion for ISplit stays stable, and uch saller than SCAN for all datasets: on the largest dataset, the copletion tie is alost 6 th that of SCAN. When considering earlier iterations, ISplit perfors even better, revealing the first 0, 50, and 00 segent approxiations within 5 seconds, 3 seconds, and 22 seconds, respectively, for all datasets, allowing users to get insights early by taking a sall nuber of saples. Copared to SCAN, the speed-up in revealing the first 0 features of the visualization is 2X, 22X, and 46X for the FL, T and WG datasets. When coparing ISplit to DPSplit, we first note that DPSplit is taking the sae saples as ISplit, so its differences in execution tie are priarily due to coputation tie. We see soe strange behavior in that while DPSplit is worse than ISplit for the early iterations, for copletion it is uch worse, and in fact worse than SCAN. The dynaic prograing coputation coplexity depends on the nuber of segents. Therefore, the coputation starts occupying a larger fraction of the overall wall-clock tie for the latter iterations, rendering DPSplit ipractical for latter iterations. Even for earlier iterations, DPSplit is worse than ISplit, revealing the first 0, 50, and 00 features within 7 seconds, 27 seconds, and 75 seconds, respectively, as opposed to 5, 3, and 22 for ISplit. As we will see later, this additional tie does not coe with a coensurate decrease in error, aking DPSplit uch less desirable than ISplit as an increentally iproving visualization generation algorith. 5.3 Increental Iproveent Evaluation We now copare the error for ISplit with RSplit and ISplit-Best. Suary: (a) The error of ISplit, RSplit, and ISplit-Best reduce across iterations. At each iteration, ISplit exhibits lower error in generating visualizations than RSplit. (b) ISplit exhibits higher correlation with ISplit-Best in the choice of split groups, with 0.9 for any N greater than RSplit has near-zero correlation overall. Figure 3 depicts the iterations on the x-axis and the l 2-error on the y axis of the corresponding segent approxiations for each of the algoriths for two datasets (others are siilar). For exaple, in Figure 3, at the first iteration, all three algoriths obtain the - segent approxiation with roughly siilar error; and as the nuber of iterations increase, the error continues to decrease. The error for ISplit is lower than RSplit throughout, for all datasets, justifying our choice of iproveent potential as a good etric to guide splitting criteria. ISplit-Best has lower error than the other two, this is because ISplit-Best has access to the eans for each group beforehand. We also explore the correlation of the split group order of ISplit and RSplit with ISplit-Best on varying the saple coplexity. Due to space liitations, the results can be found in our technical report [2] where we show that ISplit s split groups actually atch those fro ISplit-Best, even if the error is a bit higher. err(lk) ISplit RSplit ISplit-Best Iteration a. FL err(lk) ISplit RSplit ISplit-Best Iteration b. T Figure 3: l 2 squared error of ISplit, Rsplit, and ISplit-Best. 5.4 Releasing the Refineent Restriction We now copare ISplit with DPSplit and DPSplit-Best we ai to evaluate the ipact of the refineent restriction, and whether it leads to substantially lower error. Due to space liitations we again present the experients in our technical report [2]. Suary: Given the sae set of paraeters, DPSplit and ISplit have roughly siilar error; as expected DPSplit-Best has uch lower error than both ISplit and DPSplit. Thus, in order to reduce the drastic variation of the segent approxiations, while not having a significant ipact on error, ISplit is a better choice than DPSplit. 5.5 Optiizing for Error and Interactivity So far, we have kept the sapling paraeters fixed across algoriths; here we study the ipact of these paraeters. Specifically,

I ve Seen Enough : Incrementally Improving Visualizations to Support Rapid Decision Making

I ve Seen Enough : Incrementally Improving Visualizations to Support Rapid Decision Making I ve Seen Enough : Incrementally Improving Visualizations to Support Rapid Decision Maing Sajjadur Rahman Maryam Aliabarpour Ha Kyung Kong Eric Blais 3 Karrie Karahalios,4 Aditya Parameswaran Ronitt Rubinfield

More information

Predicting x86 Program Runtime for Intel Processor

Predicting x86 Program Runtime for Intel Processor Predicting x86 Progra Runtie for Intel Processor Behra Mistree, Haidreza Haki Javadi, Oid Mashayekhi Stanford University bistree,hrhaki,oid@stanford.edu 1 Introduction Progras can be equivalent: given

More information

Clustering. Cluster Analysis of Microarray Data. Microarray Data for Clustering. Data for Clustering

Clustering. Cluster Analysis of Microarray Data. Microarray Data for Clustering. Data for Clustering Clustering Cluster Analysis of Microarray Data 4/3/009 Copyright 009 Dan Nettleton Group obects that are siilar to one another together in a cluster. Separate obects that are dissiilar fro each other into

More information

Identifying Converging Pairs of Nodes on a Budget

Identifying Converging Pairs of Nodes on a Budget Identifying Converging Pairs of Nodes on a Budget Konstantina Lazaridou Departent of Inforatics Aristotle University, Thessaloniki, Greece konlaznik@csd.auth.gr Evaggelia Pitoura Coputer Science and Engineering

More information

EE 364B Convex Optimization An ADMM Solution to the Sparse Coding Problem. Sonia Bhaskar, Will Zou Final Project Spring 2011

EE 364B Convex Optimization An ADMM Solution to the Sparse Coding Problem. Sonia Bhaskar, Will Zou Final Project Spring 2011 EE 364B Convex Optiization An ADMM Solution to the Sparse Coding Proble Sonia Bhaskar, Will Zou Final Project Spring 20 I. INTRODUCTION For our project, we apply the ethod of the alternating direction

More information

A Trajectory Splitting Model for Efficient Spatio-Temporal Indexing

A Trajectory Splitting Model for Efficient Spatio-Temporal Indexing A Trajectory Splitting Model for Efficient Spatio-Teporal Indexing Slobodan Rasetic Jörg Sander Jaes Elding Mario A. Nasciento Departent of Coputing Science University of Alberta Edonton, Alberta, Canada

More information

An Efficient Approach for Content Delivery in Overlay Networks

An Efficient Approach for Content Delivery in Overlay Networks An Efficient Approach for Content Delivery in Overlay Networks Mohaad Malli, Chadi Barakat, Walid Dabbous Projet Planète, INRIA-Sophia Antipolis, France E-ail:{alli, cbarakat, dabbous}@sophia.inria.fr

More information

Performance Analysis of RAID in Different Workload

Performance Analysis of RAID in Different Workload Send Orders for Reprints to reprints@benthascience.ae 324 The Open Cybernetics & Systeics Journal, 2015, 9, 324-328 Perforance Analysis of RAID in Different Workload Open Access Zhang Dule *, Ji Xiaoyun,

More information

A Learning Framework for Nearest Neighbor Search

A Learning Framework for Nearest Neighbor Search A Learning Fraework for Nearest Neighbor Search Lawrence Cayton Departent of Coputer Science University of California, San Diego lcayton@cs.ucsd.edu Sanjoy Dasgupta Departent of Coputer Science University

More information

Theoretical Analysis of Local Search and Simple Evolutionary Algorithms for the Generalized Travelling Salesperson Problem

Theoretical Analysis of Local Search and Simple Evolutionary Algorithms for the Generalized Travelling Salesperson Problem Theoretical Analysis of Local Search and Siple Evolutionary Algoriths for the Generalized Travelling Salesperson Proble Mojgan Pourhassan ojgan.pourhassan@adelaide.edu.au Optiisation and Logistics, The

More information

A Hybrid Network Architecture for File Transfers

A Hybrid Network Architecture for File Transfers JOURNAL OF IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 6, NO., JANUARY 9 A Hybrid Network Architecture for File Transfers Xiuduan Fang, Meber, IEEE, Malathi Veeraraghavan, Senior Meber,

More information

Image Filter Using with Gaussian Curvature and Total Variation Model

Image Filter Using with Gaussian Curvature and Total Variation Model IJECT Vo l. 7, Is s u e 3, Ju l y - Se p t 016 ISSN : 30-7109 (Online) ISSN : 30-9543 (Print) Iage Using with Gaussian Curvature and Total Variation Model 1 Deepak Kuar Gour, Sanjay Kuar Shara 1, Dept.

More information

Collaborative Web Caching Based on Proxy Affinities

Collaborative Web Caching Based on Proxy Affinities Collaborative Web Caching Based on Proxy Affinities Jiong Yang T J Watson Research Center IBM jiyang@usibco Wei Wang T J Watson Research Center IBM ww1@usibco Richard Muntz Coputer Science Departent UCLA

More information

The optimization design of microphone array layout for wideband noise sources

The optimization design of microphone array layout for wideband noise sources PROCEEDINGS of the 22 nd International Congress on Acoustics Acoustic Array Systes: Paper ICA2016-903 The optiization design of icrophone array layout for wideband noise sources Pengxiao Teng (a), Jun

More information

A Measurement-Based Model for Parallel Real-Time Tasks

A Measurement-Based Model for Parallel Real-Time Tasks A Measureent-Based Model for Parallel Real-Tie Tasks Kunal Agrawal 1 Washington University in St. Louis St. Louis, MO, USA kunal@wustl.edu https://orcid.org/0000-0001-5882-6647 Sanjoy Baruah 2 Washington

More information

Investigation of The Time-Offset-Based QoS Support with Optical Burst Switching in WDM Networks

Investigation of The Time-Offset-Based QoS Support with Optical Burst Switching in WDM Networks Investigation of The Tie-Offset-Based QoS Support with Optical Burst Switching in WDM Networks Pingyi Fan, Chongxi Feng,Yichao Wang, Ning Ge State Key Laboratory on Microwave and Digital Counications,

More information

A Learning Framework for Nearest Neighbor Search

A Learning Framework for Nearest Neighbor Search A Learning Fraework for Nearest Neighbor Search Lawrence Cayton Departent of Coputer Science University of California, San Diego lcayton@cs.ucsd.edu Sanjoy Dasgupta Departent of Coputer Science University

More information

Geo-activity Recommendations by using Improved Feature Combination

Geo-activity Recommendations by using Improved Feature Combination Geo-activity Recoendations by using Iproved Feature Cobination Masoud Sattari Middle East Technical University Ankara, Turkey e76326@ceng.etu.edu.tr Murat Manguoglu Middle East Technical University Ankara,

More information

Planet Scale Software Updates

Planet Scale Software Updates Planet Scale Software Updates Christos Gkantsidis 1, Thoas Karagiannis 2, Pablo Rodriguez 1, and Milan Vojnović 1 1 Microsoft Research 2 UC Riverside Cabridge, UK Riverside, CA, USA {chrisgk,pablo,ilanv}@icrosoft.co

More information

Oblivious Routing for Fat-Tree Based System Area Networks with Uncertain Traffic Demands

Oblivious Routing for Fat-Tree Based System Area Networks with Uncertain Traffic Demands Oblivious Routing for Fat-Tree Based Syste Area Networks with Uncertain Traffic Deands Xin Yuan Wickus Nienaber Zhenhai Duan Departent of Coputer Science Florida State University Tallahassee, FL 3306 {xyuan,nienaber,duan}@cs.fsu.edu

More information

OPTIMAL COMPLEX SERVICES COMPOSITION IN SOA SYSTEMS

OPTIMAL COMPLEX SERVICES COMPOSITION IN SOA SYSTEMS Key words SOA, optial, coplex service, coposition, Quality of Service Piotr RYGIELSKI*, Paweł ŚWIĄTEK* OPTIMAL COMPLEX SERVICES COMPOSITION IN SOA SYSTEMS One of the ost iportant tasks in service oriented

More information

A Low-Cost Multi-Failure Resilient Replication Scheme for High Data Availability in Cloud Storage

A Low-Cost Multi-Failure Resilient Replication Scheme for High Data Availability in Cloud Storage 216 IEEE 23rd International Conference on High Perforance Coputing A Low-Cost Multi-Failure Resilient Replication Schee for High Data Availability in Cloud Storage Jinwei Liu* and Haiying Shen *Departent

More information

Fair Energy-Efficient Sensing Task Allocation in Participatory Sensing with Smartphones

Fair Energy-Efficient Sensing Task Allocation in Participatory Sensing with Smartphones Advance Access publication on 24 February 2017 The British Coputer Society 2017. All rights reserved. For Perissions, please eail: journals.perissions@oup.co doi:10.1093/cojnl/bxx015 Fair Energy-Efficient

More information

The Boundary Between Privacy and Utility in Data Publishing

The Boundary Between Privacy and Utility in Data Publishing The Boundary Between Privacy and Utility in Data Publishing Vibhor Rastogi Dan Suciu Sungho Hong ABSTRACT We consider the privacy proble in data publishing: given a database instance containing sensitive

More information

Energy-Efficient Disk Replacement and File Placement Techniques for Mobile Systems with Hard Disks

Energy-Efficient Disk Replacement and File Placement Techniques for Mobile Systems with Hard Disks Energy-Efficient Disk Replaceent and File Placeent Techniques for Mobile Systes with Hard Disks Young-Jin Ki School of Coputer Science & Engineering Seoul National University Seoul 151-742, KOREA youngjk@davinci.snu.ac.kr

More information

Smarter Balanced Assessment Consortium Claims, Targets, and Standard Alignment for Math

Smarter Balanced Assessment Consortium Claims, Targets, and Standard Alignment for Math Sarter Balanced Assessent Consortiu s, s, Stard Alignent for Math The Sarter Balanced Assessent Consortiu (SBAC) has created a hierarchy coprised of clais targets that together can be used to ake stateents

More information

Adaptive Holistic Scheduling for In-Network Sensor Query Processing

Adaptive Holistic Scheduling for In-Network Sensor Query Processing Adaptive Holistic Scheduling for In-Network Sensor Query Processing Hejun Wu and Qiong Luo Departent of Coputer Science and Engineering Hong Kong University of Science & Technology Clear Water Bay, Kowloon,

More information

Efficient Estimation of Inclusion Coefficient using HyperLogLog Sketches

Efficient Estimation of Inclusion Coefficient using HyperLogLog Sketches Efficient Estiation of Inclusion Coefficient using HyperLogLog Sketches Azade Nazi, Bolin Ding, Vivek Narasayya, Surajit Chaudhuri Microsoft Research {aznazi, bolind, viveknar, surajitc}@icrosoft.co ABSTRACT

More information

Data & Knowledge Engineering

Data & Knowledge Engineering Data & Knowledge Engineering 7 (211) 17 187 Contents lists available at ScienceDirect Data & Knowledge Engineering journal hoepage: www.elsevier.co/locate/datak An approxiate duplicate eliination in RFID

More information

ELEVATION SURFACE INTERPOLATION OF POINT DATA USING DIFFERENT TECHNIQUES A GIS APPROACH

ELEVATION SURFACE INTERPOLATION OF POINT DATA USING DIFFERENT TECHNIQUES A GIS APPROACH ELEVATION SURFACE INTERPOLATION OF POINT DATA USING DIFFERENT TECHNIQUES A GIS APPROACH Kulapraote Prathuchai Geoinforatics Center, Asian Institute of Technology, 58 Moo9, Klong Luang, Pathuthani, Thailand.

More information

Boosted Detection of Objects and Attributes

Boosted Detection of Objects and Attributes L M M Boosted Detection of Objects and Attributes Abstract We present a new fraework for detection of object and attributes in iages based on boosted cobination of priitive classifiers. The fraework directly

More information

Joint Measurement- and Traffic Descriptor-based Admission Control at Real-Time Traffic Aggregation Points

Joint Measurement- and Traffic Descriptor-based Admission Control at Real-Time Traffic Aggregation Points Joint Measureent- and Traffic Descriptor-based Adission Control at Real-Tie Traffic Aggregation Points Stylianos Georgoulas, Panos Triintzios and George Pavlou Centre for Counication Systes Research, University

More information

Mapping Data in Peer-to-Peer Systems: Semantics and Algorithmic Issues

Mapping Data in Peer-to-Peer Systems: Semantics and Algorithmic Issues Mapping Data in Peer-to-Peer Systes: Seantics and Algorithic Issues Anastasios Keentsietsidis Marcelo Arenas Renée J. Miller Departent of Coputer Science University of Toronto {tasos,arenas,iller}@cs.toronto.edu

More information

Utility-Based Resource Allocation for Mixed Traffic in Wireless Networks

Utility-Based Resource Allocation for Mixed Traffic in Wireless Networks IEEE IFOCO 2 International Workshop on Future edia etworks and IP-based TV Utility-Based Resource Allocation for ixed Traffic in Wireless etworks Li Chen, Bin Wang, Xiaohang Chen, Xin Zhang, and Dacheng

More information

Minimax Sensor Location to Monitor a Piecewise Linear Curve

Minimax Sensor Location to Monitor a Piecewise Linear Curve NSF GRANT #040040 NSF PROGRAM NAME: Operations Research Miniax Sensor Location to Monitor a Piecewise Linear Curve To M. Cavalier The Pennsylvania State University University Par, PA 680 Whitney A. Conner

More information

COMPUTER GENERATED HOLOGRAMS Optical Sciences 627 W.J. Dallas (Monday, August 23, 2004, 12:38 PM) PART III: CHAPTER ONE DIFFUSERS FOR CGH S

COMPUTER GENERATED HOLOGRAMS Optical Sciences 627 W.J. Dallas (Monday, August 23, 2004, 12:38 PM) PART III: CHAPTER ONE DIFFUSERS FOR CGH S COPUTER GEERATED HOLOGRAS Optical Sciences 67 W.J. Dallas (onday, August 3, 004, 1:38 P) PART III: CHAPTER OE DIFFUSERS FOR CGH S Part III: Chapter One Page 1 of 8 Introduction Hologras for display purposes

More information

A Novel Fast Constructive Algorithm for Neural Classifier

A Novel Fast Constructive Algorithm for Neural Classifier A Novel Fast Constructive Algorith for Neural Classifier Xudong Jiang Centre for Signal Processing, School of Electrical and Electronic Engineering Nanyang Technological University Nanyang Avenue, Singapore

More information

TensorFlow and Keras-based Convolutional Neural Network in CAT Image Recognition Ang LI 1,*, Yi-xiang LI 2 and Xue-hui LI 3

TensorFlow and Keras-based Convolutional Neural Network in CAT Image Recognition Ang LI 1,*, Yi-xiang LI 2 and Xue-hui LI 3 2017 2nd International Conference on Coputational Modeling, Siulation and Applied Matheatics (CMSAM 2017) ISBN: 978-1-60595-499-8 TensorFlow and Keras-based Convolutional Neural Network in CAT Iage Recognition

More information

EFFICIENT VIDEO SEARCH USING IMAGE QUERIES A. Araujo1, M. Makar2, V. Chandrasekhar3, D. Chen1, S. Tsai1, H. Chen1, R. Angst1 and B.

EFFICIENT VIDEO SEARCH USING IMAGE QUERIES A. Araujo1, M. Makar2, V. Chandrasekhar3, D. Chen1, S. Tsai1, H. Chen1, R. Angst1 and B. EFFICIENT VIDEO SEARCH USING IMAGE QUERIES A. Araujo1, M. Makar2, V. Chandrasekhar3, D. Chen1, S. Tsai1, H. Chen1, R. Angst1 and B. Girod1 1 Stanford University, USA 2 Qualco Inc., USA ABSTRACT We study

More information

An Ad Hoc Adaptive Hashing Technique for Non-Uniformly Distributed IP Address Lookup in Computer Networks

An Ad Hoc Adaptive Hashing Technique for Non-Uniformly Distributed IP Address Lookup in Computer Networks An Ad Hoc Adaptive Hashing Technique for Non-Uniforly Distributed IP Address Lookup in Coputer Networks Christopher Martinez Departent of Electrical and Coputer Engineering The University of Texas at San

More information

Evaluation of a multi-frame blind deconvolution algorithm using Cramér-Rao bounds

Evaluation of a multi-frame blind deconvolution algorithm using Cramér-Rao bounds Evaluation of a ulti-frae blind deconvolution algorith using Craér-Rao bounds Charles C. Beckner, Jr. Air Force Research Laboratory, 3550 Aberdeen Ave SE, Kirtland AFB, New Mexico, USA 87117-5776 Charles

More information

Distributed Multicast Tree Construction in Wireless Sensor Networks

Distributed Multicast Tree Construction in Wireless Sensor Networks Distributed Multicast Tree Construction in Wireless Sensor Networks Hongyu Gong, Luoyi Fu, Xinzhe Fu, Lutian Zhao 3, Kainan Wang, and Xinbing Wang Dept. of Electronic Engineering, Shanghai Jiao Tong University,

More information

Design Optimization of Mixed Time/Event-Triggered Distributed Embedded Systems

Design Optimization of Mixed Time/Event-Triggered Distributed Embedded Systems Design Optiization of Mixed Tie/Event-Triggered Distributed Ebedded Systes Traian Pop, Petru Eles, Zebo Peng Dept. of Coputer and Inforation Science, Linköping University {trapo, petel, zebpe}@ida.liu.se

More information

Efficient Learning of Generalized Linear and Single Index Models with Isotonic Regression

Efficient Learning of Generalized Linear and Single Index Models with Isotonic Regression Efficient Learning of Generalized Linear and Single Index Models with Isotonic Regression Sha M. Kakade Microsoft Research and Wharton, U Penn skakade@icrosoft.co Varun Kanade SEAS, Harvard University

More information

Carving Differential Unit Test Cases from System Test Cases

Carving Differential Unit Test Cases from System Test Cases Carving Differential Unit Test Cases fro Syste Test Cases Sebastian Elbau, Hui Nee Chin, Matthew B. Dwyer, Jonathan Dokulil Departent of Coputer Science and Engineering University of Nebraska - Lincoln

More information

Summary. Reconstruction of data from non-uniformly spaced samples

Summary. Reconstruction of data from non-uniformly spaced samples Is there always extra bandwidth in non-unifor spatial sapling? Ralf Ferber* and Massiiliano Vassallo, WesternGeco London Technology Center; Jon-Fredrik Hopperstad and Ali Özbek, Schluberger Cabridge Research

More information

On the Computation and Application of Prototype Point Patterns

On the Computation and Application of Prototype Point Patterns On the Coputation and Application of Prototype Point Patterns Katherine E. Tranbarger Freier 1 and Frederic Paik Schoenberg 2 Abstract This work addresses coputational probles related to the ipleentation

More information

Generating Mechanisms for Evolving Software Mirror Graph

Generating Mechanisms for Evolving Software Mirror Graph Journal of Modern Physics, 2012, 3, 1050-1059 http://dx.doi.org/10.4236/jp.2012.39139 Published Online Septeber 2012 (http://www.scirp.org/journal/jp) Generating Mechaniss for Evolving Software Mirror

More information

1 Extended Boolean Model

1 Extended Boolean Model 1 EXTENDED BOOLEAN MODEL It has been well-known that the Boolean odel is too inflexible, requiring skilful use of Boolean operators to obtain good results. On the other hand, the vector space odel is flexible

More information

Geometry. The Method of the Center of Mass (mass points): Solving problems using the Law of Lever (mass points). Menelaus theorem. Pappus theorem.

Geometry. The Method of the Center of Mass (mass points): Solving problems using the Law of Lever (mass points). Menelaus theorem. Pappus theorem. Noveber 13, 2016 Geoetry. The Method of the enter of Mass (ass points): Solving probles using the Law of Lever (ass points). Menelaus theore. Pappus theore. M d Theore (Law of Lever). Masses (weights)

More information

Designing High Performance Web-Based Computing Services to Promote Telemedicine Database Management System

Designing High Performance Web-Based Computing Services to Promote Telemedicine Database Management System Designing High Perforance Web-Based Coputing Services to Proote Teleedicine Database Manageent Syste Isail Hababeh 1, Issa Khalil 2, and Abdallah Khreishah 3 1: Coputer Engineering & Inforation Technology,

More information

Weeks 1 3 Weeks 4 6 Unit/Topic Number and Operations in Base 10

Weeks 1 3 Weeks 4 6 Unit/Topic Number and Operations in Base 10 Weeks 1 3 Weeks 4 6 Unit/Topic Nuber and Operations in Base 10 FLOYD COUNTY SCHOOLS CURRICULUM RESOURCES Building a Better Future for Every Child - Every Day! Suer 2013 Subject Content: Math Grade 3rd

More information

Defining and Surveying Wireless Link Virtualization and Wireless Network Virtualization

Defining and Surveying Wireless Link Virtualization and Wireless Network Virtualization 1 Defining and Surveying Wireless Link Virtualization and Wireless Network Virtualization Jonathan van de Belt, Haed Ahadi, and Linda E. Doyle The Centre for Future Networks and Counications - CONNECT,

More information

THE rapid growth and continuous change of the real

THE rapid growth and continuous change of the real IEEE TRANSACTIONS ON SERVICES COMPUTING, VOL. 8, NO. 1, JANUARY/FEBRUARY 2015 47 Designing High Perforance Web-Based Coputing Services to Proote Teleedicine Database Manageent Syste Isail Hababeh, Issa

More information

Solving the Damage Localization Problem in Structural Health Monitoring Using Techniques in Pattern Classification

Solving the Damage Localization Problem in Structural Health Monitoring Using Techniques in Pattern Classification Solving the Daage Localization Proble in Structural Health Monitoring Using Techniques in Pattern Classification CS 9 Final Project Due Dec. 4, 007 Hae Young Noh, Allen Cheung, Daxia Ge Introduction Structural

More information

Improve Peer Cooperation using Social Networks

Improve Peer Cooperation using Social Networks Iprove Peer Cooperation using Social Networks Victor Ponce, Jie Wu, and Xiuqi Li Departent of Coputer Science and Engineering Florida Atlantic University Boca Raton, FL 33431 Noveber 5, 2007 Corresponding

More information

Detection of Outliers and Reduction of their Undesirable Effects for Improving the Accuracy of K-means Clustering Algorithm

Detection of Outliers and Reduction of their Undesirable Effects for Improving the Accuracy of K-means Clustering Algorithm Detection of Outliers and Reduction of their Undesirable Effects for Iproving the Accuracy of K-eans Clustering Algorith Bahan Askari Departent of Coputer Science and Research Branch, Islaic Azad University,

More information

Computer Aided Drafting, Design and Manufacturing Volume 26, Number 2, June 2016, Page 13

Computer Aided Drafting, Design and Manufacturing Volume 26, Number 2, June 2016, Page 13 Coputer Aided Drafting, Design and Manufacturing Volue 26, uber 2, June 2016, Page 13 CADDM 3D reconstruction of coplex curved objects fro line drawings Sun Yanling, Dong Lijun Institute of Mechanical

More information

Deterministic Voting in Distributed Systems Using Error-Correcting Codes

Deterministic Voting in Distributed Systems Using Error-Correcting Codes IEEE TRASACTIOS O PARALLEL AD DISTRIBUTED SYSTEMS, VOL. 9, O. 8, AUGUST 1998 813 Deterinistic Voting in Distributed Systes Using Error-Correcting Codes Lihao Xu and Jehoshua Bruck, Senior Meber, IEEE Abstract

More information

Optimal Route Queries with Arbitrary Order Constraints

Optimal Route Queries with Arbitrary Order Constraints IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL.?, NO.?,? 20?? 1 Optial Route Queries with Arbitrary Order Constraints Jing Li, Yin Yang, Nikos Maoulis Abstract Given a set of spatial points DS,

More information

A Broadband Spectrum Sensing Algorithm in TDCS Based on ICoSaMP Reconstruction

A Broadband Spectrum Sensing Algorithm in TDCS Based on ICoSaMP Reconstruction MATEC Web of Conferences 73, 03073(08) https://doi.org/0.05/atecconf/087303073 SMIMA 08 A roadband Spectru Sensing Algorith in TDCS ased on I Reconstruction Liu Yang, Ren Qinghua, Xu ingzheng and Li Xiazhao

More information

Resource Optimization for Web Service Composition

Resource Optimization for Web Service Composition Resource Optiization for Web Coposition Xia Gao, Ravi Jain, Zulfikar Razan, Ulas Kozat Abstract coposition recently eerged as a costeffective way to quickly create new services within a network. Soe research

More information

Control Message Reduction Techniques in Backward Learning Ad Hoc Routing Protocols

Control Message Reduction Techniques in Backward Learning Ad Hoc Routing Protocols Control Message Reduction Techniques in Backward Learning Ad Hoc Routing Protocols Navodaya Garepalli Kartik Gopalan Ping Yang Coputer Science, Binghaton University (State University of New York) Contact:

More information

2013 IEEE Conference on Computer Vision and Pattern Recognition. Compressed Hashing

2013 IEEE Conference on Computer Vision and Pattern Recognition. Compressed Hashing 203 IEEE Conference on Coputer Vision and Pattern Recognition Copressed Hashing Yue Lin Rong Jin Deng Cai Shuicheng Yan Xuelong Li State Key Lab of CAD&CG, College of Coputer Science, Zhejiang University,

More information

Scheduling Parallel Task Graphs on (Almost) Homogeneous Multi-cluster Platforms

Scheduling Parallel Task Graphs on (Almost) Homogeneous Multi-cluster Platforms 1 Scheduling Parallel Task Graphs on (Alost) Hoogeneous Multi-cluster Platfors Pierre-François Dutot, Tchiou N Takpé, Frédéric Suter and Henri Casanova Abstract Applications structured as parallel task

More information

Scheduling Parallel Real-Time Recurrent Tasks on Multicore Platforms

Scheduling Parallel Real-Time Recurrent Tasks on Multicore Platforms IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL., NO., NOV 27 Scheduling Parallel Real-Tie Recurrent Tasks on Multicore Platfors Risat Pathan, Petros Voudouris, and Per Stenströ Abstract We

More information

Novel Image Representation and Description Technique using Density Histogram of Feature Points

Novel Image Representation and Description Technique using Density Histogram of Feature Points Novel Iage Representation and Description Technique using Density Histogra of Feature Points Keneilwe ZUVA Departent of Coputer Science, University of Botswana, P/Bag 00704 UB, Gaborone, Botswana and Tranos

More information

POSITION-PATCH BASED FACE HALLUCINATION VIA LOCALITY-CONSTRAINED REPRESENTATION. Junjun Jiang, Ruimin Hu, Zhen Han, Tao Lu, and Kebin Huang

POSITION-PATCH BASED FACE HALLUCINATION VIA LOCALITY-CONSTRAINED REPRESENTATION. Junjun Jiang, Ruimin Hu, Zhen Han, Tao Lu, and Kebin Huang IEEE International Conference on ultiedia and Expo POSITION-PATCH BASED FACE HALLUCINATION VIA LOCALITY-CONSTRAINED REPRESENTATION Junjun Jiang, Ruiin Hu, Zhen Han, Tao Lu, and Kebin Huang National Engineering

More information

Guillotine subdivisions approximate polygonal subdivisions: Part III { Faster polynomial-time approximation schemes for

Guillotine subdivisions approximate polygonal subdivisions: Part III { Faster polynomial-time approximation schemes for Guillotine subdivisions approxiate polygonal subdivisions: Part III { Faster polynoial-tie approxiation schees for geoetric network optiization Joseph S. B. Mitchell y April 19, 1997; Last revision: May

More information

Relief shape inheritance and graphical editor for the landscape design

Relief shape inheritance and graphical editor for the landscape design Relief shape inheritance and graphical editor for the landscape design Egor A. Yusov Vadi E. Turlapov Nizhny Novgorod State University after N. I. Lobachevsky Nizhny Novgorod Russia yusov_egor@ail.ru vadi.turlapov@cs.vk.unn.ru

More information

QUERY ROUTING OPTIMIZATION IN SENSOR COMMUNICATION NETWORKS

QUERY ROUTING OPTIMIZATION IN SENSOR COMMUNICATION NETWORKS QUERY ROUTING OPTIMIZATION IN SENSOR COMMUNICATION NETWORKS Guofei Jiang and George Cybenko Institute for Security Technology Studies and Thayer School of Engineering Dartouth College, Hanover NH 03755

More information

A Fast Multi-Objective Genetic Algorithm for Hardware-Software Partitioning In Embedded System Design

A Fast Multi-Objective Genetic Algorithm for Hardware-Software Partitioning In Embedded System Design A Fast Multi-Obective Genetic Algorith for Hardware-Software Partitioning In Ebedded Syste Design 1 M.Jagadeeswari, 2 M.C.Bhuvaneswari 1 Research Scholar, P.S.G College of Technology, Coibatore, India

More information

Optimized stereo reconstruction of free-form space curves based on a nonuniform rational B-spline model

Optimized stereo reconstruction of free-form space curves based on a nonuniform rational B-spline model 1746 J. Opt. Soc. A. A/ Vol. 22, No. 9/ Septeber 2005 Y. J. Xiao and Y. F. Li Optiized stereo reconstruction of free-for space curves based on a nonunifor rational B-spline odel Yi Jun Xiao Departent of

More information

A Comparative Study of Two-phase Heuristic Approaches to General Job Shop Scheduling Problem

A Comparative Study of Two-phase Heuristic Approaches to General Job Shop Scheduling Problem IE Vol. 7, No. 2, pp. 84-92, Septeber 2008. A Coparative Study of Two-phase Heuristic Approaches to General Job Shop Scheduling Proble Ji Ung Sun School of Industrial and Managent Engineering Hankuk University

More information

Derivation of an Analytical Model for Evaluating the Performance of a Multi- Queue Nodes Network Router

Derivation of an Analytical Model for Evaluating the Performance of a Multi- Queue Nodes Network Router Derivation of an Analytical Model for Evaluating the Perforance of a Multi- Queue Nodes Network Router 1 Hussein Al-Bahadili, 1 Jafar Ababneh, and 2 Fadi Thabtah 1 Coputer Inforation Systes Faculty of

More information

Set Theoretic Estimation for Problems in Subtractive Color

Set Theoretic Estimation for Problems in Subtractive Color Set Theoretic Estiation for Probles in Subtractive Color Gaurav Shara Digital Iaging Technology Center, Xerox Corporation, MS0128-27E, 800 Phillips Rd, Webster, NY 14580 Received 25 March 1999; accepted

More information

IN many interactive multimedia streaming applications, such

IN many interactive multimedia streaming applications, such 840 IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 8, NO. 5, MAY 06 A Geoetric Approach to Server Selection for Interactive Video Streaing Yaochen Hu, Student Meber, IEEE, DiNiu, Meber, IEEE, and Zongpeng Li, Senior

More information

Approaching the Skyline in Z Order

Approaching the Skyline in Z Order Approaching the Skyline in Z Order KenC.K.Lee Baihua Zheng Huajing Li Wang-Chien Lee cklee@cse.psu.edu bhzheng@su.edu.sg huali@cse.psu.edu wlee@cse.psu.edu Pennsylvania State University, University Park,

More information

A Study of the Relationship Between Support Vector Machine and Gabriel Graph

A Study of the Relationship Between Support Vector Machine and Gabriel Graph A Study of the Relationship Between Support Vector Machine and Gabriel Graph Wan Zhang and Irwin King {wzhang, king}@cse.cuhk.edu.hk Departent of Coputer Science & Engineering The Chinese University of

More information

Privacy-preserving String-Matching With PRAM Algorithms

Privacy-preserving String-Matching With PRAM Algorithms Privacy-preserving String-Matching With PRAM Algoriths Report in MTAT.07.022 Research Seinar in Cryptography, Fall 2014 Author: Sander Sii Supervisor: Peeter Laud Deceber 14, 2014 Abstract In this report,

More information

Gromov-Hausdorff Distance Between Metric Graphs

Gromov-Hausdorff Distance Between Metric Graphs Groov-Hausdorff Distance Between Metric Graphs Jiwon Choi St Mark s School January, 019 Abstract In this paper we study the Groov-Hausdorff distance between two etric graphs We copute the precise value

More information

Multipath Selection and Channel Assignment in Wireless Mesh Networks

Multipath Selection and Channel Assignment in Wireless Mesh Networks Multipath Selection and Channel Assignent in Wireless Mesh Networs Soo-young Jang and Chae Y. Lee Dept. of Industrial and Systes Engineering, KAIST, 373-1 Kusung-dong, Taejon, Korea Tel: +82-42-350-5916,

More information

On Performance Bottleneck of Anonymous Communication Networks

On Performance Bottleneck of Anonymous Communication Networks On Perforance Bottleneck of Anonyous Counication Networks Ryan Pries, Wei Yu, Steve Graha, and Xinwen Fu Abstract Although a significant aount of effort has been directed at discovering attacks against

More information

Smarter Balanced Assessment Consortium Claims, Targets, and Standard Alignment for Math

Smarter Balanced Assessment Consortium Claims, Targets, and Standard Alignment for Math Sarter Balanced Assessent Consortiu Clais, s, Stard Alignent for Math The Sarter Balanced Assessent Consortiu (SBAC) has created a hierarchy coprised of clais targets that together can be used to ake stateents

More information

Detecting Anomalous Structures by Convolutional Sparse Models

Detecting Anomalous Structures by Convolutional Sparse Models Detecting Anoalous Structures by Convolutional Sparse Models Diego Carrera, Giacoo Boracchi Dipartiento di Elettronica, Inforazione e Bioingegeria Politecnico di Milano, Italy {giacoo.boracchi, diego.carrera}@polii.it

More information

Survivability Function A Measure of Disaster-Based Routing Performance

Survivability Function A Measure of Disaster-Based Routing Performance Survivability Function A Measure of Disaster-Based Routing Perforance Journal Club Presentation on W. Molisz. Survivability function-a easure of disaster-based routing perforance. IEEE Journal on Selected

More information

MiPPS: A Generative Model for Multi-Manifold Clustering

MiPPS: A Generative Model for Multi-Manifold Clustering Manifold Learning and its Applications: Papers fro the AAAI Fall Syposiu (FS-9-) MiPPS: A Generative Model for Multi-Manifold Clustering Oluwasani Koyejo and Joydeep Ghosh Electrical and Coputer Engineering

More information

A High-Speed VLSI Fuzzy Inference Processor for Trapezoid-Shaped Membership Functions *

A High-Speed VLSI Fuzzy Inference Processor for Trapezoid-Shaped Membership Functions * JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 21, 607-626 (2005) A High-Speed VLSI Fuzzy Inference Processor for Trapezoid-Shaped Mebership Functions * SHIH-HSU HUANG AND JIAN-YUAN LAI + Departent of

More information

Different criteria of dynamic routing

Different criteria of dynamic routing Procedia Coputer Science Volue 66, 2015, Pages 166 173 YSC 2015. 4th International Young Scientists Conference on Coputational Science Different criteria of dynaic routing Kurochkin 1*, Grinberg 1 1 Kharkevich

More information

Heterogeneous Radial Basis Function Networks

Heterogeneous Radial Basis Function Networks Proceedings of the International Conference on Neural Networks (ICNN ), vol. 2, pp. 23-2, June. Heterogeneous Radial Basis Function Networks D. Randall Wilson, Tony R. Martinez e-ail: randy@axon.cs.byu.edu,

More information

Cassia County School District #151. Expected Performance Assessment Students will: Instructional Strategies. Performance Standards

Cassia County School District #151. Expected Performance Assessment Students will: Instructional Strategies. Performance Standards Unit 1 Congruence, Proof, and Constructions Doain: Congruence (CO) Essential Question: How do properties of congruence help define and prove geoetric relationships? Matheatical Practices: 1. Make sense

More information

HIGH PERFORMANCE PRE-SEGMENTATION ALGORITHM FOR SONAR IMAGES

HIGH PERFORMANCE PRE-SEGMENTATION ALGORITHM FOR SONAR IMAGES HIGH PERFORMANCE PRE-SEGMENTATION ALGORITHM FOR SONAR IMAGES Benjain Lehann*, Konstantinos Siantidis*, Dieter Kraus** *ATLAS ELEKTRONIK GbH Sebaldsbrücker Heerstraße 235 D-28309 Breen, GERMANY Eail: benjain.lehann@atlas-elektronik.co

More information

Multi Packet Reception and Network Coding

Multi Packet Reception and Network Coding The 2010 Military Counications Conference - Unclassified Progra - etworking Protocols and Perforance Track Multi Packet Reception and etwork Coding Aran Rezaee Research Laboratory of Electronics Massachusetts

More information

The Hopcount to an Anycast Group

The Hopcount to an Anycast Group The Hopcount to an Anycast Group Piet Van Mieghe Delft University of Technology Abstract The probability density function of the nuber of hops to the ost nearby eber of the anycast group consisting of

More information

Adaptive Parameter Estimation Based Congestion Avoidance Strategy for DTN

Adaptive Parameter Estimation Based Congestion Avoidance Strategy for DTN Proceedings of the nd International onference on oputer Science and Electronics Engineering (ISEE 3) Adaptive Paraeter Estiation Based ongestion Avoidance Strategy for DTN Qicai Yang, Futong Qin, Jianquan

More information

Closing The Performance Gap between Causal Consistency and Eventual Consistency

Closing The Performance Gap between Causal Consistency and Eventual Consistency Closing The Perforance Gap between Causal Consistency and Eventual Consistency Jiaqing Du Călin Iorgulescu Aitabha Roy Willy Zwaenepoel EPFL ABSTRACT It is well known that causal consistency is ore expensive

More information

Minimax Sensor Location in the Plane

Minimax Sensor Location in the Plane NSF GRANT #040040 NSF PROGRAM NAME: Operations Research Miniax Sensor Location in the Plane To M. Cavalier The Pennsylvania State University University Par, PA 680 Enrique del Castillo The Pennsylvania

More information

Leveraging Relevance Cues for Improved Spoken Document Retrieval

Leveraging Relevance Cues for Improved Spoken Document Retrieval Leveraging Relevance Cues for Iproved Spoken Docuent Retrieval Pei-Ning Chen 1, Kuan-Yu Chen 2 and Berlin Chen 1 National Taiwan Noral University, Taiwan 1 Institute of Inforation Science, Acadeia Sinica,

More information

3 Conference on Inforation Sciences and Systes, The Johns Hopkins University, March, 3 Sensitivity Characteristics of Cross-Correlation Distance Metric and Model Function F. Porikli Mitsubishi Electric

More information

MULTI-INDEX VOTING FOR ASYMMETRIC DISTANCE COMPUTATION IN A LARGE-SCALE BINARY CODES. Chih-Yi Chiu, Yu-Cyuan Liou, and Sheng-Hao Chou

MULTI-INDEX VOTING FOR ASYMMETRIC DISTANCE COMPUTATION IN A LARGE-SCALE BINARY CODES. Chih-Yi Chiu, Yu-Cyuan Liou, and Sheng-Hao Chou MULTI-INDEX VOTING FOR ASYMMETRIC DISTANCE COMPUTATION IN A LARGE-SCALE BINARY CODES Chih-Yi Chiu, Yu-Cyuan Liou, and Sheng-Hao Chou Departent of Coputer Science and Inforation Engineering, National Chiayi

More information