Non-homogeneous Generalization in Privacy Preserving Data Publishing


W. K. Wong, Nikos Mamoulis and David W. Cheung
Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong

ABSTRACT
Most previous research on privacy-preserving data publishing, based on the $k$-anonymity model, has followed the simplistic approach of homogeneously giving the same generalized value in all quasi-identifiers within a partition. We observe that the anonymization error can be reduced if we follow a non-homogeneous generalization approach for groups of size larger than $k$. Such an approach would allow tuples within a partition to take different generalized quasi-identifier values. Anonymization following this model is not trivial, as its direct application can easily violate $k$-anonymity. In addition, non-homogeneous generalization allows for additional types of attack, which should be considered in the process. We provide a methodology for verifying whether a non-homogeneous generalization violates $k$-anonymity. Then, we propose a technique that generates a non-homogeneous generalization for a partition and show that its result satisfies $k$-anonymity; however, if it is applied straightforwardly, privacy can be compromised when the attacker knows the anonymization algorithm. Based on this, we propose a randomization method that prevents this type of attack and show that $k$-anonymity is not compromised by it. Non-homogeneous generalization can be used on top of any existing partitioning approach to improve its utility. In addition, we show that a new partitioning technique tailored for non-homogeneous generalization can further improve quality. A thorough experimental evaluation demonstrates that our methodology greatly improves the utility of anonymized data in practice.

Categories and Subject Descriptors
H.2.7 [Database Management]: Security, integrity, and protection

General Terms
Algorithms

Keywords
Non-homogeneous generalization, privacy, anonymization

Supported by grant HKU 71518E from Hong Kong RGC.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SIGMOD'10, June 6-11, 2010, Indianapolis, Indiana, USA. Copyright 2010 ACM.

1. INTRODUCTION
The problem of privacy-preserving data publishing has been extensively studied since it was first introduced in [2, 21]. Consider a large table which has to be released to the public for research purposes. Privacy is typically compromised by careless publishing of the table [3], since sensitive information may be leaked. Thus, the goal of data publishing is to transform the table, such that individuals may not be linked to specific tuples with high certainty. At the same time, the published data should still be useful, so an optimization problem arises: anonymize the data such that a certain degree of privacy is preserved while data utility is maximized.

In the table to be published, apart from the keys that are suppressed before publication, there is a set of attributes called the quasi-identifier (QID). The QID of each tuple is known to the attacker and may be used to identify an individual. A typical example of QID is {ZIP code, gender, date of birth}, which can uniquely identify 63% of the population in US Census data [8]. The popular $k$-anonymity principle [2, 21] requires that the probability of an adversary being able to find out the identity of an anonymized tuple is at most $1/k$.

The most common technique for achieving $k$-anonymity is generalization [13, 14, 1, 6]. The table is divided into groups having $k$ tuples or more and the QID values in each group are generalized to a range containing all original values. Table 2 shows an exemplary 2-anonymized table using generalization. The original data are shown in Table 1.
($t^*_i$ in Table 2 is the generalized version of $t_i$ in Table 1 for easy reference.) For example, the age of $t_3$ is originally 15 and after generalization, it is replaced by the range [15-30]. Apart from microdata publication, $k$-anonymity has been largely adopted in applications like location-based services [19, 11], to protect the identity of query issuers.

A wide range of algorithms using generalization have been proposed for addressing $k$-anonymity [1, 14, 6]. They share a common framework: first partition the tuples into groups, then assign the same generalized QID to tuples in the same group. The group of tuples with the same QID is called an equivalence class. Such an approach, to which we refer as homogeneous generalization, raises an important question: does generalization have to be homogeneous?

Tuple ID   Zip code   Gender   Age   Disease
$t_1$      ...        M        30    Flu
$t_2$      ...        F        28    Cancer
$t_3$      ...        M        15    Cancer
$t_4$      ...        M        48    AIDS
$t_5$      ...        M        2     None
Table 1: Original table (QID: Zip code, Gender, Age; sensitive attribute: Disease)

Tuple ID   Zip code   Gender   Age     Disease
$t^*_1$    91***      *        15-30   Flu
$t^*_2$    91***      *        15-30   Cancer
$t^*_3$    91***      *        15-30   Cancer
$t^*_4$    923**      M        2-48    AIDS
$t^*_5$    923**      M        2-48    None
Table 2: 2-anonymity using homogeneous generalization

Tuple ID   Zip code   Gender   Age     Disease
$t^*_1$    9115*      *        28-30   Flu
$t^*_2$    91***      *        ...     Cancer
$t^*_3$    91***      M        15-30   Cancer
$t^*_4$    923**      M        2-48    AIDS
$t^*_5$    923**      M        2-48    None
Table 3: 2-anonymity using non-homogeneous generalization

For example, consider the possible publication of Table 1, as shown in Table 3. $t^*_1$, $t^*_2$ and $t^*_3$ each have a different generalized QID. This generalization is non-homogeneous. Assuming the adversary knows the QIDs of all individuals contained in Table 1, he can find out the identity of any anonymized tuple in Table 3 with probability at most $1/2$. Hence, 2-anonymity is satisfied, as is also the case for Table 2. On the other hand, if we compare the utility of the two tables, we can observe that Table 3 is better than Table 2, regardless of the utility measure used; for each tuple and QID attribute of Table 3, the generalized range is smaller than or equal to the range of the corresponding tuple and attribute in Table 2. This example shows that it is possible to achieve higher utility using non-homogeneous generalization.

The idea of non-homogeneous generalization was first introduced in [7], which studies techniques with a guarantee that an adversary cannot associate a generalized tuple to less than $k$ individuals. However, the proposed solutions do not offer bounds for the probability of each association. Hence, some individuals may have higher probability to be associated to an anonymized tuple than others and this may lead to privacy breaches.

In this paper, we systematically study the use of non-homogeneous generalization in anonymizing tables. We provide a methodology for verifying whether a non-homogeneous generalization violates $k$-anonymity.
Then, we propose a technique that generates a non-homogeneous generalization and show that its result satisfies $k$-anonymity; however, if it is applied straightforwardly, privacy can be compromised when the attacker additionally knows the anonymization algorithm. Based on this, we propose a randomization method that prevents this type of attack and show that $k$-anonymity is not compromised by it. Although non-homogeneous generalization can be used on top of any existing partitioning approach to improve its utility, we show that a new partitioning technique tailored for non-homogeneous generalization can further improve quality. Our main focus throughout the paper is $k$-anonymity; however, we also discuss how our methodology can be extended to improve utility for other privacy principles. A thorough experimental evaluation demonstrates that our methodology greatly improves the quality of anonymized data in practice.

The rest of the paper is organized as follows. The next section reviews related work and positions it against this paper. In Section 3, we formally define the $k$-anonymity problem and provide a partial ordering mechanism for comparing the utility of different anonymization results. Section 4 discusses the main challenges of non-homogeneous generalization and provides some properties that can be used to define a good generalization. Our methodology is described in Section 5. Section 6 discusses the extension of our methodology for $l$-diversity [16] and in Section 7 we experimentally evaluate it. Finally, Section 8 concludes the paper.

2. RELATED WORK
A privacy principle, $k$-anonymity, is developed in [2, 21] to guard against adversaries having the QIDs of individuals as background knowledge. The goal of $k$-anonymity is to prevent an adversary from identifying an individual with a probability higher than $1/k$. Generalization and suppression are used to protect privacy. Generalization replaces the exact QID value by a less concrete form. For example, value 15 is generalized to range [15-30].
Suppression removes some values or the entire tuple from $T$.

The most popular method studied by the community for $k$-anonymity has been homogeneous generalization. The tuples in the table are partitioned into groups called equivalence classes. The QIDs of tuples in the same equivalence class are generalized to be the same. As finding the best partitioning that achieves $k$-anonymity while maximizing utility is NP-hard [18], different fast heuristics have been developed. These can be classified into two approaches: (i) global recoding (e.g., [14, 13]): if any two tuples have the same QID value, they must take the same generalized QID; (ii) local recoding (e.g., [1, 28, 6]): two tuples having the same QID may be generalized differently. Local recoding generally gives published tables of higher utility, due to its flexibility.

Apart from $k$-anonymity, there are other principles (e.g., [16, 26, 15, 25, 27, 24, 22, 17]) that target different privacy concerns and/or different adversary assumptions. The relation to be anonymized typically contains a sensitive attribute. Even if the adversary cannot associate the tuples in the published table with individuals (i.e., $k$-anonymity is satisfied), he may associate an individual to a particular sensitive value with high probability if there are multiple occurrences of the same sensitive value in the equivalence class where the QID of the individual belongs. For example, suppose Bob is a male of age 15 ($t_3$ in Table 1). Although there are 3 possible tuples $\{t^*_1, t^*_2, t^*_3\}$ for Bob in Table 2, an attacker can derive that Bob is likely (with the high probability of $2/3$) to have cancer. The $l$-diversity principle [16, 26] aims to bound the maximum of this inference probability to be $1/l$. The $t$-closeness principle [15], on the other hand, aims to control the inference probability so that it is similar to the general distribution of the sensitive values. For example, if 9% of the population in $T$ do not smoke, the goal is to ensure that 9% of individuals in each equivalence class are non-smoking.
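The Bob example can be reproduced with a few lines of Python. This is only an illustrative sketch: it assumes the frequency-based reading of $l$-diversity stated above (the most frequent sensitive value in an equivalence class may have relative frequency at most $1/l$), and the function names are hypothetical, not taken from the cited work.

```python
from collections import Counter
from fractions import Fraction

# Sensitive values of the equivalence class {t*_1, t*_2, t*_3} in Table 2.
equivalence_class = ["Flu", "Cancer", "Cancer"]


def max_inference_probability(sensitive_values):
    """Return the most frequent sensitive value and its relative frequency."""
    counts = Counter(sensitive_values)
    value, count = counts.most_common(1)[0]
    return value, Fraction(count, len(sensitive_values))


def satisfies_l_diversity(sensitive_values, l):
    """Assumed check: the highest inference probability is at most 1/l."""
    _, probability = max_inference_probability(sensitive_values)
    return probability <= Fraction(1, l)


value, probability = max_inference_probability(equivalence_class)
print(value, probability)
```

Under this reading, the class containing Bob fails 2-diversity even though Table 2 is 2-anonymous.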
Both $l$-diversity and $t$-closeness assume the same basic adversary capability: knowing the QID of individuals. Some works assume that an adversary may obtain additional background knowledge. For instance, [24] assumes the adversary may know the algorithm used in generalization. In [22], the adversary may corrupt some individuals, obtain their sensitive values, and use them to infer the remaining sensitive values in the equivalence class. Nevertheless, solutions to all the above problems may suffer from privacy breaches; [12] has demonstrated how to breach privacy using de Finetti's theorem.

Perturbation [1, 5] is another technique that has been used to preserve privacy in data publishing. Recent works also use a hybrid approach, combining perturbation and generalization to preserve privacy [22]. In perturbation, noise is added to the original data, such that the resulting values randomly deviate from the original ones. Compared to generalization, perturbation may introduce high error, especially for aggregate queries with small ranges. In addition, noise filtering techniques may be used to breach privacy [9].

The closest piece of work related to ours is [7], where non-homogeneous generalization is introduced. The principle of global $(1, k)$-anonymity is proposed, which guarantees that an individual is not associated to less than $k$ generalized tuples. In addition, a generalization technique for global $(1, k)$-anonymity is developed. However, this work suffers from two major drawbacks. First, the principle does not ensure that an adversary associates an individual to at least $k$ tuples with even probability. As a result, an anonymized tuple may have probability $> 1/k$ to be associated with an individual; thus, global $(1, k)$-anonymity has a weaker privacy level compared to $k$-anonymity. Second, the proposed algorithm has a high complexity of O( 2.5 ) and is thus not suitable in practice.

In this paper, our goal is to develop a methodology for non-homogeneous generalization, which improves utility while maintaining an adequate level of privacy. Our study is mainly focused on the basic $k$-anonymity model, the reasons being that: (i) algorithms for $k$-anonymity are simple and many works (e.g., [16]) have adapted them for different principles; (ii) $k$-anonymity is commonly used in applications like location-based services [19, 11], where there are no additional (sensitive) attributes. Apart from the basic $k$-anonymity model, we also consider scenarios with stronger adversaries, who know the generation algorithm (Section 4.2) or the identities of some generalized tuples (corruption, Section 5.2).

3. PROBLEM DEFINITION
Consider a relational table $T$, in which there are 3 classes of attributes: (i) attributes that are keys in $T$: such attributes are removed in the published table to prevent immediate identification of individuals; (ii) attributes that are part of the quasi-identifier (QID): the QID of every individual is known to the attacker as background knowledge and can be used to link tuples in the table to individuals; (iii) attributes that are not part of a key or QID: the values of such attributes are retained in the published table.
Our goal is to generate a publishable table $T^*$ such that (i) the $k$-anonymity privacy constraint is satisfied; and (ii) utility is maximized. In Section 3.1, we describe our assumptions about the adversary and define $k$-anonymity. In Section 3.2, we describe how the utility of different anonymized tables can be compared.

3.1 Adversary assumption and $k$-anonymity
We assume an adversary may obtain the value of QID and the identification of any tuple in $T$ by sources other than $T^*$ (e.g., a public voters table). Let $H$ be the adversary's knowledge containing the QID and identity of all known individuals. In the worst case, the adversary may have access to the QID of every individual; thus, by joining $H$ and $T^*$ on QID, tuples $t^*$ in $T^*$ may be linked to individuals. $k$-anonymity aims at preventing the adversary from finding an individual's identity with a probability higher than $1/k$.

DEFINITION 1. (Identity notion) Given two tables $H$, $T^*$, if a tuple $t_i \in H$ and a tuple $t^*_j \in T^*$ belong to the same individual, we say they have the same identity, denoted as $t_i =_I t^*_j$.

DEFINITION 2. ($k$-anonymity) Given a table $T$, assume that $H$ is the projection of $T$ on key and QID attributes. We say $k$-anonymity is preserved in an anonymized table $T^*$ if $\forall t_i \in H, \forall t^*_j \in T^*$, $\Pr(t_i =_I t^*_j) \le 1/k$.

3.2 Measuring and comparing utility
Measuring the utility of an anonymized table is usually done by means of an objective information loss measure that compares $T^*$ with $T$. Popular measures include the discernibility metric [4], which sums the squares of the equivalence class cardinalities, the normalized certainty penalty (NCP) [28], which is defined by the sum of QID attribute ranges in each equivalence class, and the global certainty penalty (GCP) [6], which is a normalized version of NCP. In Section 5.3, we provide a definition of GCP, which we use in this paper, as it affects the functionality of our data partitioning algorithm. In general, the utility of the anonymized data may not be easily captured by specific measures, as it depends on the application of the published data.
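As a small illustration of the simplest of these measures, the sketch below computes the discernibility metric exactly as described above (sum of squared equivalence class cardinalities) for Table 2, together with the usual class-size test for $k$-anonymity. The helper names are hypothetical, and the class-size test is an assumption that is only meaningful for homogeneous generalization, where every tuple of a group shares one generalized QID.

```python
from collections import Counter

# Generalized QIDs of Table 2 (zip code, gender, age range), one per tuple.
table2_qids = [
    ("91***", "*", "15-30"),
    ("91***", "*", "15-30"),
    ("91***", "*", "15-30"),
    ("923**", "M", "2-48"),
    ("923**", "M", "2-48"),
]


def equivalence_class_sizes(qids):
    """Group tuples by identical generalized QID and count each group."""
    return Counter(qids)


def is_k_anonymous_homogeneous(qids, k):
    """Homogeneous case only: every equivalence class has at least k tuples."""
    return min(equivalence_class_sizes(qids).values()) >= k


def discernibility(qids):
    """Sum of the squares of the equivalence class cardinalities."""
    return sum(size * size for size in equivalence_class_sizes(qids).values())


print(is_k_anonymous_homogeneous(table2_qids, 2), discernibility(table2_qids))
```

The two classes of sizes 3 and 2 give a penalty of 9 + 4 = 13; smaller values indicate finer-grained groups.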
Our purpose is not to limit our study to a particular utility metric but to develop a new methodology, which generally improves the utility of existing methods that apply homogeneous generalization. As generalization converts precise data to uncertain data, a method that restricts the uncertainty of each tuple compared to the result of another method is certainly better. Definition 3 formally states when one anonymized table $T^*_a$ is strictly better than another $T^*_b$ in terms of utility (denoted by $T^*_a \succ T^*_b$); we can use it to define a partial order for anonymization results. In this paper, we aim at finding a local-optimal solution $T^*$, i.e., $\forall T^*_i$ such that $T^*_i \succ T^*$, $T^*_i$ violates $k$-anonymity.

DEFINITION 3. (utility-based ordering) Consider two anonymized tuples $t^*_1$, $t^*_2$. We say that $t^*_1$ preserves a better utility than $t^*_2$, denoted by $t^*_1 \succ t^*_2$, if for all attributes $i$ in the QID, $t^*_1[i] \subseteq t^*_2[i]$, and there is at least one attribute $j$ in the QID for which $t^*_1[j] \subset t^*_2[j]$, e.g., $\langle$9115*, *, 28-30$\rangle$ ($t^*_1$ in Table 3) $\succ$ $\langle$91***, *, 15-30$\rangle$ ($t^*_1$ in Table 2). Consider two anonymized tables $T^*_a$, $T^*_b$, both with $n$ tuples, such that tuples $T^*_a[i]$ and $T^*_b[i]$ originate from $T[i]$, for all $i \in [1, n]$. We say $T^*_a$ preserves a better utility than $T^*_b$, denoted by $T^*_a \succ T^*_b$, if $\forall i \in [1, n]$, $T^*_a[i] \succ T^*_b[i]$ or $T^*_a[i] = T^*_b[i]$, and $\exists j \in [1, n]$ such that $T^*_a[j] \succ T^*_b[j]$.

As discussed earlier, $k$-anonymity can be achieved by first partitioning the data into groups and then uniformly transforming the QIDs of all records in the same group to take the same generalized value. Homogeneous generalization may not produce results of the highest possible quality. In the next section, we discuss the challenges of a non-homogeneous generalization technique that could be applied alternatively.

4. CHALLENGES IN NON-HOMOGENEOUS GENERALIZATION
Assume that the background knowledge of the adversary is a table $H$ containing the QID of every individual.
Given a published table $T^*$, the adversary performs a linking attack by joining $T^*$ with $H$; for each tuple $t$ in $H$, the adversary finds all tuples $t^*$ in $T^*$ such that $t[QID]$ is included in the generalized $t^*[QID]$. For example, if the QID of $t \in T$ is 10 and there is a tuple $t^* \in T^*$ with $t^*[QID] = [5-20]$, then $t^*$ is a possible generalization of $t$. We call the pair $\langle t, t^* \rangle$ a match. A valid assignment is a maximal 1-to-1 assignment between tuples of $H$ and $T^*$.

In this paper, we identify two challenges to non-homogeneous generalization. We will discuss the issue of ineffective matches when joining $H$ and $T^*$ in Section 4.1 and how these can be identified and eliminated. In Section 4.2, we will discuss a privacy threat for the case where the adversary knows the algorithm that is used to generate the anonymized table.

4.1 Pruning of ineffective matches
Intuitively, $k$-anonymity can be satisfied if an adversary finds that there are at least $k$ matches related to the same tuple in $H$/$T^*$. This is the case for homogeneous generalization; given a generalized QID, any of the $k$ or more tuples from the original table $T$ that were grouped together and match that QID have the same probability (at most $1/k$) to match any tuple in $T^*$ with that generalized QID. However, the same does not apply in the non-homogeneous case.

(a) original table
Tuple ID   QID
$t_1$      1
$t_2$      2
$t_3$      3
$t_4$      4
$t_5$      5

(b) anonymized table
Tuple ID   QID
$t^*_1$    ...
$t^*_2$    ...
$t^*_3$    ...
$t^*_4$    ...
$t^*_5$    ...
Table 4: Non-homogeneous generalization

Consider the original table $T$ shown in Table 4a and an anonymized table $T^*$ using non-homogeneous generalization, as shown in Table 4b. For every $t_i$ in $T$, there are at least two matches in $T^*$ and vice versa. However, 2-anonymity is not satisfied. Both $t_1$ and $t_5$ in $T$ match $t^*_1$ and $t^*_5$ in $T^*$. $t_2$ matches $t^*_1$, $t^*_2$ and $t^*_5$ in $T^*$. Since $t^*_1$ and $t^*_5$ must be either $t_1$ or $t_5$, $t_2$ can only be matched to $t^*_2$ in a valid assignment, violating 2-anonymity. We say that the match of $t_2$ to $t^*_1$ (and $t^*_5$) is ineffective if an adversary can eliminate such a possibility.

DEFINITION 4. (Match and assignment) Given a table $T$ and its anonymized table $T^*$, a match $m$ is a 2-tuple $\langle t_i, t^*_j \rangle$ where $t_i \in T$, $t^*_j \in T^*$ and the QID of $t_i$ is included in that of $t^*_j$. An assignment $a$ is a set of matches $m_i = \langle t_{x_i}, t^*_{y_i} \rangle$, where $t_{x_i} \in T$, $t^*_{y_i} \in T^*$, $|a| = |T|$, and for each pair of matches $m_i \in a$, $m_j \in a$, $x_i \ne x_j$ and $y_i \ne y_j$.

DEFINITION 5. (Effective match) Given a table $T$ and its anonymized table $T^*$, let $H$ be the projection of $T$ on key and QID attributes. Given two tuples $t_i \in H$, $t^*_j \in T^*$, a match $m = \langle t_i, t^*_j \rangle$ is said to be effective if and only if there exists an assignment $a$ such that $m \in a$.

An assignment represents the scenario in which an adversary gives a unique identity to every anonymized tuple in $T^*$ and vice versa. If a match cannot be found in any of the assignments, then the adversary can safely discard this match. If all ineffective matches are removed and there are less than $k$ matches left for a tuple in $H$, $k$-anonymity is violated. In the following, we first discuss how to determine if a match is effective (Section 4.1.1).
Then, we present the property that the generalized table should satisfy in order for all matches to be effective (Section 4.1.2).

4.1.1 Necessary condition for effective match
In order to satisfy $k$-anonymity, we must have at least $k$ effective matches for each tuple in $T$. In order to determine if a match is effective, we use an assignment graph, which visualizes the matches.

DEFINITION 6. (Assignment graph) Consider a table $T$ and its anonymized table $T^*$. Assume that the tuples in both tables are ordered, such that the assignment $a = \{\langle t_i, t^*_i \rangle\}$ is valid, for all $t_i \in T$, $t^*_i \in T^*$. An assignment graph $G = (V, E)$ is a directed graph with $|T|$ vertices. For $i = 1$ to $|T|$, $n_i \in V$ represents $t_i \in T$ and $t^*_i \in T^*$. An edge $n_i \to n_j$ is present if and only if the QID of $t_i$ is included in that of $t^*_j$.

Figure 1 shows the assignment graph constructed for Table 4b. Each edge in the assignment graph represents a match that can be found by joining $T$ and $T^*$. For example, the edge from $n_3$ to $n_4$ means that $t_3$ in $T$ joins to $t^*_4$ in $T^*$. Next we show how to verify the effectiveness of a match in the assignment graph.

[Figure 1: Assignment graph of Table 4b]

THEOREM 1. Consider a table $T$, its anonymized table $T^*$, and the corresponding assignment graph $G = (V, E)$. The match $\langle t_i, t^*_j \rangle$ (corresponding to edge $(n_i, n_j) \in E$) is effective if and only if $n_i$ is reachable from $n_j$.

PROOF. If part. Note that if a match is not effective, we cannot find an assignment containing the match. So, we can prove the statement by showing how to construct an assignment that contains the target match $\langle t_i, t^*_j \rangle$. Without loss of generality, we assume $j > i$ and a path from $n_j$ to $n_i$ is $\{n_j, n_{j-1}, \ldots, n_{i+1}, n_i\}$. Note that a possible assignment is $a = \{\langle t_i, t^*_i \rangle\}$. We construct an assignment $a'$ as follows. For $h > j$ and $h < i$, the match $\langle t_h, t^*_h \rangle$ is added to $a'$. For $h = j$ down to $i + 1$, since there is an edge from $n_h$ to $n_{h-1}$, we can add $\langle t_h, t^*_{h-1} \rangle$ to $a'$. Finally, we include our target match $\langle t_i, t^*_j \rangle$ in $a'$. Hence, an assignment containing the target match is produced. Only if part.
Given an effective match $\langle t_i, t^*_j \rangle$, there is an assignment $a'$ that contains this match. Each match in $a'$ is an edge in $G$, thus $a'$ is a subset of $E$. We now show that we can find a path from $n_j$ to $n_i$ using the matches in $a'$. Consider a subgraph $G' = (V, a')$ that only contains matches in $a'$. Each node in $G'$ has exactly one outgoing edge and one incoming edge. Hence, $G'$ must be composed of cycles. The match $\langle t_i, t^*_j \rangle$ is represented by the edge $(n_i, n_j)$, which lies on a cycle as well. So, $n_i$ is reachable from $n_j$ by traveling through the cycle.

For example, in Figure 1, $n_2$ is not reachable from $n_1$. This means the match $\langle t_2, t^*_1 \rangle$ is not effective and cannot appear in a valid assignment. Thus, all ineffective matches can be identified and removed from the assignment graph by the adversary. This results in a reduced assignment graph. Figure 2 shows the graph derived from the initial one shown in Figure 1, containing only the effective matches. This gives a clearer picture of why Table 4b does not satisfy 2-anonymity, as $t_2$ can only be mapped to $t^*_2$ in a valid assignment.

[Figure 2: Effective matches in the graph of Table 4b]
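The reachability test of Theorem 1 can be checked against a brute-force enumeration of all valid assignments. The sketch below does this in Python for the QIDs of Table 4a. The generalized ranges of Table 4b are not legible in this copy, so the ranges in `anonymized` are hypothetical values chosen only to agree with the matches described in the text ($t_1$ and $t_5$ match only $t^*_1$ and $t^*_5$; $t_2$ matches $t^*_1$, $t^*_2$, $t^*_5$; $t_3$ matches $t^*_4$). Function names are illustrative and indices are 0-based.

```python
from itertools import permutations

# Original QIDs of Table 4a (a single numeric attribute).
original = [1, 2, 3, 4, 5]
# Hypothetical ranges standing in for Table 4b (assumption, see lead-in).
anonymized = [(1, 5), (2, 3), (3, 4), (3, 4), (1, 5)]


def build_edges(original, anonymized):
    """Edge (i, j) exists iff the QID of t_i lies inside the range of t*_j."""
    return {(i, j)
            for i, qid in enumerate(original)
            for j, (low, high) in enumerate(anonymized)
            if low <= qid <= high}


def reachable_from(start, edges, n):
    """Nodes reachable from `start` along directed edges (start included)."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in range(n):
            if (u, v) in edges and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def effective_by_reachability(edges, n):
    """Theorem 1: match (i, j) is effective iff n_i is reachable from n_j."""
    reach = {j: reachable_from(j, edges, n) for j in range(n)}
    return {(i, j) for (i, j) in edges if i in reach[j]}


def effective_by_enumeration(edges, n):
    """Definition 5: a match is effective iff some full assignment uses it."""
    found = set()
    for perm in permutations(range(n)):
        matches = {(i, perm[i]) for i in range(n)}
        if matches <= edges:
            found |= matches
    return found


edges = build_edges(original, anonymized)
effective = effective_by_reachability(edges, len(original))
per_tuple = [sum(1 for (i, _) in effective if i == t) for t in range(len(original))]
print(per_tuple)
```

With these assumed ranges, $t_2$ (index 1) keeps a single effective match, so 2-anonymity fails even though every tuple initially had at least two matches.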

4.1.2 Impact of effective match on generalization
From Theorem 1, we know that the effectiveness of a match can be determined by looking at the connectivity of nodes in a graph. In fact, if we keep only effective matches, the graph will degenerate to the set of its strongly connected components.

THEOREM 2. Consider a table $T$, its anonymized table $T^*$, and the corresponding assignment graph $G = (V, E)$. If all matches are effective, $G$ is a set of strongly connected components, such that there are no edges between any two components.

PROOF. A graph can always be decomposed into a number of strongly connected components. We prove the theorem by showing that each component in $G$ is independent, i.e., there is no edge between any two components. We prove the statement by contradiction. Without loss of generality, we assume $C_1$ and $C_2$ are two components in $G$ and there is a path from $u$ to $v$ where $u \in C_1$ and $v \in C_2$. Thus any node $x \in C_1$ can reach any node $y \in C_2$ via the path from $u$ to $v$. Since there are effective matches only, there must be a path from $v$ to $u$ due to Theorem 1. Hence, from every node $y \in C_2$, we can reach any node $x \in C_1$. $C_1$ and $C_2$ are then a single strongly connected component, which contradicts the assumption.

Theorem 2 leads to an interesting observation: tuples are partitioned into strongly connected components in a non-homogeneous generalization. Note that the complexity of finding the strongly connected components is linear in the number of edges in $G$, due to Tarjan's algorithm [23].

4.2 Randomization in generalization
With non-homogeneous generalization, the generalized QIDs of tuples in a partition (i.e., equivalence class) may vary. For example, in Table 3, $t^*_1$, $t^*_2$ and $t^*_3$ each have a unique generalized QID. This offers additional information to an adversary in his quest for the identities of the anonymized tuples. If we use a deterministic non-homogeneous generalization approach, the generalized value of each tuple in the table would be the same for every possible run of such a method.
Therefore, if the adversary knows the generalization algorithm, he can apply it on $H$, compare the result with the anonymized table, and infer the original QID of the anonymized tuples and therefore their identities.

Due to this problem, randomization is necessary when anonymizing a table with non-homogeneous generalization. A good randomization technique implies that when an adversary finds $k$ tuples in $H$ joined with an anonymized tuple $t^*_i$ in $T^*$, the probability of each of these tuples being the real identity of $t^*_i$ is the same ($= 1/k$). We can achieve this goal by first computing the generalized QIDs of tuples deterministically and then assigning each generalized QID to a tuple in $T$ in a randomized way.

Figure 3 is an example illustrating this process for 2-anonymity. First, we generate for each original tuple $t_i$ in $T$ a generalized QID $t^*_i$, which contains $t_i$ and $k - 1$ additional ones from $T$. The QID generation function, denoted by gen, takes as input a set of tuples and gives a generalized QID range. Next, for each generalized QID, we assign a random identity to it with a probability of $1/k$. In the example, we have picked $t_2 =_I t^*_1$, $t_3 =_I t^*_2$, and $t_1 =_I t^*_3$. The other attributes are copied to the anonymized table accordingly.

[Figure 3: The generalization process in achieving 2-anonymity. Table $T$ holds $t_1$ = <1, 1, 2>, $t_2$ = <1, 2, 1>, $t_3$ = <2, 1, 1>, with other attributes a, b, c. QID generation yields gen({$t_1$, $t_2$}) = <1, 1-2, 1-2>, gen({$t_2$, $t_3$}) = <1-2, 1-2, 1> and gen({$t_1$, $t_3$}) = <1-2, 1, 1-2>, which become the QIDs of $t^*_1$, $t^*_2$, $t^*_3$ in the anonymized table $T^*$. A random assignment among the possible matches then decides which original tuple supplies the other attributes of each anonymized tuple.]

Thus, the generalization procedure is divided into two steps: (i) generalized QID generation; (ii) random assignment generation. The QID generation determines whether $k$-anonymity can be achieved and affects the possible assignments that we can choose from in the randomization. For example, if the QID generation for Table 4a is done as shown in Table 4b, it is not possible to achieve $k$-anonymity, as we have shown in Section 4.1. In the following subsection, we will discuss how we can guarantee $k$-anonymity in the QID generation process.

4.2.1 A sufficient condition for $k$-anonymity
In order to preserve $k$-anonymity, we have already shown that a necessary condition is to have at least $k$ effective matches for each tuple. However, this condition alone cannot guarantee $k$-anonymity. Consider the assignment graph shown in Figure 4. For simplicity, self-loops are omitted and reciprocal edges connecting the same pair of nodes are merged into a single bidirectional edge. $K_{3a}$ and $K_{3b}$ are complete graphs of 3 nodes. Note that every edge in the graph represents an effective match and every node has at least three incoming and outgoing edges. However, 3-anonymity cannot be achieved by randomization on top of this assignment graph. For an edge $n_1 \to n_i$ where $i \ne 1$, the path from $n_i$ to $n_1$ must go through edge $n_5 \to n_6$. So, if $t_1 \ne_I t^*_1$, we know that $t_5 =_I t^*_6$. From this, we can draw the conclusion that either $t_1 =_I t^*_1$ or $t_5 =_I t^*_6$ is true (a probability of $1/2$ to breach privacy).

[Figure 4: An example of an assignment graph that has 3 effective matches for each tuple but violates 3-anonymity (self-loops are removed for simplicity)]

In the above example, each of the possible assignments contains either the match $\langle t_1, t^*_1 \rangle$ or $\langle t_5, t^*_6 \rangle$. Thus, we can find at most 2 assignments with no overlapping matches.
In fact, if there are $k = 3$ assignments with no overlapping matches, we can achieve $k$-anonymity. First, we define the concept of match-different condition.

DEFINITION 7. (match-different assignments) Given two assignments $a_i$, $a_j$, $a_i$ is match-different to $a_j$ if $a_i \cap a_j = \emptyset$.

Having $k$ match-different assignments, we can randomly pick one of them as the resulting assignment of the randomization process. For each anonymized tuple $t^*_j$, there are $k$ different possible identities in total. Each identity of $t^*_j$ is in a different match-different assignment. Since each assignment has the same chance to be picked, $t^*_j$ is assigned to a particular tuple in $T$ with the same $1/k$ chance; hence, $k$-anonymity can be achieved. To ensure that there are $k$ match-different assignments in the set of generalized QIDs, we prove that it is sufficient that in the assignment graph with only effective matches each node has the same number ($k$) of incoming edges and outgoing edges.

LEMMA 1. Consider an assignment graph with only effective matches, where each node has $k$ outgoing edges and $k$ incoming edges. Given $r$ match-different assignments $a_1, a_2, \ldots, a_r$, where $1 < r < k$, we can always find an assignment $a_{r+1}$ such that $a_{r+1}$ is match-different to $a_i$ for $i = 1$ to $r$.

PROOF. See Appendix A.

5. ANONYMIZATION USING NON-HOMOGENEOUS GENERALIZATION
In this section, we discuss how we can generate a $k$-anonymized table using non-homogeneous generalization, building on the observations from the previous section. Although we can apply non-homogeneous generalization directly on $T$, due to the large scale of the data, the high cost of the necessary randomization, and the natural partitions that possibly exist in the data, we first partition the data into groups of $k$ or more tuples and then apply non-homogeneous generalization to each group. In a nutshell, we follow this framework:
1. Divide the tuples into partitions
2. Generalize the QID of each tuple in each partition
3. Assign generalized QIDs to tuples, based on a random assignment

We first discuss how we generalize the QID of each tuple (step 2) in Section 5.1. Then, we explain our randomization technique (step 3) in Section 5.2.
Finally, we outline our partitioning method (step 1) in Section 5.3.

5.1 Ring generalization
Assuming that the data are partitioned, non-homogeneous generalization should be applied to each partition. In fact, we need to determine, for each tuple, which $k - 1$ other tuples will be included in the generalization. Then, we can extend the QID of a tuple to include the QIDs of the other tuples. Let gen be a QID generalization function that takes as input a set of tuples and returns a generalized range on QID. For example, considering Table 4a, gen($\{t_2, t_4, t_5\}$) = [2-5]. Let $PS(t_i)$ be the set of tuples in $T$ that is used to produce a generalized QID for $t_i$. In the above example, $PS(t_2) = \{t_2, t_4, t_5\}$. Based on Lemma 1, we should have $|PS(t_i)| = |PS(t_j)|$ for all $i, j$. In order to minimize information loss in the generalized QIDs and satisfy $k$-anonymity, we design a generalization with $|PS(t_i)| = k$ for all $t_i$.

Consider the set of tuples in a partition $P$ and assume that the tuples are ordered as $t_1, t_2, \ldots, t_{|P|}$. An easy way to construct $PS(t_i)$ is to assign the generalization of $k$ consecutive tuples, gen($t_i, t_{i+1}, \ldots, t_{i+k-1}$), to $t_i$. (Note that if $i + j > |P|$, we use $i + j - |P|$ instead.) We call this ring generalization, as the assignment graph resulting from it looks like a ring. Figure 5 illustrates the ring generalization for a partition with 5 tuples and $k = 3$. The upper-left graph in Figure 6 is the corresponding assignment graph.

Tuple in $T$   $PS(t_i)$             Tuple in $T^*$
$t_1$          $t_1$, $t_2$, $t_3$   $t^*_1$ = gen($t_1$, $t_2$, $t_3$)
$t_2$          $t_2$, $t_3$, $t_4$   $t^*_2$ = gen($t_2$, $t_3$, $t_4$)
$t_3$          $t_3$, $t_4$, $t_5$   $t^*_3$ = gen($t_3$, $t_4$, $t_5$)
$t_4$          $t_4$, $t_5$, $t_1$   $t^*_4$ = gen($t_1$, $t_4$, $t_5$)
$t_5$          $t_5$, $t_1$, $t_2$   $t^*_5$ = gen($t_1$, $t_2$, $t_5$)
Figure 5: Ring generalization for a partition with 5 tuples and $k = 3$

[Figure 6: Generating three random assignments. Panels: ring generalization, first assignment, second assignment, third assignment.]

Note that every match in the ring generalization is effective, as it is part of a cycle.
In addition, since each node in the assignment graph produced by ring generalization has k incoming edges and k outgoing edges, we can find k match-different assignments, and k-anonymity can be assured by randomly picking one of them as the actual assignment (Lemma 1). The ring generalization is a locally optimal solution, because we cannot remove any more edges from the graph. In addition, we can easily show that it gives equal or better utility compared to any homogeneous generalization. If the partition size is k, the ring generalization degenerates to a homogeneous generalization, where all original tuples in the group match with all generalized tuples. If the partition size $|P|$ is greater than k, in the ring generalization every tuple will match a set S of k tuples, as opposed to $|P|$ in the homogeneous case. Since $S \subseteq P$ and all utility metrics are monotonic with respect to subset relationships, only better utility can be achieved by the ring generalization.

LEMMA 2. Ring generalization gives a k-anonymized partition of equal or better utility than that given by a homogeneous generalization on the same partition.

An additional benefit of this generalization is that, given a proper ordering of the tuples (e.g., using Hilbert curves), the values in each generalized QID will be close to each other with high probability, maximizing the utility gain compared to a homogeneous generalization.
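The construction of $PS(t_i)$ described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `ring_generalization`, the zero-based indexing, and the sample ages are hypothetical, and each generalized QID is stored as a set of values per attribute rather than as a range.

```python
from typing import FrozenSet, List, Sequence, Tuple


def ring_generalization(
    tuples: Sequence[Tuple], k: int
) -> Tuple[List[List[int]], List[Tuple[FrozenSet, ...]]]:
    """Return (ps, gen) for one partition whose tuples are already ordered.

    ps[i]  : indices of the k consecutive tuples (wrapping around) used for tuple i
    gen[i] : generalized QID of tuple i, one set of values per attribute
    """
    n = len(tuples)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the partition size")
    # Tuple i is generalized with tuples i, i+1, ..., i+k-1 (indices modulo n).
    ps = [[(i + j) % n for j in range(k)] for i in range(n)]
    d = len(tuples[0])
    gen = [
        tuple(frozenset(tuples[m][a] for m in members) for a in range(d))
        for members in ps
    ]
    return ps, gen


# Hypothetical partition with a single QID attribute (age), 5 tuples, k = 3.
partition = [(20,), (25,), (30,), (35,), (40,)]
ps, gen = ring_generalization(partition, 3)
print(ps[3])   # indices of t_4, t_5, t_1 in zero-based form
print(gen[0])  # value set covering t_1, t_2, t_3
```

With k equal to the partition size, every `gen[i]` is the same set, which matches the degenerate homogeneous case noted in the text.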

5.2 Randomization

In this section, we describe how we randomly assign each generalized QID to a tuple in T (and replace the original QID of the tuple by the generalized one). An intuitive idea is to generate all possible assignments and pick one uniformly at random. Unfortunately, such an approach may violate k-anonymity. For example, in a partition with 5 tuples, the generalized QIDs using ring generalization for 3-anonymity have 13 different possible assignments. Note that 13 is not divisible by 3, meaning that some matches are contained in more assignments than other matches. This allows the adversary to infer these matches with probability higher than 1/k. In the example of Figure 5, matches $\langle t_i, t'_{i-1} \rangle$ have higher probability (5/13) than matches $\langle t_i, t'_i \rangle$ and $\langle t_i, t'_{i-2} \rangle$ (with probability 4/13).

As discussed in Section 4.2.1, the solution is to define k match-different assignments and randomly select one of them. One easy construction of k match-different assignments is to set $a_j = \{\langle t_i, t'_{i-j} \rangle\}$ for $j = 0$ to $k - 1$. (Note that if $i - j < 1$, we use $i - j + |P|$ instead.) Consider the example in Figure 5, and assume that we pick each column of $PS(t_i)$ as an assignment. If one of these assignments is chosen at random as the real assignment, each column is chosen with probability 1/k. Thus the probability $\Pr(t_i =_I t'_{i-j})$ is 1/k and k-anonymity is preserved.

However, if we apply this approach, privacy can easily be compromised when the adversary knows the identity of one anonymized tuple as background knowledge. In practice, such background knowledge can easily be acquired. For example, using the generalization of Figure 5, if an adversary knows that $t_3 =_I t'_2$, he knows that the second column is the real assignment. Hence, he can find out the identities of all anonymized tuples, e.g., $t_2 =_I t'_1$. This type of attack is called corruption and has been studied in [22].

In order to increase the resistance to corruption, the k match-different assignments are defined randomly. The generation process follows a framework similar to the proof of Lemma 1.
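The count of 13 assignments and the 5/13 versus 4/13 match probabilities quoted earlier in this section can be checked by brute force. The sketch below enumerates all permutations for the 5-tuple, k = 3 ring; the function name and the zero-based "offset" bookkeeping (original index minus anonymized index, modulo n) are assumptions made for this illustration.

```python
from collections import Counter
from itertools import permutations


def ring_assignment_stats(n: int, k: int):
    """Enumerate all valid assignments of a ring generalization.

    Anonymized tuple a covers original tuples a, a+1, ..., a+k-1 (mod n).
    Returns (total, per_match), where per_match[off] is the number of
    assignments that contain one fixed match with the given offset.
    """
    total = 0
    by_offset = Counter()
    for perm in permutations(range(n)):
        # perm[a] is the original tuple assigned to anonymized tuple a.
        offsets = [(perm[a] - a) % n for a in range(n)]
        if all(off < k for off in offsets):
            total += 1
            by_offset.update(offsets)
    # Each offset class contains n matches; by rotational symmetry of the
    # ring, every match of a class appears in the same number of assignments.
    per_match = {off: cnt // n for off, cnt in by_offset.items()}
    return total, per_match


total, per_match = ring_assignment_stats(5, 3)
print(total)      # number of valid assignments
print(per_match)  # assignments containing a given match, by offset
```

Offset 1 corresponds to the matches $\langle t_i, t'_{i-1} \rangle$, so picking uniformly among all valid assignments would expose those matches with probability above 1/3.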
The pseudocode of an algorithm that generates a random assignment is shown in Figure 7. In the following, we briefly describe the basic idea of the algorithm. The algorithm is run k times to generate the k assignments. At each run, it operates on the set of matches M that are not present in assignments generated in previous runs. It tries to find, by random walks, a set of cycles in the graph that cover all nodes, and uses them to define an assignment. The cycles are found incrementally, starting from an unassigned node. After a cycle has been found, all its nodes are marked as processed and the search for a new cycle starts. When all nodes are processed, the assignment is committed and returned. Cycles are not directly committed to the assignment once found, because some of them, when removed, may result in a graph where no cycle exists. Thus, while finding a new cycle, matches set by previous cycles may change. In addition, we limit each node to be visited at most once in each random walk (by remembering the nodes traveled in U). The algorithm backtracks when it reaches a dead end. Lemma 1 guarantees correctness and termination.

Figure 6 exemplifies three runs of the algorithm on the ring generalization of Figure 5, shown at the upper left of Figure 6. The solid edges show the matches that are chosen for the current assignment. For example, the first assignment, containing cycles $n_1 \to n_4 \to n_2 \to n_1$, $n_3 \to n_3$, and $n_5 \to n_5$, will assign $t_1$ to $t'_4$, $t_4$ to $t'_2$, $t_2$ to $t'_1$, $t_3$ to $t'_3$, and $t_5$ to $t'_5$. After the first run, the corresponding edges are removed and the algorithm is run again to generate the second assignment.

Input: A partition of tuples P; a set of anonymized tuples Q; a set of possible matches M (excluding matches already used in other assignments).
Output: An assignment $a \subseteq M$.
1. $a = \{\langle t_i, t'_i \rangle\}$ // initial assignment (possibly invalid)
2. L = P // L represents the set of unprocessed nodes
3. While ($L \neq \emptyset$)
4.   // pick an edge to start a loop
5.   Pick $t_i \in L$ at random
6.   Pick $t'_j \in Q$ randomly such that $\langle t_i, t'_j \rangle \in M$
7.   $U = \{t'_j\}$ // U remembers the nodes traveled
8.   While ($t'_i \notin U$) // find a cycle by random walk
9.     // $t_i$ is assigned to $t'_j$, so the one that is assigned
10.    // to $t'_j$ before has to find another pair
11.    Select $t_x$ where $\langle t_x, t'_j \rangle \in a$
12.    Pick $t'_y \notin U$ randomly where $\langle t_x, t'_y \rangle \in M$
13.    if such $t'_y$ does not exist
14.      $t'_j$ = $t'_j$'s parent // backtracking
15.    else
16.      Add $t'_y$ to U and set $t'_j$ as $t'_y$'s parent
17.  End while
18.  // a loop is found
19.  update a and remove nodes in the loop from L
20. End while
21. return a

Figure 7: Algorithm for generating a random assignment

Regarding corruption, ring generalization gives a (k − 1)-vertex-connected assignment graph, i.e., the graph is still connected after any k − 1 nodes in the graph are removed. Assume an adversary obtains background knowledge about the identity of a set of anonymized tuples Q, belonging to the same partition. The adversary can remove the corresponding nodes from the assignment graph. If $|Q| < k - 1$, the assignment graph is still connected, i.e., all matches are still effective. There are at least $k - |Q|$ outgoing edges and $k - |Q|$ incoming edges for each remaining node. So, there are at least $k - |Q|$ possible identities for each anonymized tuple. Due to randomization, an adversary cannot find out the actual identity of an anonymized tuple. Therefore, non-homogeneous generalization with randomization offers protection against corruption of k-anonymity similar to that of homogeneous generalization.

5.2.1 Cost analysis and optimizations

The cost of randomization for a partition P is dominated by generating the k match-different assignments. When generating a new assignment, we maintain a list of unprocessed nodes L (line 2). Then, we find a cycle in the assignment graph which starts with a node in L by a depth-first random walk. Note that each node can be visited at most once. The complexity of this step is $O(|V| + |E|)$, where $|V|$ is the number of nodes and $|E|$ is the number of edges in the graph. In the assignment graph, $|V| = |P|$ and $|E| = k|P|$. Hence, it takes $O(k|P|)$ to generate a random cycle. Note that each new cycle contains at least one node in L.
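As a rough illustration of the generation step whose cost is analyzed here, the sketch below produces k match-different assignments for a ring generalization and then picks one at random. It is not the random-walk procedure of Figure 7: it swaps in a standard augmenting-path bipartite matching (Kuhn's algorithm) with shuffled candidate order, and it makes no claim about the distribution over assignments that Figure 7 induces. All function names are hypothetical. The loop relies on the fact that a bipartite graph in which every node has the same degree always has a perfect matching, and that removing one leaves every node with degree one lower.

```python
import random
from typing import Dict, List, Optional, Set


def random_perfect_matching(adj: List[Set[int]], rng: random.Random) -> Optional[List[int]]:
    """adj[a] holds the original tuples that anonymized tuple a may still match.

    Returns a list m with m[a] = original tuple matched to a, or None.
    """
    owner: Dict[int, int] = {}  # original tuple -> anonymized tuple holding it

    def try_assign(a: int, seen: Set[int]) -> bool:
        candidates = list(adj[a])
        rng.shuffle(candidates)  # randomize which match is tried first
        for o in candidates:
            if o in seen:
                continue
            seen.add(o)
            # Take o if it is free, or if its current holder can move elsewhere.
            if o not in owner or try_assign(owner[o], seen):
                owner[o] = a
                return True
        return False

    order = list(range(len(adj)))
    rng.shuffle(order)
    for a in order:
        if not try_assign(a, set()):
            return None
    result = [-1] * len(adj)
    for o, a in owner.items():
        result[a] = o
    return result


def match_different_assignments(n: int, k: int, seed: Optional[int] = None) -> List[List[int]]:
    """Generate k pairwise match-different assignments for an n-tuple ring."""
    rng = random.Random(seed)
    adj = [{(a + j) % n for j in range(k)} for a in range(n)]
    assignments = []
    for _ in range(k):
        m = random_perfect_matching(adj, rng)
        if m is None:
            raise RuntimeError("no perfect matching found")
        assignments.append(m)
        for a, o in enumerate(m):
            adj[a].discard(o)  # used matches are unavailable to later runs
    return assignments


assignments = match_different_assignments(5, 3, seed=7)
real = random.Random(11).choice(assignments)  # the published assignment
```

Because the k assignments together use every match exactly once, choosing one of them uniformly gives each match probability 1/k.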
In the worst case, we need $|P|$ iterations to assign every node in L. So, the overall cost to generate a match-different assignment is $O(k|P|^2)$. Since $|P| \geq k$, randomization becomes expensive for large values of k, or when the partitions are very large compared to k. Therefore, a good partitioning strategy should avoid generating huge groups. We now describe two simple optimizations to reduce this cost in practice. In Section 7, we experimentally evaluate the cost of the optimized randomization algorithm and show that it is acceptable in practical cases, as it depends only on the size of the partitions and not on the database size.

Reducing the number of generated match-different assignments. In the randomization process, we first generate k match-different assignments, and then choose one of them randomly. Let $a_1, a_2, \ldots, a_k$ be the generated assignments in order. Since the choice of assignment is independent of the generation process, we can first determine which $a_i$ of the k assignments will be picked as the real assignment and then generate only up to the i-th assignment. On average, this halves the randomization cost. Note that the assignments before the i-th must still be generated, as they determine which edges remain at the time of the generation of the i-th assignment. Generating and always picking the first assignment would result in the selection of some matches with higher probability and is not acceptable, as discussed in the beginning of Section 5.2.

Using a random permutation to generate the initial assignment. The goal of the algorithm in Figure 7 is to generate a random match-different assignment. A fast Monte Carlo way is to use a random permutation. The resulting permutation may not be a valid assignment, because some of its matches may not be in the set of possible matches M. However, some matches in the permutation may be valid. We use this permutation as the initial assignment of the algorithm (line 1 in Figure 7). L is initialized to be the set of anonymized tuples that do not have a valid match. This reduces the initial size of L and hence the computational cost of assignment generation.

5.3 Partitioning

In this section, we discuss how to choose a good partitioning strategy for non-homogeneous generalization. Before this, we provide an appropriate measure for utility, which we adopt from previous work. Based on our discussion so far, value ranges are used to define the generalized QID of a tuple. For example, for a QID which is generated by the three values 15, 20, 48, we use range [15-48]. This format is compact and easy to use in data analysis. However, it introduces some unnecessary information loss, as values within the range but not present in the generating set of values are included in it.
For example, value 17 is implicitly included in the generalized range [15-48]. In fact, the QID generalization model that has the minimum information loss is the set representation, e.g., set {15, 20, 48} is used as the generalized QID. The set representation offers a significant improvement in utility and is also more general, as it is appropriate for both ordered and nominal attributes. Therefore, we adopt it in this paper and use it in subsequent discussions. In addition, we use the Global Certainty Penalty (GCP) [6] as a measure for utility.

DEFINITION 8. (metric - GCP) Let $t'_i$ be an anonymized tuple in anonymized table $T'$ using set representation. Let A be a QID attribute, $|A|$ be the cardinality of A, and $count_A(t'_i)$ be the number of distinct values of A in $t'_i$. The normalized certainty penalty NCP of $t'_i$ on attribute A is

$NCP_A(t'_i) = \frac{count_A(t'_i) - 1}{|A| - 1}$.

For the whole table $T'$,

$NCP(T') = \sum_{A \in QID} \sum_{t'_i \in T'} NCP_A(t'_i)$.

Finally, $GCP(T')$ is defined as $\frac{NCP(T')}{d \cdot |T'|}$, where d is the number of attributes in the QID.

As discussed, similar to the homogeneous case, for scalability reasons we should divide the tuples of T into partitions before applying non-homogeneous generalization to each of them. One option is to use an off-the-shelf partitioning method for homogeneous generalization (e.g., [14]) and then apply our ring generalization at each partition. However, existing partitioning strategies may not be the most appropriate, as they do not take into account the use of non-homogeneous generalization. The main difference between homogeneous and non-homogeneous generalization is that, in the former, it is always better to divide a large group into two. For example, consider a set of four QIDs $t_1 = \langle 1, 1, 1 \rangle$, $t_2 = \langle 1, 1, 2 \rangle$, $t_3 = \langle 2, 1, 2 \rangle$, $t_4 = \langle 1, 2, 1 \rangle$ and assume the domains of all QID attributes are the same. Suppose that we partition the tuples by putting $t_1, t_2$ into group $G_1$ and $t_3, t_4$ into another group $G_2$. $t_1$ and $t_2$ differ in one value, whereas $t_3$ and $t_4$ differ in 3 values. Thus, with this grouping the value of NCP is $\frac{8}{|A| - 1}$.
If we apply ring generalization directly on the four tuples, without partitioning (with k = 2 and the tuples in the order $t_1, \ldots, t_4$), the value of NCP is $\frac{1 + 1 + 3 + 1}{|A| - 1} = \frac{6}{|A| - 1}$. Hence, we obtain a higher information loss after we partition the tuples. In the non-homogeneous case, even if the partition size is much larger than k, each generalized QID is generated from exactly k tuples. Thus, low information loss can still be achieved in large partitions, as opposed to homogeneous generalization, which suffers from high information loss if the size of partitions is large. On the other hand, the size of the partitions affects the cost of randomization in our approach (see Section 5.2.1); therefore it should be controlled.

5.3.1 Partitioning based on lexicographical order

We now discuss our partitioning strategy for non-homogeneous generalization. As discussed, we consider a set representation for the QIDs, i.e., in an anonymized tuple $t'_i$ each QID attribute takes the set of values of that attribute in the tuples that generate the QID. Hence, the distinct count $count_A(t'_i)$ of A's values in $t'_i$ is less than or equal to k. According to the NCP measure, if the domain size of A is small, we lose more information if more than one value of A exists in a generalized tuple. For example, consider attributes sex and birthday with domain sizes 2 and 366, respectively. If we put two tuples with different sex values in a group, the NCP for that attribute in the group will be maximized, but if we put two tuples with different birthdays, the introduced NCP error is small. Thus, during partitioning, we should prioritize the reduction of $count_A(t'_i)$ for attributes with small domains.

To achieve this goal, we order the attributes according to their domain size; then the tuples in T are sorted in lexicographical QID order, based on the attribute ordering. The tuples are partitioned in a top-down fashion. First, we consider attribute $A_1$ with the smallest domain size and put tuples with the same $A_1$-value in the same partition. This results in a set of partitions $P_1, P_2, \ldots, P_m$, such that the NCP of $A_1$ will be 0 in all partitions. However, some partitions may have fewer than k tuples. For each such partition $P_j$, we find a neighboring partition $P_x$, either $P_{j-1}$ or $P_{j+1}$, that can either be merged with $P_j$ or give some of its tuples to $P_j$, so that both of them have at least k tuples. If $|P_x| + |P_j| < 2k$, we merge $P_x$ with $P_j$; otherwise, we move tuples from $P_x$ to $P_j$ such that $|P_j| = k$. After we are done with $A_1$, we recursively partition the resulting groups using the next attribute in order (i.e., $A_2$). In some partitions, the tuples may have different $A_1$ values (due to merging). For such partitions, we do not attempt to further decompose them recursively using another attribute. The partitioning strategy is repeated until all partitions are finalized or there are no more attributes that can be used for recursive partitioning. Figure 8 shows pseudocode for the partitioning algorithm. On each finalized partition, we apply non-homogeneous generalization, as explained in Sections 5.1 and 5.2.

Input: a set of tuples P; parameter k in k-anonymity; attribute used for partitioning $A_i$.
Precondition: (i) attributes in QID are sorted in ascending order of domain size $A_1, A_2, \ldots, A_{|QID|}$; (ii) tuples are sorted in lexicographical order according to the attribute order; (iii) all tuples in P have the same value on $A_1, A_2, \ldots, A_{i-1}$.
1. // minimize the uncertainty for attribute $A_i$
2. Partition P into $P_1, P_2, \ldots, P_m$ using $A_i$
3. // if there are not enough tuples in a group
4. For each $P_j$ where $|P_j| < k$
5.   Find $P_x$ as a neighboring partition of $P_j$
6.   If $|P_j| + |P_x| \geq 2k$
7.     move $k - |P_j|$ tuples from $P_x$ to $P_j$
8.   Else
9.     Merge $P_j$ and $P_x$
10. End for
11. For each $P_j$
12.   If ($\exists t_a, t_b \in P_j$, $t_a[A_i] \neq t_b[A_i]$) or ($A_i = A_{|QID|}$)
13.     Apply non-homogeneous generalization on $P_j$
14.   Else
15.     partition($P_j$, k, $A_{i+1}$) // recursive call
16. End for

Figure 8: Partitioning Algorithm

5.3.2 Cost analysis of partitioning

Before we apply the partitioning algorithm shown in Figure 8, we need to sort the attributes and the tuples. Assuming that the number of attributes in the QID is negligible compared to the number of tuples, sorting costs $O(|T| \log |T|)$. Moving data between partitions, or merging, is applied only to consecutive partitions. Each partition defined by the first attribute can recursively be repartitioned up to d times, assuming a d-dimensional QID. As the data having the same values in attributes $A_1, A_2, \ldots, A_i$ are sorted w.r.t. attribute $A_{i+1}$, no additional sorting is required. In the worst case, where all tuples have the same values in all attributes, the table will be read d times, so the worst-case cost is $O(|T|(d + \log |T|))$.

6. EXTENSION TO L-DIVERSITY

In this section, we discuss how we can apply non-homogeneous generalization for other privacy principles. In particular, we focus on l-diversity, described in Definition 9.

DEFINITION 9. (l-diversity) Let T be a table with a sensitive attribute S, and H be the projection of T on key and QID attributes. l-diversity is preserved by an anonymized table $T'$ if $\forall t_i \in H$, $\forall s \in S$, $\Pr(t_i[S] = s) \leq \frac{1}{l}$ in a linking attack by joining H and $T'$.

By definition, in order to satisfy l-diversity each partition P should not contain a sensitive value that occurs more than $\frac{|P|}{l}$ times. In such a partition, we can order the tuples in P such that no l consecutive tuples have a sensitive value that occurs more than once. For example, consider a partition of size 4, where there are 2 sensitive values $s_1$ and $s_2$, each appearing 2 times in the partition.
In order to achieve 2-diversity, we can arrange the tuples with sensitive values in the order {$s_1$, $s_2$, $s_1$, $s_2$}. By applying ring generalization to the ordered tuples, each generalized QID will cover two tuples with different sensitive values. Hence, 2-diversity is satisfied. Thus, non-homogeneous generalization can directly be applied on the partitions generated by any existing algorithm for l-diversity, like [6, 26]. The utility will be higher than or equal to that of applying homogeneous generalization. However, as existing algorithms do not take into account non-homogeneous generalization, they tend to generate partitions with minimal sizes, so the utility improvement by non-homogeneous generalization may not be maximal.

We now discuss how our partitioning strategy (described in Section 5.3) can be adapted to generate l-diverse partitions. Note that the original table should satisfy l-diversity; otherwise it cannot be split into partitions which all satisfy l-diversity [26]. Recall that our partitioning strategy recursively divides the tuples using one attribute at a time. In each iteration, we obtain a set of partitions $P_1, P_2, \ldots, P_m$. For a partition $P_j$ that does not satisfy l-diversity, we find a partition $P_x$ such that l-diversity is satisfied by the merged partition $P_j \cup P_x$. If such a partition cannot be found, we merge $P_j$ with a random partition, and merging continues until the resulting partition satisfies l-diversity. In the worst case, the algorithm will merge all partitions into a single one, which must satisfy l-diversity. Hence, this scheme always gives a set of partitions that satisfy l-diversity. Then, for each partition, we order the tuples and apply non-homogeneous generalization. As we have discussed in Section 5.2.1, the cost of randomization is highly correlated to the partition size. So, if a huge partition is generated by the above process, the generalization cost for it may be extremely high.
In order to reduce the cost, we perform arbitrary splits of such partitions, making sure that each sensitive value appears in each smaller partition at most once. This way, each partition has a maximum size equal to the domain size of the sensitive attribute, which is usually small in practice.

7. EMPIRICAL EVALUATION

In this section, we experimentally compare our anonymization scheme (non-homogeneous generalization) with the state-of-the-art k-anonymity algorithm of [6]. The algorithm of [6] sorts the tuples based on their values on a Hilbert curve and then generates partitions using a dynamic programming algorithm that is optimal for homogeneous generalization of 1-dimensional QID. Although the resulting partitions may not be optimal, they have low information loss, since two nearby values on the Hilbert curve are also near in the original high-dimensional space with high probability. Range representation is used for generalized QIDs in [6]; however, a set representation can directly be applied on the partitions to improve utility. We use HR to denote the original algorithm of [6] with range representation; we also evaluate its version with set representation. Our algorithm applies non-homogeneous generalization on top of the partitioning scheme proposed in Section 5.3 and uses set representation for the generalized QID. All optimizations for randomization described in Section 5.2.1 are implemented in our scheme. In addition, we implemented a method that uses our partitioning strategy followed by homogeneous generalization (using set representation), to verify whether any improvement in the information loss is due to our partitioning strategy or due to non-homogeneous generalization. We refer to this method as homogeneous partitioning. After partitioning, it uses a single generalized QID to represent all data in each partition. Randomization is not applied by this algorithm, since generalization is homogeneous.
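The difference between homogeneous and non-homogeneous generalization under the NCP and GCP measures of Definition 8 can be made concrete on the four-tuple example of Section 5.3. The sketch below is an illustration only: the helper names are hypothetical, and the common domain size $|A| = 5$ is an assumed value, since the text leaves $|A|$ symbolic. Exact fractions are used so that the results can be compared with $8/(|A|-1)$ and $6/(|A|-1)$.

```python
from fractions import Fraction
from typing import FrozenSet, List, Sequence, Tuple

GenTuple = Tuple[FrozenSet[int], ...]


def generalize(group: Sequence[Tuple[int, ...]]) -> GenTuple:
    """Set representation: one set of values per QID attribute."""
    d = len(group[0])
    return tuple(frozenset(t[a] for t in group) for a in range(d))


def ncp(table: List[GenTuple], domain_sizes: Sequence[int]) -> Fraction:
    """Sum of (count_A - 1) / (|A| - 1) over all tuples and attributes."""
    return sum(
        (Fraction(len(vals) - 1, size - 1)
         for row in table
         for vals, size in zip(row, domain_sizes)),
        Fraction(0),
    )


def gcp(table: List[GenTuple], domain_sizes: Sequence[int]) -> Fraction:
    """NCP normalized by (number of attributes) * (number of tuples)."""
    return ncp(table, domain_sizes) / (len(domain_sizes) * len(table))


data = [(1, 1, 1), (1, 1, 2), (2, 1, 2), (1, 2, 1)]  # t_1 .. t_4
domain_sizes = [5, 5, 5]                              # assumed |A| = 5

# Homogeneous: groups {t_1, t_2} and {t_3, t_4}; each tuple gets its group's QID.
homogeneous = [generalize(data[0:2])] * 2 + [generalize(data[2:4])] * 2

# Ring generalization with k = 2 over all four tuples, in the given order.
ring = [generalize([data[i], data[(i + 1) % 4]]) for i in range(4)]

print(ncp(homogeneous, domain_sizes), ncp(ring, domain_sizes))
print(gcp(homogeneous, domain_sizes), gcp(ring, domain_sizes))
```

With the assumed domain size, the two NCP values are 8/4 and 6/4, in line with the symbolic values in Section 5.3.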
All algorithms are implemented in C++ and the experiments are run on an Intel Core 2 Duo 2.8GHz machine with 2GB RAM, running Windows. Our experiments are done mainly on a real dataset, CENSUS (downloadable from ), that is widely used in the literature (e.g., in [27, 26, 6]). The dataset contains information about 500K individuals. A summary of the attributes in the dataset is shown in Table 5. Note that the majority of attributes are nominal, indicating that a set representation for a generalized QID is more appropriate than a range representation, since there is no natural order for most of the attributes. In the experiments, we vary the following parameters: (i) number of tuples n: we sample the CENSUS dataset to generate input tables of varying size; (ii) number of attributes d in the QID: we use the first d attributes as the QID, while other attributes are treated as "others"; (iii) value of k in k-anonymity. The range of each parameter and the default value are shown in Table 6. We measure the information loss of the generalized tables using GCP (defined in Section 5.3), which is a commonly used metric


More information

AnyTraffic Labeled Routing

AnyTraffic Labeled Routing AnyTraffic Labele Routing Dimitri Papaimitriou 1, Pero Peroso 2, Davie Careglio 2 1 Alcatel-Lucent Bell, Antwerp, Belgium Email: imitri.papaimitriou@alcatel-lucent.com 2 Universitat Politècnica e Catalunya,

More information

Optimal Oblivious Path Selection on the Mesh

Optimal Oblivious Path Selection on the Mesh Optimal Oblivious Path Selection on the Mesh Costas Busch Malik Magon-Ismail Jing Xi Department of Computer Science Rensselaer Polytechnic Institute Troy, NY 280, USA {buschc,magon,xij2}@cs.rpi.eu Abstract

More information

Offloading Cellular Traffic through Opportunistic Communications: Analysis and Optimization

Offloading Cellular Traffic through Opportunistic Communications: Analysis and Optimization 1 Offloaing Cellular Traffic through Opportunistic Communications: Analysis an Optimization Vincenzo Sciancalepore, Domenico Giustiniano, Albert Banchs, Anreea Picu arxiv:1405.3548v1 [cs.ni] 14 May 24

More information

Improving Performance of Sparse Matrix-Vector Multiplication

Improving Performance of Sparse Matrix-Vector Multiplication Improving Performance of Sparse Matrix-Vector Multiplication Ali Pınar Michael T. Heath Department of Computer Science an Center of Simulation of Avance Rockets University of Illinois at Urbana-Champaign

More information

Politehnica University of Timisoara Mobile Computing, Sensors Network and Embedded Systems Laboratory. Testing Techniques

Politehnica University of Timisoara Mobile Computing, Sensors Network and Embedded Systems Laboratory. Testing Techniques Politehnica University of Timisoara Mobile Computing, Sensors Network an Embee Systems Laboratory ing Techniques What is testing? ing is the process of emonstrating that errors are not present. The purpose

More information

Message Transport With The User Datagram Protocol

Message Transport With The User Datagram Protocol Message Transport With The User Datagram Protocol User Datagram Protocol (UDP) Use During startup For VoIP an some vieo applications Accounts for less than 10% of Internet traffic Blocke by some ISPs Computer

More information

Queueing Model and Optimization of Packet Dropping in Real-Time Wireless Sensor Networks

Queueing Model and Optimization of Packet Dropping in Real-Time Wireless Sensor Networks Queueing Moel an Optimization of Packet Dropping in Real-Time Wireless Sensor Networks Marc Aoun, Antonios Argyriou, Philips Research, Einhoven, 66AE, The Netherlans Department of Computer an Communication

More information

Fuzzy Clustering in Parallel Universes

Fuzzy Clustering in Parallel Universes Fuzzy Clustering in Parallel Universes Bern Wisweel an Michael R. Berthol ALTANA-Chair for Bioinformatics an Information Mining Department of Computer an Information Science, University of Konstanz 78457

More information

2-connected graphs with small 2-connected dominating sets

2-connected graphs with small 2-connected dominating sets 2-connecte graphs with small 2-connecte ominating sets Yair Caro, Raphael Yuster 1 Department of Mathematics, University of Haifa at Oranim, Tivon 36006, Israel Abstract Let G be a 2-connecte graph. A

More information

Comparison of Methods for Increasing the Performance of a DUA Computation

Comparison of Methods for Increasing the Performance of a DUA Computation Comparison of Methos for Increasing the Performance of a DUA Computation Michael Behrisch, Daniel Krajzewicz, Peter Wagner an Yun-Pang Wang Institute of Transportation Systems, German Aerospace Center,

More information

APPLYING GENETIC ALGORITHM IN QUERY IMPROVEMENT PROBLEM. Abdelmgeid A. Aly

APPLYING GENETIC ALGORITHM IN QUERY IMPROVEMENT PROBLEM. Abdelmgeid A. Aly International Journal "Information Technologies an Knowlege" Vol. / 2007 309 [Project MINERVAEUROPE] Project MINERVAEUROPE: Ministerial Network for Valorising Activities in igitalisation -

More information

arxiv: v1 [cs.cr] 22 Apr 2015 ABSTRACT

arxiv: v1 [cs.cr] 22 Apr 2015 ABSTRACT Differentially Private -Means Clustering arxiv:10.0v1 [cs.cr] Apr 01 ABSTRACT Dong Su #, Jianneng Cao, Ninghui Li #, Elisa Bertino #, Hongxia Jin There are two broa approaches for ifferentially private

More information

Shift-map Image Registration

Shift-map Image Registration Shift-map Image Registration Svärm, Linus; Stranmark, Petter Unpublishe: 2010-01-01 Link to publication Citation for publishe version (APA): Svärm, L., & Stranmark, P. (2010). Shift-map Image Registration.

More information

Image Segmentation using K-means clustering and Thresholding

Image Segmentation using K-means clustering and Thresholding Image Segmentation using Kmeans clustering an Thresholing Preeti Panwar 1, Girhar Gopal 2, Rakesh Kumar 3 1M.Tech Stuent, Department of Computer Science & Applications, Kurukshetra University, Kurukshetra,

More information

CS269I: Incentives in Computer Science Lecture #8: Incentives in BGP Routing

CS269I: Incentives in Computer Science Lecture #8: Incentives in BGP Routing CS269I: Incentives in Computer Science Lecture #8: Incentives in BGP Routing Tim Roughgaren October 19, 2016 1 Routing in the Internet Last lecture we talke about elay-base (or selfish ) routing, which

More information

Using Vector and Raster-Based Techniques in Categorical Map Generalization

Using Vector and Raster-Based Techniques in Categorical Map Generalization Thir ICA Workshop on Progress in Automate Map Generalization, Ottawa, 12-14 August 1999 1 Using Vector an Raster-Base Techniques in Categorical Map Generalization Beat Peter an Robert Weibel Department

More information

Almost Disjunct Codes in Large Scale Multihop Wireless Network Media Access Control

Almost Disjunct Codes in Large Scale Multihop Wireless Network Media Access Control Almost Disjunct Coes in Large Scale Multihop Wireless Network Meia Access Control D. Charles Engelhart Anan Sivasubramaniam Penn. State University University Park PA 682 engelhar,anan @cse.psu.eu Abstract

More information

Additional Divide and Conquer Algorithms. Skipping from chapter 4: Quicksort Binary Search Binary Tree Traversal Matrix Multiplication

Additional Divide and Conquer Algorithms. Skipping from chapter 4: Quicksort Binary Search Binary Tree Traversal Matrix Multiplication Aitional Divie an Conquer Algorithms Skipping from chapter 4: Quicksort Binary Search Binary Tree Traversal Matrix Multiplication Divie an Conquer Closest Pair Let s revisit the closest pair problem. Last

More information

Open Access Adaptive Image Enhancement Algorithm with Complex Background

Open Access Adaptive Image Enhancement Algorithm with Complex Background Sen Orers for Reprints to reprints@benthamscience.ae 594 The Open Cybernetics & Systemics Journal, 205, 9, 594-600 Open Access Aaptive Image Enhancement Algorithm with Complex Bacgroun Zhang Pai * epartment

More information

1 Surprises in high dimensions

1 Surprises in high dimensions 1 Surprises in high imensions Our intuition about space is base on two an three imensions an can often be misleaing in high imensions. It is instructive to analyze the shape an properties of some basic

More information

Loop Scheduling and Partitions for Hiding Memory Latencies

Loop Scheduling and Partitions for Hiding Memory Latencies Loop Scheuling an Partitions for Hiing Memory Latencies Fei Chen Ewin Hsing-Mean Sha Dept. of Computer Science an Engineering University of Notre Dame Notre Dame, IN 46556 Email: fchen,esha @cse.n.eu Tel:

More information

On Effectively Determining the Downlink-to-uplink Sub-frame Width Ratio for Mobile WiMAX Networks Using Spline Extrapolation

On Effectively Determining the Downlink-to-uplink Sub-frame Width Ratio for Mobile WiMAX Networks Using Spline Extrapolation On Effectively Determining the Downlink-to-uplink Sub-frame With Ratio for Mobile WiMAX Networks Using Spline Extrapolation Panagiotis Sarigianniis, Member, IEEE, Member Malamati Louta, Member, IEEE, Member

More information

A shortest path algorithm in multimodal networks: a case study with time varying costs

A shortest path algorithm in multimodal networks: a case study with time varying costs A shortest path algorithm in multimoal networks: a case stuy with time varying costs Daniela Ambrosino*, Anna Sciomachen* * Department of Economics an Quantitative Methos (DIEM), University of Genoa Via

More information

Robust PIM-SM Multicasting using Anycast RP in Wireless Ad Hoc Networks

Robust PIM-SM Multicasting using Anycast RP in Wireless Ad Hoc Networks Robust PIM-SM Multicasting using Anycast RP in Wireless A Hoc Networks Jaewon Kang, John Sucec, Vikram Kaul, Sunil Samtani an Mariusz A. Fecko Applie Research, Telcoria Technologies One Telcoria Drive,

More information

Multilevel Linear Dimensionality Reduction using Hypergraphs for Data Analysis

Multilevel Linear Dimensionality Reduction using Hypergraphs for Data Analysis Multilevel Linear Dimensionality Reuction using Hypergraphs for Data Analysis Haw-ren Fang Department of Computer Science an Engineering University of Minnesota; Minneapolis, MN 55455 hrfang@csumneu ABSTRACT

More information

Secure Network Coding for Distributed Secret Sharing with Low Communication Cost

Secure Network Coding for Distributed Secret Sharing with Low Communication Cost Secure Network Coing for Distribute Secret Sharing with Low Communication Cost Nihar B. Shah, K. V. Rashmi an Kannan Ramchanran, Fellow, IEEE Abstract Shamir s (n,k) threshol secret sharing is an important

More information

Scalable Deterministic Scheduling for WDM Slot Switching Xhaul with Zero-Jitter

Scalable Deterministic Scheduling for WDM Slot Switching Xhaul with Zero-Jitter FDL sel. VOA SOA 100 Regular papers ONDM 2018 Scalable Deterministic Scheuling for WDM Slot Switching Xhaul with Zero-Jitter Bogan Uscumlic 1, Dominique Chiaroni 1, Brice Leclerc 1, Thierry Zami 2, Annie

More information

THE BAYESIAN RECEIVER OPERATING CHARACTERISTIC CURVE AN EFFECTIVE APPROACH TO EVALUATE THE IDS PERFORMANCE

THE BAYESIAN RECEIVER OPERATING CHARACTERISTIC CURVE AN EFFECTIVE APPROACH TO EVALUATE THE IDS PERFORMANCE БСУ Международна конференция - 2 THE BAYESIAN RECEIVER OPERATING CHARACTERISTIC CURVE AN EFFECTIVE APPROACH TO EVALUATE THE IDS PERFORMANCE Evgeniya Nikolova, Veselina Jecheva Burgas Free University Abstract:

More information

MORA: a Movement-Based Routing Algorithm for Vehicle Ad Hoc Networks

MORA: a Movement-Based Routing Algorithm for Vehicle Ad Hoc Networks : a Movement-Base Routing Algorithm for Vehicle A Hoc Networks Fabrizio Granelli, Senior Member, Giulia Boato, Member, an Dzmitry Kliazovich, Stuent Member Abstract Recent interest in car-to-car communications

More information

filtering LETTER An Improved Neighbor Selection Algorithm in Collaborative Taek-Hun KIM a), Student Member and Sung-Bong YANG b), Nonmember

filtering LETTER An Improved Neighbor Selection Algorithm in Collaborative Taek-Hun KIM a), Student Member and Sung-Bong YANG b), Nonmember 107 IEICE TRANS INF & SYST, VOLE88 D, NO5 MAY 005 LETTER An Improve Neighbor Selection Algorithm in Collaborative Filtering Taek-Hun KIM a), Stuent Member an Sung-Bong YANG b), Nonmember SUMMARY Nowaays,

More information

Computer Organization

Computer Organization Computer Organization Douglas Comer Computer Science Department Purue University 250 N. University Street West Lafayette, IN 47907-2066 http://www.cs.purue.eu/people/comer Copyright 2006. All rights reserve.

More information

SURVIVABLE IP OVER WDM: GUARANTEEEING MINIMUM NETWORK BANDWIDTH

SURVIVABLE IP OVER WDM: GUARANTEEEING MINIMUM NETWORK BANDWIDTH SURVIVABLE IP OVER WDM: GUARANTEEEING MINIMUM NETWORK BANDWIDTH Galen H Sasaki Dept Elec Engg, U Hawaii 2540 Dole Street Honolul HI 96822 USA Ching-Fong Su Fuitsu Laboratories of America 595 Lawrence Expressway

More information

Data Mining: Clustering

Data Mining: Clustering Bi-Clustering COMP 790-90 Seminar Spring 011 Data Mining: Clustering k t 1 K-means clustering minimizes Where ist ( x, c i t i c t ) ist ( x m j 1 ( x ij i, c c t ) tj ) Clustering by Pattern Similarity

More information

Reformulation and Solution Algorithms for Absolute and Percentile Robust Shortest Path Problems

Reformulation and Solution Algorithms for Absolute and Percentile Robust Shortest Path Problems > REPLACE THIS LINE WITH YOUR PAPER IENTIFICATION NUMBER (OUBLE-CLICK HERE TO EIT) < 1 Reformulation an Solution Algorithms for Absolute an Percentile Robust Shortest Path Problems Xuesong Zhou, Member,

More information

A Neural Network Model Based on Graph Matching and Annealing :Application to Hand-Written Digits Recognition

A Neural Network Model Based on Graph Matching and Annealing :Application to Hand-Written Digits Recognition ITERATIOAL JOURAL OF MATHEMATICS AD COMPUTERS I SIMULATIO A eural etwork Moel Base on Graph Matching an Annealing :Application to Han-Written Digits Recognition Kyunghee Lee Abstract We present a neural

More information

Modifying ROC Curves to Incorporate Predicted Probabilities

Modifying ROC Curves to Incorporate Predicted Probabilities Moifying ROC Curves to Incorporate Preicte Probabilities Cèsar Ferri DSIC, Universitat Politècnica e València Peter Flach Department of Computer Science, University of Bristol José Hernánez-Orallo DSIC,

More information

William S. Law. Erik K. Antonsson. Engineering Design Research Laboratory. California Institute of Technology. Abstract

William S. Law. Erik K. Antonsson. Engineering Design Research Laboratory. California Institute of Technology. Abstract Optimization Methos for Calculating Design Imprecision y William S. Law Eri K. Antonsson Engineering Design Research Laboratory Division of Engineering an Applie Science California Institute of Technology

More information

Study of Network Optimization Method Based on ACL

Study of Network Optimization Method Based on ACL Available online at www.scienceirect.com Proceia Engineering 5 (20) 3959 3963 Avance in Control Engineering an Information Science Stuy of Network Optimization Metho Base on ACL Liu Zhian * Department

More information

THE increasingly digitized power system offers more data,

THE increasingly digitized power system offers more data, 1 Cyber Risk Analysis of Combine Data Attacks Against Power System State Estimation Kaikai Pan, Stuent Member, IEEE, Anré Teixeira, Member, IEEE, Milos Cvetkovic, Member, IEEE, an Peter Palensky, Senior

More information

Feature Extraction and Rule Classification Algorithm of Digital Mammography based on Rough Set Theory

Feature Extraction and Rule Classification Algorithm of Digital Mammography based on Rough Set Theory Feature Extraction an Rule Classification Algorithm of Digital Mammography base on Rough Set Theory Aboul Ella Hassanien Jafar M. H. Ali. Kuwait University, Faculty of Aministrative Science, Quantitative

More information

On the Placement of Internet Taps in Wireless Neighborhood Networks

On the Placement of Internet Taps in Wireless Neighborhood Networks 1 On the Placement of Internet Taps in Wireless Neighborhoo Networks Lili Qiu, Ranveer Chanra, Kamal Jain, Mohamma Mahian Abstract Recently there has emerge a novel application of wireless technology that

More information

NAND flash memory is widely used as a storage

NAND flash memory is widely used as a storage 1 : Buffer-Aware Garbage Collection for Flash-Base Storage Systems Sungjin Lee, Dongkun Shin Member, IEEE, an Jihong Kim Member, IEEE Abstract NAND flash-base storage evice is becoming a viable storage

More information

Characterizing Decoding Robustness under Parametric Channel Uncertainty

Characterizing Decoding Robustness under Parametric Channel Uncertainty Characterizing Decoing Robustness uner Parametric Channel Uncertainty Jay D. Wierer, Wahee U. Bajwa, Nigel Boston, an Robert D. Nowak Abstract This paper characterizes the robustness of ecoing uner parametric

More information

Learning convex bodies is hard

Learning convex bodies is hard Learning convex boies is har Navin Goyal Microsoft Research Inia navingo@microsoftcom Luis Raemacher Georgia Tech lraemac@ccgatecheu Abstract We show that learning a convex boy in R, given ranom samples

More information

A Plane Tracker for AEC-automation Applications

A Plane Tracker for AEC-automation Applications A Plane Tracker for AEC-automation Applications Chen Feng *, an Vineet R. Kamat Department of Civil an Environmental Engineering, University of Michigan, Ann Arbor, USA * Corresponing author (cforrest@umich.eu)

More information

Efficient Recovery from False State in Distributed Routing Algorithms

Efficient Recovery from False State in Distributed Routing Algorithms Efficient Recovery from False State in Distribute Routing Algorithms Daniel Gyllstrom, Suarshan Vasuevan, Jim Kurose, Gerome Milau Department of Computer Science University of Massachusetts Amherst {pg,

More information

Cluster Center Initialization Method for K-means Algorithm Over Data Sets with Two Clusters

Cluster Center Initialization Method for K-means Algorithm Over Data Sets with Two Clusters Available online at www.scienceirect.com Proceia Engineering 4 (011 ) 34 38 011 International Conference on Avances in Engineering Cluster Center Initialization Metho for K-means Algorithm Over Data Sets

More information

Probabilistic Medium Access Control for. Full-Duplex Networks with Half-Duplex Clients

Probabilistic Medium Access Control for. Full-Duplex Networks with Half-Duplex Clients Probabilistic Meium Access Control for 1 Full-Duplex Networks with Half-Duplex Clients arxiv:1608.08729v1 [cs.ni] 31 Aug 2016 Shih-Ying Chen, Ting-Feng Huang, Kate Ching-Ju Lin, Member, IEEE, Y.-W. Peter

More information

Ad-Hoc Networks Beyond Unit Disk Graphs

Ad-Hoc Networks Beyond Unit Disk Graphs A-Hoc Networks Beyon Unit Disk Graphs Fabian Kuhn, Roger Wattenhofer, Aaron Zollinger Department of Computer Science ETH Zurich 8092 Zurich, Switzerlan {kuhn, wattenhofer, zollinger}@inf.ethz.ch ABSTRACT

More information

6.823 Computer System Architecture. Problem Set #3 Spring 2002

6.823 Computer System Architecture. Problem Set #3 Spring 2002 6.823 Computer System Architecture Problem Set #3 Spring 2002 Stuents are strongly encourage to collaborate in groups of up to three people. A group shoul han in only one copy of the solution to the problem

More information

Supporting Fully Adaptive Routing in InfiniBand Networks

Supporting Fully Adaptive Routing in InfiniBand Networks XIV JORNADAS DE PARALELISMO - LEGANES, SEPTIEMBRE 200 1 Supporting Fully Aaptive Routing in InfiniBan Networks J.C. Martínez, J. Flich, A. Robles, P. López an J. Duato Resumen InfiniBan is a new stanar

More information

Software Reliability Modeling and Cost Estimation Incorporating Testing-Effort and Efficiency

Software Reliability Modeling and Cost Estimation Incorporating Testing-Effort and Efficiency Software Reliability Moeling an Cost Estimation Incorporating esting-effort an Efficiency Chin-Yu Huang, Jung-Hua Lo, Sy-Yen Kuo, an Michael R. Lyu -+ Department of Electrical Engineering Computer Science

More information

Tracking and Regulation Control of a Mobile Robot System With Kinematic Disturbances: A Variable Structure-Like Approach

Tracking and Regulation Control of a Mobile Robot System With Kinematic Disturbances: A Variable Structure-Like Approach W. E. Dixon e-mail: wixon@ces.clemson.eu D. M. Dawson e-mail: awson@ces.clemson.eu E. Zergeroglu e-mail: ezerger@ces.clemson.eu Department of Electrical & Computer Engineering, Clemson University, Clemson,

More information

A multiple wavelength unwrapping algorithm for digital fringe profilometry based on spatial shift estimation

A multiple wavelength unwrapping algorithm for digital fringe profilometry based on spatial shift estimation University of Wollongong Research Online Faculty of Engineering an Information Sciences - Papers: Part A Faculty of Engineering an Information Sciences 214 A multiple wavelength unwrapping algorithm for

More information

THE APPLICATION OF ARTICLE k-th SHORTEST TIME PATH ALGORITHM

THE APPLICATION OF ARTICLE k-th SHORTEST TIME PATH ALGORITHM International Journal of Physics an Mathematical Sciences ISSN: 2277-2111 (Online) 2016 Vol. 6 (1) January-March, pp. 24-6/Mao an Shi. THE APPLICATION OF ARTICLE k-th SHORTEST TIME PATH ALGORITHM Hua Mao

More information

Rough Set Approach for Classification of Breast Cancer Mammogram Images

Rough Set Approach for Classification of Breast Cancer Mammogram Images Rough Set Approach for Classification of Breast Cancer Mammogram Images Aboul Ella Hassanien Jafar M. H. Ali. Kuwait University, Faculty of Aministrative Science, Quantitative Methos an Information Systems

More information

Data Mining: Concepts and Techniques. Chapter 7. Cluster Analysis. Examples of Clustering Applications. What is Cluster Analysis?

Data Mining: Concepts and Techniques. Chapter 7. Cluster Analysis. Examples of Clustering Applications. What is Cluster Analysis? Data Mining: Concepts an Techniques Chapter Jiawei Han Department of Computer Science University of Illinois at Urbana-Champaign www.cs.uiuc.eu/~hanj Jiawei Han an Micheline Kamber, All rights reserve

More information

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 31, NO. 4, APRIL

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 31, NO. 4, APRIL IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 1, NO. 4, APRIL 01 74 Towar Efficient Distribute Algorithms for In-Network Binary Operator Tree Placement in Wireless Sensor Networks Zongqing Lu,

More information

PERFECT ONE-ERROR-CORRECTING CODES ON ITERATED COMPLETE GRAPHS: ENCODING AND DECODING FOR THE SF LABELING

PERFECT ONE-ERROR-CORRECTING CODES ON ITERATED COMPLETE GRAPHS: ENCODING AND DECODING FOR THE SF LABELING PERFECT ONE-ERROR-CORRECTING CODES ON ITERATED COMPLETE GRAPHS: ENCODING AND DECODING FOR THE SF LABELING PAMELA RUSSELL ADVISOR: PAUL CULL OREGON STATE UNIVERSITY ABSTRACT. Birchall an Teor prove that

More information

Table-based division by small integer constants

Table-based division by small integer constants Table-base ivision by small integer constants Florent e Dinechin, Laurent-Stéphane Diier LIP, Université e Lyon (ENS-Lyon/CNRS/INRIA/UCBL) 46, allée Italie, 69364 Lyon Ceex 07 Florent.e.Dinechin@ens-lyon.fr

More information

Improving Spatial Reuse of IEEE Based Ad Hoc Networks

Improving Spatial Reuse of IEEE Based Ad Hoc Networks mproving Spatial Reuse of EEE 82.11 Base A Hoc Networks Fengji Ye, Su Yi an Biplab Sikar ECSE Department, Rensselaer Polytechnic nstitute Troy, NY 1218 Abstract n this paper, we evaluate an suggest methos

More information

EDOVE: Energy and Depth Variance-Based Opportunistic Void Avoidance Scheme for Underwater Acoustic Sensor Networks

EDOVE: Energy and Depth Variance-Based Opportunistic Void Avoidance Scheme for Underwater Acoustic Sensor Networks sensors Article EDOVE: Energy an Depth Variance-Base Opportunistic Voi Avoiance Scheme for Unerwater Acoustic Sensor Networks Safar Hussain Bouk 1, *, Sye Hassan Ahme 2, Kyung-Joon Park 1 an Yongsoon Eun

More information

Adjacency Matrix Based Full-Text Indexing Models

Adjacency Matrix Based Full-Text Indexing Models 1000-9825/2002/13(10)1933-10 2002 Journal of Software Vol.13, No.10 Ajacency Matrix Base Full-Text Inexing Moels ZHOU Shui-geng 1, HU Yun-fa 2, GUAN Ji-hong 3 1 (Department of Computer Science an Engineering,

More information

Chapter 5 Proposed models for reconstituting/ adapting three stereoscopes

Chapter 5 Proposed models for reconstituting/ adapting three stereoscopes Chapter 5 Propose moels for reconstituting/ aapting three stereoscopes - 89 - 5. Propose moels for reconstituting/aapting three stereoscopes This chapter offers three contributions in the Stereoscopy area,

More information

Backpressure-based Packet-by-Packet Adaptive Routing in Communication Networks

Backpressure-based Packet-by-Packet Adaptive Routing in Communication Networks 1 Backpressure-base Packet-by-Packet Aaptive Routing in Communication Networks Eleftheria Athanasopoulou, Loc Bui, Tianxiong Ji, R. Srikant, an Alexaner Stolyar Abstract Backpressure-base aaptive routing

More information

I DT MC. Operating Manual SINAMICS S120. Verification of Performance Level e in accordance with EN ISO

I DT MC. Operating Manual SINAMICS S120. Verification of Performance Level e in accordance with EN ISO I DT MC Operating Manual SINAMICS S20 Verification of Performance Level e in accorance with EN ISO 3849- Document Project Status: release Organization: I DT MC Baseline:.2 Location: Erl. F80 Date: 24.09.2009

More information

Estimating Velocity Fields on a Freeway from Low Resolution Video

Estimating Velocity Fields on a Freeway from Low Resolution Video Estimating Velocity Fiels on a Freeway from Low Resolution Vieo Young Cho Department of Statistics University of California, Berkeley Berkeley, CA 94720-3860 Email: young@stat.berkeley.eu John Rice Department

More information

Questions? Post on piazza, or Radhika (radhika at eecs.berkeley) or Sameer (sa at berkeley)!

Questions? Post on piazza, or  Radhika (radhika at eecs.berkeley) or Sameer (sa at berkeley)! EE122 Fall 2013 HW3 Instructions Recor your answers in a file calle hw3.pf. Make sure to write your name an SID at the top of your assignment. For each problem, clearly inicate your final answer, bol an

More information