Reclaiming Space from Duplicate Files in a Serverless Distributed File System

Size: px
Start display at page:

Download "Reclaiming Space from Duplicate Files in a Serverless Distributed File System"

Transcription

1 Reclaiming Space rom Duplicate Files in a Serverless Distributed File System John R. Douceur Atul Adya William J. Bolosky Dan Simon Marvin Theimer July 22 Technical Report MSR-TR-22-3 Microsot Research Microsot Corporation One Microsot Way Redmond, WA 9852

2 Reclaiming Space rom Duplicate Files in a Serverless Distributed File System* John R. Douceur, Atul Adya, William J. Bolosky, Dan Simon, Marvin Theimer Microsot Research {johndo, adya, bolosky, dansimon, theimer}@microsot.com Abstract The Farsite distributed ile system provides availability by replicating each ile onto multiple desktop computers. Since this replication consumes signiicant storage space, it is important to reclaim used space where possible. Measurement o over 5 desktop ile systems shows that nearly hal o all consumed space is occupied by duplicate iles. We present a mechanism to reclaim space rom this incidental duplication to make it available or controlled ile replication. Our mechanism includes ) convergent encryption, which enables duplicate iles to coalesced into the space o a single ile, even i the iles are encrypted with dierent users keys, and 2) SALAD, a Sel- Arranging, Lossy, Associative Database or aggregating ile content and location inormation in a decentralized, scalable, ault-tolerant manner. Large-scale simulation experiments show that the duplicate-ile coalescing system is scalable, highly eective, and ault-tolerant.. Introduction This paper addresses the problems o identiying and coalescing identical iles in the Farsite [8] distributed ile system, or the purpose o reclaiming storage space consumed by incidentally redundant content. Farsite is a secure, scalable, serverless ile system that logically unctions as a centralized ile server but that is physically distributed among a networked collection o desktop workstations. Since desktop machines are not always on, not centrally managed, and not physically secured, the space reclamation process must tolerate a high rate o system ailure, operate without central coordination, and unction in tandem with cryptographic security. Farsite s intended purpose is to provide the advantages o a central ile server (a global name space, locationtransparency, reliability, availability, and security) without the attendant disadvantages (additional expense, physical plant, administration, and vulnerability to geographically localized aults). It provides high availability and reliability while executing on a substrate o inherently unreliable machines primarily through a high degree o replication o both ile content and directory inrastructure. Since this deliberate and controlled replication causes a dramatic increase in the space consumed by the ile system, it is advantageous to reclaim storage space due to incidental and erratic duplication o ile content. Since the disk space o desktop computers is mostly unused [3] and becoming less used over time [4], reclaiming disk space might not seem to be an important issue. However, Farsite (like most peer-to-peer systems [32]) relies on voluntary participation o the client machines owners, who may be reluctant to let their machines participate in a distributed ile system that substantially reduces the disk space available or their use. Measurements [8] o 55 desktop ile systems at Microsot show that almost hal o all occupied disk space can be reclaimed by eliminating duplicate iles rom the aggregated set o multiple users systems. Perorming this reclamation in Farsite requires solutions to our problems: ) Enabling the identiication and coalescing o identical iles when these iles are (or security reasons) encrypted with the keys o dierent users. 2) Identiying, in a decentralized, scalable, aulttolerant manner, iles that have identical content. 3) Relocating the replicas o iles with identical content to a common set o storage machines. 4) Coalescing the identical iles to reclaim storage space, while maintaining the semantics o separate iles. The latter two o these problems are addressed by a ile-replica-relocation system [4] and the Windows 2 [34] Single-Instance Store (SIS) [7], which are described in other publications. In the present paper, we describe the irst two problems solutions: ) convergent encryption, a cryptosystem that produces identical ciphertext iles rom identical plaintext iles, irrespective o their encryption keys and 2) SALAD, a Sel-Arranging, Lossy, Associative Database or aggregating and analyzing ile content inormation. Collectively, these components are called the Duplicate-File Coalescing (DFC) subsystem o Farsite. The ollowing section briely describes the Farsite system. Section 3 explains convergent encryption and ormally proves its security. Section 4 describes both the steady-state operation o SALAD and the maintenance o SALAD as machines join and leave the system. Section 5 shows results rom large-scale simulation experiments using ile content data collected rom a set o 585 desktop ile systems. Section 6 discusses related work, and Section 7 concludes. *This paper is an extended version o a paper that appeared in the 22nd IEEE International Conerence on Distributed Computing Systems [5].

3 2. Background the Farsite ile system Farsite [8] is a scalable, serverless, distributed ile system under development at Microsot Research. It provides logically centralized ile storage that is secure, reliable, and highly available, by ederating the distributed storage and communication resources o a set o not-ullytrusted client computers, such as the desktop machines o a large corporation. These machines voluntarily contribute resources to the system in exchange or the ability to store iles in the collective ile store. Every participating machine unctions not only as a client device or its local user but also both as a ile host storing replicas o encrypted ile content on behal o the system and as a member o a directory group storing metadata or a portion o the ile-system namespace. Data privacy in Farsite is rooted in symmetric-key and public-key cryptography [27], and data reliability is rooted in replication. When a client writes a ile, it encrypts the data using the public keys o all authorized readers o that ile, and the encrypted ile is replicated and distributed to a set o untrusted ile hosts. The encryption prevents ile hosts rom unauthorized viewing o the ile contents, and the replication prevents any single ile host rom deliberately (or accidentally) destroying a ile. Typical replication actors are three or our replicas per ile [8, 4]. The integrity o ile content and o the system namespace is rooted in replicated state machines that communicate via a Byzantine-ault-tolerant protocol []. Directories are apportioned among groups o machines. The machines in each directory group jointly manage a region o the ile-system namespace, and the Byzantine protocol guarantees that the directory group operates correctly as long as ewer than one third o its constituent machines ail in any arbitrary or malicious manner. In addition to maintaining directory data and ile metadata, each directory group also determines which ile groups store replicas o the iles contained in its directories, using a distributed ile-replica-placement algorithm [4]. For security reasons, machines communicate with each other over cryptographically authenticated and secured channels, which are established using public-key cryptography. Thereore, each machine has its own public/private key pair (separate rom the key pairs held by users), and each machine computes a large (2-byte) unique identiier or itsel rom a cryptographically strong hash o its public key. Since the corresponding private key is known only by that machine, it is the only machine that can sign a certiicate that validates its own identiier, making machine identiiers veriiable and unorgeable. Each directory group needs to determine which o the iles it manages have contents that are identical to other iles that may be managed by another directory group. Each ile host needs to be able to coalesce identical iles that it stores, even i they have been encrypted separately. 3. Convergent encryption I Farsite were to use a conventional cryptosystem to encrypt its iles, then two identical iles encrypted with dierent users keys would have dierent encrypted representations, and the DFC subsystem could neither recognize that the iles are identical nor coalesce the encrypted iles into the space o a single ile, unless it had access to the users private keys, which would be a signiicant security violation. Thereore, we have developed a cryptosystem called convergent encryption that produces identical ciphertext iles rom identical plaintext iles, irrespective o their encryption keys. To encrypt a ile using convergent encryption, a client irst computes a cryptographically strong hash o the ile content. The ile is then encrypted using this hash value as a key. The hash value is then encrypted using the public keys o all authorized readers o the ile, and these encrypted values are attached to the ile as metadata. Formally, given a symmetric-key encryption unction E, a public-key encryption unction F, a cryptographic hash unction H, and a public/private key pair (K u, K u) or each user u in a set o users U o ile, convergently encrypted ile ciphertext C is a data, metadataò tuple given by unction Χ applied to ile plaintext P : C = ΧK ( P ) = c u, Μ () Where the ile data ciphertext c is the encryption o the ile data plaintext, using the plaintext hash as a key: c = E H ( P )( P ) (2) And the ciphertext metadata Μ is a set o encryptions o the plaintext hash, using the users public keys: { = ( H ( P ) u U } Μ = µ µ (3) u u F Ku Any authorized reader u can decrypt the ile by decrypting the hash value with the reader s private key K u and then decrypting the ile content using the hash as the key: ( ) P = Χ K C E F ( )( c ) u = K u µ u (4) Because the ile is encrypted using its own hash as a key, the ile data ciphertext c is ully determined by the ile data plaintext P. Thereore, the DFC subsystem, without knowledge o users keys, can ) determine that two iles are identical and 2) store them in the space o a single ile (plus a small amount o space per user s key). 3.. Convergent encryption proo o security Convergent encryption deliberately leaks a controlled amount o inormation, namely whether or not the plaintexts o two encrypted messages are identical. In this subsection, we prove a theorem stating that we are not accidentally leaking more inormation than we intend. To obtain a general result regarding our construction, not a result speciic to any realization using particular encryption or hash unctions, our proo is based on the random oracle model [4] o cryptographic primitives. 2

4 We are given a security parameter (key length) n and a plaintext string P o length m n, where P is selected uniormly at random rom a superpolynomial (in n) subset S o all possible strings o length m: P Œ r S ' "s [s Œ S s Œ {,} m ] Ÿ S = n ω(). This assumption is stronger than that allowed in a semantic security proo, since convergent encryption explicitly allows the contents o an encrypted message to be veriied without access to the encryption key. According to the random oracle model, the hash unction H is selected uniormly at random rom the set o all mappings o strings o length m to strings o length n: H Œ r {h ' h: {,} m {,} n }. The encryption unction E is selected uniormly at random rom the set o all permutations that map keys o length n and plaintext strings o length m to ciphertext strings o length m: E Œ r {e ' e: {,} n â {,} m {,} m Ÿ "k,p,p 2 [p p 2 e k (p ) e k (p 2 )] }. Furthermore, since the encryption unction is symmetric, there exists a unction E that is the inverse o unction E given k: "k,p [E k(e k (p)) = p]. (This is not an inverse unction in the traditional sense, since it yields only the plaintext and not the encryption key.) We assume that H, E, and E are all available to an attacker. However, since these unctions are random oracles, the only method by which the attacker can extract inormation rom these unctions is by querying their oracles. In other words, we assume that the attacker cannot break the encryption or hash primitives in any way; i.e., they are ideal cryptographic unctions. We compute c according to the deinition o convergent encryption: c = E H(P) (P). We allow an attacker to execute a program Σ containing a sequence o queries to the random oracles H, E, and E, interspersed with arbitrary amounts o computation between each pair o adjacent queries. We reer to the count o queries in the program as the length o the program. Theorem: Given ciphertext c, there exists no program Σ o length O(n ε ) that can output plaintext P with probability Ω(/n ε ) or any ixed ε, unless the attacker can a priori output P with probability Ω(/n 2ε ). Proo: Assume such a program exists. Let Σ be the shortest such program. Σ either does or does not include a query to oracle H or the value o H(P). I it does not, then the O(n ε ) queries to oracle H in program Σ can, by eliminating alternatives to H(P), increase the probability mass associated with H(P) by a actor o at most O(n ε ), so the probability or H(P) is o(/n ε ). Thus the probability or P = E H(P)(c) is also o(/n ε ) < Ω(/n ε ). Alternatively, i Σ does include a query or H(P), we can construct a new program Σ whose queries are a preix o Σ but which halts ater a random query q < Σ and outputs the value o P queried on step q + o program Σ. With probability Ω(/n ε ), Σ will output P, contradicting the assumption that Σ is the shortest such program. 4. Identiying duplicate iles SALAD Convergent encryption enables identical encrypted iles to be recognized as identical, but there remains the problem o perorming this identiication across a large number o machines in a robust and decentralized manner. We solve this problem by storing ile location and content inormation in a distributed data structure called a SALAD: a Sel-Arranging, Lossy, Associative Database. For scalability, the ile inormation is partitioned and dispersed among all machines in the system; and or ault-tolerance, each item o inormation is stored redundantly on multiple machines. Rather than using central coordination to orchestrate this partitioning, dispersal, and redundancy, SALAD employs simple statistical techniques, which have the unintended eect o making the database lossy. In our application, a small degree o lossiness is acceptable, so we have chosen to retain the (relative) simplicity o the system rather than to include additional machinery to rectiy this lossiness. 4.. SALAD record storage overview Logically, a SALAD appears to be a centralized database. Each record in the database contains inormation about the location and content o one ile. To add a new record to the database, a machine irst computes a ingerprint o a ile by hashing the ile s (convergently encrypted) content and prepending the ile size to the hash value. It then constructs a key, valueò record in which the key is the ile s ingerprint and the value is the identiier o the machine where the ile resides, and it inserts this record into the database. The database is indexed by ingerprint keys, so it can be associatively searched or records with matching ingerprints, thereby identiying and locating iles with (probably) identical contents. (With 2-byte hash values, the probability that a set o F iles contains even one pair o same-sized non-identical iles with the same hash value is F / 2 2 â 8 / 2 F â 24.) Physically, the database is partitioned among all machines in the system. Within the context o SALAD, each machine is called a lea (akin to a lea in a tree data structure). Each record is stored in a set o local databases on zero or more leaves. Leaves are grouped into cells, and all leaves within any given cell are responsible or storing the same set o records. Records are sorted into buckets according to the value o the ingerprint in each record. Each bucket o records is assigned to a particular cell, and all records in that bucket are stored redundantly on all leaves within the cell, as illustrated in Fig.. The number o cells grows linearly with the system size, and since the number o iles also grows linearly with the system size, the expected number o records stored by each lea is constant. 3

5 cell-id = cell-id = cell-id = cell-id = lea ile record cell bucket (machine) Fig. : Buckets o records in cells o leaves A SALAD has two coniguration parameters: its target redundancy actor Λ and its dimensionality D. Since each record is stored redundantly on all leaves in a particular cell, the degree o storage redundancy is equal to the mean number o leaves per cell. This value is known as the actual redundancy actor λ, and it is bounded (via the process described in Subsection 4.2) by the inequality: Λ λ < 2Λ (5) For large systems, it is ineicient or each lea to maintain a direct connection to every other lea, so the leaves are organized (via the process described in Subsection 4.3) into a D-diameter directed graph. Each record is passed rom lea to lea along the edges o the graph until, ater at most D hops, it reaches the appropriate storage leaves SALAD partitioning and redundancy The target redundancy actor Λ is combined with the lea count L (the count o leaves in the SALAD, also called the system size) to determine a cell-id width W, as ollows (where the notation lg means binary logarithm): L W = lg (6) Λ As described in Section 2 above, each lea has a large (2-byte), unique identiier i. The least-signiicant W bits o a lea s identiier or a record s ingerprint orm a value called the cell-id o that lea or record. (For convenience, we sometimes use the term identiier to mean either a lea s identiier or a record s ingerprint.) Formally, the cell-id o identiier i is given by: W c( i) = i mod 2 (7) Two identiiers are cell-aligned i their cell-id values are equal. Cell-aligned leaves share the same cell, and records are stored on leaves with which they are cell-aligned, as illustrated in Fig.. Beore introducing the dimensionality parameter D, we describe the simplest SALAD coniguration, in which D =, as in the example o Fig.. Each lea in the SALAD maintains an estimate o the system size L. From this, it calculates W according to Eq. 6, and it computes cell-id values or each lea in the system (including itsel) according to Eq. 7. Then, or each o its iles, it hashes the ile s content, creates a ingerprint record, and computes a cell-id or the record. The lea then sends each record to all leaves that it believes to be cell-aligned with the record. When each lea receives a record, it stores the record i it considers itsel to be cell-aligned with the record. This example illustrates the statistical partitioning, redundancy, and lossiness o record storage. With no central coordination, records are distributed among all leaves, and records with matching ingerprints end up on the same set o leaves, so their identicality can be detected with a purely local search. Since machine identiiers and ile content ingerprints are cryptographic hash values, they are evenly distributed, so the number o leaves on which each record is stored is governed by a Poisson distribution [2] with a mean o λ. Thereore, with probability e λ, a record will not be stored on any lea. Note that i two leaves have dierent estimates o the system size L, they may disagree about whether they are cell-aligned. However, this disagreement does not cause the SALAD to malunction, only to be less eicient. I a lea underestimates the system size, it may calculate an undersized cell-id width W. With ewer bits in each cell- ID, cell-ids are more likely to match each other, so the lea may store more records than it needs to, and it may send records to leaves that don t need to receive them. I a lea overestimates the system size, it may calculate an oversized cell-id width W, which causes cell-ids to be less likely to match each other, so the lea may store ewer records than it needs to, and it may not send records to leaves that should receive them. Thus, an underestimate o L increases a lea s workload, and an overestimate o L increases a lea s lossiness. Given F iles in the system, the mean count o records stored by each lea is R, calculated as ollows: F R = λ (8) L Since F µ L and λ < 2 Λ, R remains constant as the system size grows SALAD multi-hop inormation dispersal Cells in a SALAD are organized into a D-dimensional hypercube. (Technically, it is a rectilinear hypersolid, since its dimensions are not always equal, but this term is cumbersome.) Coordinates in D-dimensional hyperspace are given with respect to D Cartesian axes. In two- or three-dimensional spaces, it is common to reer to these axes as the x-axis, y-axis, and z-axis, but or arbitrary dimensions, it is more convenient to use numbers instead o letters, so we reer to the -axis, the -axis, and so orth. 4

6 W W identiier or ingerprint: identiier or ingerprint: coordinates: c : c : coordinates: c 2 : c : c : (a) ( W ) 2 ( W ) 2 ( W 2) 3 ( W ) 3 ( W ) 3 Fig. 2: Example extraction o cell-id and coordinates rom an identiier when (a) D = 2, (b) D = 3 Each cell-id is decomposed into D coordinates, as illustrated in Fig. 2. Successive bits o each coordinate are taken rom non-adjacent bits in the cell-id so that when the system size L grows and the width o each coordinate consequently increases, the value o each coordinate undergoes minimal change. A cell s location within the hypercube is determined by the coordinates o its cell-id. For example, in Fig. 2a, a lea with the shown identiier has cell coordinates c = 6 ( 2 ) and c = ( 2 ). Formally, or d < D, the bit width W d o the d-axis coordinate o an identiier is given by: W d W d = (9) D The d-axis coordinate o identiier i is deined by the ollowing ormula (where the notation b n (i) indicates the value o bit n in identiier i, and bit is the LSB), which merely ormalizes the procedure illustrated in Fig. 2: d W d k = k ( i) = 2 b ( i) c () D k + d Fig. 3 shows an example two-dimensional SALAD rom the perspective o the black lea. (Communication paths not relevant to the black lea are omitted rom this igure.) We reer to a row o cells that is parallel to any one o the Cartesian axes as a vector o cells. Two -axis -axis xz = xz = xz = xz = wy = wy = wy = wy = C Fig. 3: SALAD rom black lea s perspective (D=2) D A B E (b) identiiers are d-vector-aligned i they are both in a vector o cells that runs parallel to the d-axis. This means that at least D o their coordinates match, but their d-axis coordinates might not. Formally: ad ( i, j) k [ k d ck ( i) = ck ( j) ] () Identiiers are vector-aligned i they share any vector o cells. Thus, they are d-vector-aligned or some d, like leaves A and C in Fig. 3, but unlike A and E. Formally: a( i, j) d [ ad ( i, j) ] (2) Each lea maintains a lea table o all leaves that are vector-aligned with it, and these are the only leaves with which it communicates. The expected count o leaves in each vector is λ (L/λ) /D, so the mean lea table size is T: D D D T = Dλ ( Lλ) Dλ + λ Dλ L (3) This is not very large. With L =,, λ = 3, and D = 2, the mean lea table size is about 35 entries. Ater a new ile record is generated, it makes its way through the salad by moving in one Cartesian direction per step, until ater a maximum o D hops, it reaches leaves with which it is cell-aligned. A lea perorms the same set o steps either to insert a new record o its own into the SALAD or to deal with a record it receives rom another lea: In outline, each lea determines the highest dimension d in which all o the (less-than-d)-axis coordinates o its own identiier equal those o the record s ingerprint. I d < D, then it orwards the ingerprint record along its d-axis vector to those leaves whose d-axis coordinates equal that o the record s ingerprint. Ater a maximum o D such hops, the ingerprint record will reach leaves that are cell-aligned with the ingerprint. When a lea receives a cell-aligned record, the lea stores the record in its local database, searches or records whose ingerprints match the new record s ingerprint, and notiies the appropriate machines i any matches are ound. For example, when D = 2, cells are organized into a square (a two-dimensional hypercube), and each lea has entries in its lea table or other leaves that are either in its horizontal vector or in its vertical vector. In Fig. 3, the black lea has (binary) cell-id wxyz =, and its coordinates are c = xz = and c = wy =. Thus, it knows other leaves with cell-ids wy or xz, or any values o w, x, y, and z. (The igure shows directed connections to these known leaves via black arrows.) 5

7 receive record,lò // current lea s identiier is I d = while d < D // ind highest dimension d or which all (<d)-coordinates match i c d () c d (I) send record,lò to all leaves j or which a d (I,j) Ÿ c d () = c d (j) exit procedure // orward message along d axis and exit d = d + // this lea is cell-aligned with record s ingerprint i l = I // special case: this lea generated record send record,lò to all leaves j or which c() = c(j) search local database or records matching ingerprint or each matching record,kò notiy machine l that machine k has ile with ingerprint notiy machine k that machine l has ile with ingerprint store record,lò in local database Fig. 4: Procedure or lea I to insert record (,l), locally initiated or upon receipt rom another lea When the black lea generates a record or one i its iles, there are three cases: ) I the record s cell-id equals, the lea stores the record in its own database and sends it to the one other lea in its cell, lea A. 2) I the ingerprint cell-id is wy or wy, then the -axis coordinates are equal but the -axis coordinates are not, so the black lea sends the record along its -axis (horizontal) vector to leaves whose cell-id equals wy. For example, i the ingerprint cell-id is, it is sent directly to lea B. 3) I the ingerprint cell-id is wxyz or xz, then the -axis coordinates are not equal, so the black lea sends the record along its -axis (vertical) vector to leaves whose cell-id equals xz. In this third case, i the ingerprint s -axis coordinate wy does not equal, then the recipient leaves will orward the record (horizontally, via the gray paths in the igure) to the appropriate leaves. For example, i the ingerprint cell-id is, it is sent to leaves C and D, who each orward it to lea E. Formally, the pseudo-code o Fig. 4 illustrates the procedure by which a lea with identiier I inserts a new record, lò indicating that lea l contains a ile with ingerprint into the SALAD. Adding hops to the propagation o ingerprint records increases the system s lossiness. For a two-dimensional SALAD, a record will not be stored i it is not sent to any lea on either the irst or the second hop. When the system size L is very large, nearly all records require two hops, so the loss probability approaches ( e λ ) 2 2 e λ. In general, the loss probability or a D-dimensional salad is: λ D λ Ploss = ( e ) D e (4) For example, with λ = 3 and D = 2, P loss % SALAD maintenance adding leaves There are three aspects to maintaining a SALAD: Adding new leaves, removing dead leaves, and maintaining each lea s estimate o the lea count L. This subsection describes the irst o these operations. Each lea is supposed to know all other leaves with which it is vector-aligned. Thus, when a machine is added to a SALAD as a new lea, it needs to learn o all leaves that are vector-aligned with its identiier so it can add them to its lea table, and these leaves need to add the new lea to their lea tables. The machine irst discovers one or more arbitrary leaves in the SALAD by some out-o-band means (e.g. piggybacking on DHCP []). I the machine cannot ind any extant leaves, it starts a new SALAD with itsel as a singleton lea. I the machine does ind one or more leaves in a SALAD, it sends each o them a join message, and each o these messages is orwarded along D independent pathways o the hypercube until it reaches leaves that are vector-aligned with the new lea s identiier. These vector-aligned leaves send welcome messages to the new lea, which replies with welcomeacknowledge messages. These two types o messages cause the recipient to add an entry to its lea table and to update its estimate o the system size L. To ormalize the join-message orwarding process, we generalize the concepts o cell-alignment and vectoralignment introduced in Subsections 4.2 and 4.3, respectively. Two identiiers are δ-dimensionally-aligned i they are both in the same δ-dimensional hypersquare o a SALAD. (For example, i there is any square that includes them both, they are two-dimensionally aligned, and i there is any vector that includes them both, they are one-dimensionally aligned.) This means that at least D δ o their coordinates match. Formally: ( δ a ) ( i, j) [ ( ) ( )] = δ k k c i = k ck j (5) Note that a(i,j) a () (i,j), so the term vector-aligned is a synonym or one-dimensionally aligned ; similarly, the term cell-aligned is a synonym or zerodimensionally aligned. In Fig. 3, leaves C and D are - dimensionally aligned, leaves C and E are -dimensionally aligned, and leaves B and E are 2-dimensionally aligned. 6

8 We need to deine one more term, namely the eective dimensionality ð o a SALAD. A very small SALAD will not have enough leaves or even a minimal D-dimensional hypercube to contain a mean o Λ leaves per cell. For example, when W =, there are only two cells, irrespective o the value o D, so the salad is eectively one-dimensional. In general, when W < D, W d = or W d < D. Thus, a SALAD s eective dimensionality is: ð = min(w, D) (6) The cell or the new lea is located at the intersection o ð orthogonal vectors; the goal is or all leaves in these vectors to receive the new lea s join message. The general strategy is or one lea to send out ð batches o messages, each o which is routed (by a series o hops through cells with progressively lower dimensional alignment) to one o the vectors that will contain the new lea. When a lea in one o the inal vectors receives the join, it orwards it to all leaves in the vector. In Fig. 3, i a new lea (shown as a gray circle) joins with cell-id =, and i it sends a join message to the black lea, the black lea will send a batch o one message to lea B and a batch o two messages to leaves C and D. Lea B will orward the join to all leaves in its column, and leaves C and D will each orward the join to all leaves in their row. For this to work in a stateless ashion, the lea that initiates the process must not be δ-dimensionally aligned with the new lea or any δ < ð. In Fig. 3, i lea C initially received the new lea s join message, it orwards the message to the leaves in cell ; however, these leaves will not orward the message to the other leaves in the column, because cell-aligned leaves are expecting to be at the end o the chain o orwarded join messages. Thereore, i the initially contacted lea is δ-dimensionally aligned with the new lea or some δ < ð, it orwards a batch o join messages to a cell o leaves that are (δ+)- dimensionally aligned, and this orwarding continues until the join message reaches leaves whose minimal dimensional alignment is ð, which initiates the process described in the previous paragraph. Formally, when lea I receives a join message or a new lea n rom sender s, it perorms the procedure shown in the pseudo-code o Fig. 5. In outline, the lea determines the lowest dimensionality δ or which it is δ- dimensionally-aligned with the new lea, and it determines the lowest dimensionality δ or which the sender is δ dimensionally-aligned with the new lea. (We recalculate this value at each lea rather than passing it along with the message, because dierent leaves may have dierent receive join s,nò // my identiier is I δ = // δ is my lowest dimensional alignment with new lea δ = // δ is sender s lowest dimensional alignment with new lea = // is set o all dimensions in which I and n have non-matching coordinates or all d ' d < ð // loop to calculate above values i c d (n) c d (I) δ = δ + = «{d} i (c d (n) c d (s)) δ = δ + i (s = n) // i join received directly rom new lea... δ = - // sender s dimensional alignment is considered lower than all others i δ > δ // sender has higher dimensional alignment i δ > // i I am not vector-aligned, orward down one degree o alignment or all d Œ {d ' d Œ Ÿ (d + ) mod ð œ } send join I,nÒ to all known leaves j or which c d (n) = c d (j) else i δ = // i I am vector-aligned, orward to leaves in my vector or all d Œ // = δ =, so this loop only executes once send join I,nÒ to all known leaves j or which a d (I,j) else i δ < δ // sender has lower dimensional alignment i δ < ð // i my dimensional alignment < ð, orward up one degree o alignment or one d Œ r {d ' d < ð Ÿ d œ } or one c Œ r {c ' c < 2 Wd Ÿ c c d (n)} send join I,nÒ to all known leaves j or which c d (j) = c else i δ > // unless I m vector-aligned, initiate ð batches o messages or all d Œ send join I,nÒ to all known leaves j or which c d (n) = c d (j) else // I m vector-aligned, and ð =... send join I,nÒ to all known leaves // so orward join to everyone I know i δ < 2 // I m vector aligned with new lea... send welcome message to new lea n // so welcome new lea to SALAD Fig. 5: Procedure or lea I when receiving join message rom sending lea s to insert new lea n 7

9 estimates o L.) The lea then takes steps as described above to orward the join message to leaves o greater, lesser, or identical dimensional alignment, as appropriate. There are several annoying corner cases, which are handled appositely by the pseudo-code in Fig. 5. As L, each join message sent by the new lea results in D λ messages orwarded rom the initially contacted D-dimensionally-aligned lea to the (D )- dimensionally-aligned leaves. Each o these messages spawns λ additional messages or each dimensionality rom D 2 down to. Each d-vector-aligned lea at the end o a orwarding chain then orwards the message to λ /D L /D d-vector-aligned leaves. Thus, each initially contacted lea initiates a series o M messages or each new lea, where M is: D D D M = Dλ L (7) When a vector-aligned (or cell-aligned) lea receives a join message, it sends a welcome message to the new lea. When the new lea receives a welcome message rom an extant lea not in its lea table, then i the extant lea is vector-aligned with the new lea, the new lea takes three actions: ) It adds the extant lea to its lea table; 2) it updates its estimate o the SALAD system size (see 4.2.3); and 3) it replies to the extant lea with a welcomeacknowledge message. When an extant lea receives a welcome-acknowledge message rom a new lea, it perorms the irst two o these actions but does not reply with an acknowledgement SALAD maintenance removing leaves A new lea must explicitly notiy the SALAD that it wants to join, but a lea can depart without notice, particularly i its departure is due to permanent machine ailure. Thus, the SALAD must include a mechanism or removing stale lea table entries. We employ the standard technique [e.g. 6] o sending periodic reresh messages between leaves, and each lea lushes timed-out entries in its lea table. In addition, a lea that cleanly departs the SALAD sends explicit departure messages to all o the leaves in its lea table SALAD maintenance estimating lea count SALAD leaves use an estimate o the system size L to determine an appropriate value or the cell-id width W. Since each lea knows only the leaves or which it has entries in its lea table, it has to estimate L based on the size T o its lea table. The expected relationship between T and L is given by Eq. 3, so the lea eectively inverts this equation. The actual procedure is a little more complicated, because a change to the estimated value o L can cause a change to the value o W, which in turn can cause leaves to be added to or removed rom the lea table, changing the value o T. Whenever lea I modiies its lea table, it recalculates the appropriate cell-id width W via the procedure illustrated in the pseudo-code o Fig. 6. recalculate W // W is cell-id width calculate known lea ratio r // using Eq. 8 L = T / r // estimated lea count Λ = Λ / ( + ξ) // or decreasing width, use attenuated target redundancy Ŵ = lg (L / Λ ) // target cell-id width while Ŵ < W // while target width is less than actual width... W = W // decrement width request identiiers o newly vector-aligned leaves rom neighbors recalculate r L = T / r Ŵ = lg (L / Λ ) Ŵ = lg (L / Λ) // or increasing width, use non-attenuated target redundancy while Ŵ > W // while target width is greater than actual width T = count o remaining known leaves ater increasing width W = W + // tentative cell-id width r = visible lea raction based on W // tentative known lea ratio L = T / r // tentative estimated lea count Ŵ = lg (L / Λ) // tentative target cell-id width i Ŵ < W // i tentative target width less than tentative width... exit procedure // tentative width is unstable, so exit without incrementing W W = W // increment W orget leaves that are no longer vector-aligned r = r // update actual values to relect tentative values L = L Ŵ = Ŵ Fig. 6: Procedure ollowed by lea I to recalculate cell-id width W when lea table size T changes 8

10 In outline, the lea irst calculates the ratio r o vectoraligned cells to total cells in the SALAD, based on the current value o W: D Wd 2 D + d = (8) r = W 2 Since the mean count o leaves per cell is a constant (λ), this ratio is also the expected ratio o leaves in its lea table T to the total lea count L. So rom r and T, it estimates L, and it then calculates a target cell-id width Ŵ according to Eq. I the target cell-id width Ŵ is less than the current cell-id width W, then the lea decrements W, which olds the hypercube in hal along dimension d = (W ) mod D. This increases the count o leaves with which it is d -vector-aligned or all d d. To obtain identiiers and contact inormation or these leaves, the lea requests this inormation rom a set o leaves in its lea table that should have these newly vector-aligned leaves in their lea tables. In particular, it contacts λ leaves j with which it is d-vector aligned and or which c d (I) = c d (j) when calculated according to the new (smaller) value o W but not according to the old (larger) value o W. The lea then recalculates r, L, and Ŵ based on the new values o W and T, and it repeats the above process until the cell-id width is less than or equal to the target cell-id width. I the target cell-id width Ŵ is greater than the current cell-id width W, the procedure is similar to the above in reverse; however, we have to be careul to prohibit an unstable state. Since increasing the cell-id width causes leaves to be orgotten, it is possible that this will result in a smaller lea table size T that yields a smaller system size estimate L that in turn produces a target cell-id width Ŵ that is less than the newly increased cell-id width W. Thereore, the lea irst calculates the lea table size T that would result i the cell-id width were to be incremented. It then calculates tentative values W, r, L, and Ŵ based on the tentative lea table size T. I the tentative target cell-id width Ŵ is less than the tentative new cell-id width W, then the procedure exits instead o transiting to an unstable state. Otherwise, it increments W; orgets about leaves with which it is no longer vector-aligned; updates r, L, and Ŵ to relect the new values o W and T; and repeats the above process until the cell-id width is greater than or equal to the target cell-id width. The above procedure admits rapid luctuation when the lea table size vacillates around a threshold that separates two stable values o W. We prevent this luctuation with hysteresis: When considering a decrease to the cell-id width W, we calculate the target cell-id width Ŵ using an attenuated target redundancy actor Λ : Λ = Λ +ξ (9) This is shown in the pseudo-code o Fig. 6. The damping actor ξ determines the spread between an L large enough to increase W and an L small enough to decrease W. The damping actor ξ should be relatively small, since it allows a reduction in the actual redundancy actor o the SALAD SALAD attack resilience I SALAD were designed such that its leaves cooperatively determine the ranges o ingerprints that each lea stores, it might be possible or a set o malicious leaves to launch a targeted attack against a particular range o values, by arranging or themselves to be the designated record stores or this range. However, because o the SALAD s purely statistical construction, such an attack is greatly limited: Each lea determines its ingerprint range independently rom the ranges o all other leaves, so the most damage a malicious lea can do is to decrease the overall redundancy o the system. For D >, it is possible to target an attack, but only in a airly weak way. By choosing their own identiiers to be vector-aligned with a victim lea, a set o m malicious leaves can increase the size o the victim s lea table, thereby increasing its system size estimate L, which increases the lea s lossiness as described at the end o Subsection 4.2. The eective redundancy actor λ or the victim lea s records will be: m λ = λ (2) L Thus, not only does increasing a SALAD s dimensionality increase the loss probability or a given redundancy actor (Eq. 4), but also it increases the susceptibility o the system to attack. We thereore suggest constructing a SALAD with a dimensionality no higher than that needed to achieve lea tables o a manageably small size. 5. Simulation evaluation Since the current implementation o Farsite is not complete or stable enough to run on a corporate network, we evaluated the DFC subsystem via large-scale simulation using ile content data collected rom 585 desktop ile systems. We distributed a scanning program to a randomly selected set o Microsot employees and asked them to scan their desktop machines. The program computed a 36-byte cryptographically strong hash o each 64-Kbyte block o all iles on their systems, and it recorded these hashes along with ile sizes and other attributes. The scanned systems contain,54,5 iles in 73,87 directories, totaling 685 GB o ile data. There were 4,6,748 distinct ile contents totaling 368 GB o ile data, implying that coalescing duplicates could ideally reclaim up to 46% o all consumed space. D 9

11 consumed space (GB) messages / machine (thousands) K 32K 256K 2M 6M 28M G K 32K 256K 2M 6M 28M G m inim um ile s ize or coale scing (byte s ) m inim um ile s ize or coale scing (byte s ) ideal Λ =.5 Λ = 2 Λ = 2.5 Fig. 7: Consumed space vs. minimum ile size We ran a two-dimensional DFC system on 585 simulated machines, each o which held content rom one o the scanned desktop ile systems. The SALAD was initialized with a single lea, and the remaining 584 machines were each added to the SALAD by the procedure outlined in Subsection 4.4. We recorded the sizes o each machine s lea table and ingerprint database, as well as the number o messages exchanged. By setting a threshold on the minimum ile size eligible or coalescing, we can substantially reduce the message traic and ingerprint database sizes. Fig. 7 shows the consumed space in the system versus this minimum size. The eect on space consumption is negligible or thresholds below 4 Kbytes. This igure also shows that a target redundancy actor o Λ = 2.5 achieves nearly all possible space reclamation. We tested the resilience o the DFC system to machine ailure by randomly ailing the simulated machines. Fig. 8 shows the consumed space versus machine ailure probability. With Λ = 2.5, even when machines ail hal o the time, the system can still reclaim 38% o used space, comparing avorably to the optimal value o 46%. Λ =.5 Λ = 2 Λ = 2.5 Fig. 9: Message count vs. minimum ile size Fig. 9 shows the mean count o messages each machine sent and received versus the minimum ile-size threshold. By setting this threshold to 4 Kbytes, the mean message count is cut in hal without measurably reducing the eectiveness o the system (c. Fig. 7). Fig. shows the cumulative distribution o machines by count o messages sent and received, with no minimum ile-size threshold. The coeicients o variation [2] CoV Λ or these curves are CoV.5 =.64, CoV 2. =.39, and CoV 2.5 =.39. Combined with the smooth shape o the curves, these values imply that the machines share the communication load relatively evenly, especially as Λ is increased. Fig. shows the mean database size on each machine versus the minimum ile-size threshold. As with the message count, setting this threshold to 4 Kbytes halves the mean database size. Fig. 2 shows the cumulative distribution o machines by database size, with no minimum ile-size threshold. The coeicients o variation are CoV.5 =.28, CoV 2. =.3, and CoV 2.5 = 2.4â 5, which are airly small. However, all three curves show bimodal distributions: For Λ =.5 and Λ = 2., most machines store ewer than 4, records but a substantial minority store nearly 8, records. For Λ = 2.5, the consumed space (GB) m achine ailure probabilty cumulative requency m essage count (thousands) Λ =.5 Λ = 2 Λ = 2.5 Λ =.5 Λ = 2 Λ = 2.5 Fig. 8: Consumed space vs. machine ailure rate Fig. : CDF o machines by message count

12 mean database size (records) K 32K 256K 2M 6M 28M G consumed space (GB) m inim um ile s ize or coale scing (byte s ) database size lim it (records) Λ =.5 Λ = 2 Λ = 2.5 Λ =.5 Λ = 2 Λ = 2.5 Fig. : Database size vs. minimum ile size reverse is true. This suggests that the dierences in storage load among machines is due primarily to slight variations in machines estimates o L, iltered through the step discontinuity in the calculation o W (c. Eq. 6), rather than due to non-uniormity in the distribution o machine identiiers or ile content ingerprints. The substantial skew in Fig. 2 reveals an opportunity or reducing the record storage load in the system. We ran a set o experiments in which we limited the database size on each machine. When a machine receives a record that it should store, i its database size limit has been reached, it discards a record in the database with the lowest ingerprint value (corresponding to the smallest ile) and replaces it with the newly received record. I no record in the database has a lower ingerprint value than the new record, the machine discards the new record (ater orwarding it i appropriate). Fig. 3 shows the consumed space versus the database size limit. A limit o 4, records makes no measurable dierence in the consumed space or any Λ. For Λ = 2.5, even with a limit o 8 records (an order o magnitude smaller than the mean database size), the system can still reclaim 38% o used space, compared to the optimum o 46%. Fig. 3: Consumed space vs. database size limit For our inal experiment, we started with a singleton SALAD and incrementally increased the system size up to, simulated machines. Fig. 4 shows the mean lea table size versus system size. The square-root relationship predicted by Eq. 3 is evident in these curves, as is a periodic variation due to the discretization o W. Fig. 5 shows the cumulative distribution o machines by lea table size or L = 585 (the system size or our earlier experiments) and or L =, (the system size or our last experiment). When Λ =.5, the lossiness is so high that a signiicant (i small) raction o machines have nearly empty lea tables. For larger values o Λ, the curves show that there is reasonably close agreement among machines about the estimated system size, except in the case where Λ = 2.5 and L =,. In this case, lg(l/λ) is very close to an integer, so slight variations in machines estimates o L cause Eq. 6 to produce dierent cell-id widths or dierent machines, yielding this bimodal distribution. cumulative requency mean lea table size (entries) database size (records ) system size (m achines ) Λ =.5 Λ = 2 Λ = 2.5 Λ =.5 Λ = 2 Λ = 2.5 Fig. 2: CDF o machines by database size Fig. 4: Lea table size vs. system size

13 cumulative requency cumulative requency lea table size (entries) lea table size (entries) Λ =.5 Λ = 2 Λ = 2.5 Λ =.5 Λ = 2 Λ = 2.5 (a) Fig. 5: CDF o machines by lea table size or (a) L = 585, (b) L =, (b) 6. Related work To our knowledge, coalescing o identical iles is not perormed by any distributed storage system other than Farsite. The resulting increase in available space could beneit server-based distributed ile systems such as AFS [2] and Ficus [9], serverless distributed ile systems such as xfs [2] and Frangipani [37], content publishing systems such as Publius [38] and Freenet [2], and archival storage systems such as Intermemory [8]. Windows 2 [34] has a Single-Instance Store [7] that coalesces identical iles within a local ile system. LBFS [28] identiies identical portions o dierent iles to reduce network bandwidth rather than storage usage. Convergent encryption deliberately leaks inormation. Other research has studied unintentional leaks through side channels [22] such as computational timing [23], measured power consumption [24], or response to injected aults [5]. Like convergent encryption, BEAR [3] derives an encryption key rom a partial plaintext hash. Song et al. [35] developed techniques or searching encrypted data. SALAD has similarities to the distributed indexing systems Chord [36], Pastry [3], and Tapestry [4], all o which are based on Plaxton trees [29]. These systems use O(log n)-sized neighbor tables to route inormation to the appropriate node in O(log n) hops. Also similar is CAN [3], which uses O(d)-sized neighbor tables to route inormation to nodes in O(d n /d ) hops. SALAD complements these approaches by using O(d n /d )-sized neighbor tables to route in O(d) hops. These other systems are not lossy, but they appear less immune to targeted attack than SALAD is. SALAD s conigurable lossiness is similar to that o a Bloom ilter [6], although it yields alse negatives rather than alse positives. Farsite relocates identical iles to the same machines so their contents may be coalesced. Other research on ile relocation has been to balance the load o ile access [9, 39] to migrate replicas near points o high usage [, 7, 25, 33], or to improve ile availability [4, 26]. 7. Summary Farsite is a distributed ile system that provides security and reliability by storing encrypted replicas o each ile on multiple desktop machines. To ree space or storing these replicas, the system coalesces incidentally duplicated iles, such as shared documents among workgroups or multiple users copies o common application programs. This involves a cryptosystem that enables identical iles to be coalesced even i encrypted with dierent keys, a scalable distributed database to identiy identical iles, a ile-relocation system that co-locates identical iles on the same machines, and a single-instance store that coalesces identical iles while retaining separate-ile semantics. Simulation using ile content data rom 585 desktop ile systems shows that the duplicate-ile coalescing system is scalable, highly eective, and ault-tolerant. 2

Reclaiming Space from Duplicate Files in a Serverless Distributed File System

Reclaiming Space from Duplicate Files in a Serverless Distributed File System Reclaiming Space rom Duplicate Files in a Serverless Distributed File System John R. Douceur, Atul Adya, William J. Bolosky, Dan Simon, Marvin Theimer Microsot Research {johndo, adya, bolosky, dansimon,

More information

Neighbourhood Operations

Neighbourhood Operations Neighbourhood Operations Neighbourhood operations simply operate on a larger neighbourhood o piels than point operations Origin Neighbourhoods are mostly a rectangle around a central piel Any size rectangle

More information

March 10, Distributed Hash-based Lookup. for Peer-to-Peer Systems. Sandeep Shelke Shrirang Shirodkar MTech I CSE

March 10, Distributed Hash-based Lookup. for Peer-to-Peer Systems. Sandeep Shelke Shrirang Shirodkar MTech I CSE for for March 10, 2006 Agenda for Peer-to-Peer Sytems Initial approaches to Their Limitations CAN - Applications of CAN Design Details Benefits for Distributed and a decentralized architecture No centralized

More information

Message authentication

Message authentication Message authentication -- Reminder on hash unctions -- MAC unctions hash based block cipher based -- Digital signatures (c) Levente Buttyán (buttyan@crysys.hu) Hash unctions a hash unction is a unction

More information

Computer Data Analysis and Use of Excel

Computer Data Analysis and Use of Excel Computer Data Analysis and Use o Excel I. Theory In this lab we will use Microsot EXCEL to do our calculations and error analysis. This program was written primarily or use by the business community, so

More information

DISTRIBUTED COMPUTER SYSTEMS ARCHITECTURES

DISTRIBUTED COMPUTER SYSTEMS ARCHITECTURES DISTRIBUTED COMPUTER SYSTEMS ARCHITECTURES Dr. Jack Lange Computer Science Department University of Pittsburgh Fall 2015 Outline System Architectural Design Issues Centralized Architectures Application

More information

MATRIX ALGORITHM OF SOLVING GRAPH CUTTING PROBLEM

MATRIX ALGORITHM OF SOLVING GRAPH CUTTING PROBLEM UDC 681.3.06 MATRIX ALGORITHM OF SOLVING GRAPH CUTTING PROBLEM V.K. Pogrebnoy TPU Institute «Cybernetic centre» E-mail: vk@ad.cctpu.edu.ru Matrix algorithm o solving graph cutting problem has been suggested.

More information

Measuring the Vulnerability of Interconnection. Networks in Embedded Systems. University of Massachusetts, Amherst, MA 01003

Measuring the Vulnerability of Interconnection. Networks in Embedded Systems. University of Massachusetts, Amherst, MA 01003 Measuring the Vulnerability o Interconnection Networks in Embedded Systems V. Lakamraju, Z. Koren, I. Koren, and C. M. Krishna Department o Electrical and Computer Engineering University o Massachusetts,

More information

ECE 6560 Multirate Signal Processing Chapter 8

ECE 6560 Multirate Signal Processing Chapter 8 Multirate Signal Processing Chapter 8 Dr. Bradley J. Bazuin Western Michigan University College o Engineering and Applied Sciences Department o Electrical and Computer Engineering 903 W. Michigan Ave.

More information

Computer Data Analysis and Plotting

Computer Data Analysis and Plotting Phys 122 February 6, 2006 quark%//~bland/docs/manuals/ph122/pcintro/pcintro.doc Computer Data Analysis and Plotting In this lab we will use Microsot EXCEL to do our calculations. This program has been

More information

Chapter 3 Image Enhancement in the Spatial Domain

Chapter 3 Image Enhancement in the Spatial Domain Chapter 3 Image Enhancement in the Spatial Domain Yinghua He School o Computer Science and Technology Tianjin University Image enhancement approaches Spatial domain image plane itsel Spatial domain methods

More information

MAPI Computer Vision. Multiple View Geometry

MAPI Computer Vision. Multiple View Geometry MAPI Computer Vision Multiple View Geometry Geometry o Multiple Views 2- and 3- view geometry p p Kpˆ [ K R t]p Geometry o Multiple Views 2- and 3- view geometry Epipolar Geometry The epipolar geometry

More information

Larger K-maps. So far we have only discussed 2 and 3-variable K-maps. We can now create a 4-variable map in the

Larger K-maps. So far we have only discussed 2 and 3-variable K-maps. We can now create a 4-variable map in the EET 3 Chapter 3 7/3/2 PAGE - 23 Larger K-maps The -variable K-map So ar we have only discussed 2 and 3-variable K-maps. We can now create a -variable map in the same way that we created the 3-variable

More information

Method estimating reflection coefficients of adaptive lattice filter and its application to system identification

Method estimating reflection coefficients of adaptive lattice filter and its application to system identification Acoust. Sci. & Tech. 28, 2 (27) PAPER #27 The Acoustical Society o Japan Method estimating relection coeicients o adaptive lattice ilter and its application to system identiication Kensaku Fujii 1;, Masaaki

More information

9.8 Graphing Rational Functions

9.8 Graphing Rational Functions 9. Graphing Rational Functions Lets begin with a deinition. Deinition: Rational Function A rational unction is a unction o the orm P where P and Q are polynomials. Q An eample o a simple rational unction

More information

Skill Sets Chapter 5 Functions

Skill Sets Chapter 5 Functions Skill Sets Chapter 5 Functions No. Skills Examples o questions involving the skills. Sketch the graph o the (Lecture Notes Example (b)) unction according to the g : x x x, domain. x, x - Students tend

More information

The Design and Implementation of a Next Generation Name Service for the Internet (CoDoNS) Presented By: Kamalakar Kambhatla

The Design and Implementation of a Next Generation Name Service for the Internet (CoDoNS) Presented By: Kamalakar Kambhatla The Design and Implementation of a Next Generation Name Service for the Internet (CoDoNS) Venugopalan Ramasubramanian Emin Gün Sirer Presented By: Kamalakar Kambhatla * Slides adapted from the paper -

More information

3 No-Wait Job Shops with Variable Processing Times

3 No-Wait Job Shops with Variable Processing Times 3 No-Wait Job Shops with Variable Processing Times In this chapter we assume that, on top of the classical no-wait job shop setting, we are given a set of processing times for each operation. We may select

More information

1 Introduction 2. 2 A Simple Algorithm 2. 3 A Fast Algorithm 2

1 Introduction 2. 2 A Simple Algorithm 2. 3 A Fast Algorithm 2 Polyline Reduction David Eberly, Geometric Tools, Redmond WA 98052 https://www.geometrictools.com/ This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy

More information

A Survey of Peer-to-Peer Content Distribution Technologies

A Survey of Peer-to-Peer Content Distribution Technologies A Survey of Peer-to-Peer Content Distribution Technologies Stephanos Androutsellis-Theotokis and Diomidis Spinellis ACM Computing Surveys, December 2004 Presenter: Seung-hwan Baek Ja-eun Choi Outline Overview

More information

A Correctness Proof for a Practical Byzantine-Fault-Tolerant Replication Algorithm

A Correctness Proof for a Practical Byzantine-Fault-Tolerant Replication Algorithm Appears as Technical Memo MIT/LCS/TM-590, MIT Laboratory for Computer Science, June 1999 A Correctness Proof for a Practical Byzantine-Fault-Tolerant Replication Algorithm Miguel Castro and Barbara Liskov

More information

Piecewise polynomial interpolation

Piecewise polynomial interpolation Chapter 2 Piecewise polynomial interpolation In ection.6., and in Lab, we learned that it is not a good idea to interpolate unctions by a highorder polynomials at equally spaced points. However, it transpires

More information

AN ad hoc network is a collection of mobile nodes

AN ad hoc network is a collection of mobile nodes 96 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 19, NO. 7, JULY 008 A Sel-Stabilizing Leader Election Algorithm in Highly Dynamic Ad Hoc Mobile Networks Abdelouahid Derhab and Nadjib Badache

More information

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University CS 555: DISTRIBUTED SYSTEMS [P2P SYSTEMS] Shrideep Pallickara Computer Science Colorado State University Frequently asked questions from the previous class survey Byzantine failures vs malicious nodes

More information

CS 161: Design and Analysis of Algorithms

CS 161: Design and Analysis of Algorithms CS 161: Design and Analysis o Algorithms Announcements Homework 3, problem 3 removed Greedy Algorithms 4: Human Encoding/Set Cover Human Encoding Set Cover Alphabets and Strings Alphabet = inite set o

More information

Global Constraints. Combinatorial Problem Solving (CPS) Enric Rodríguez-Carbonell (based on materials by Javier Larrosa) February 22, 2019

Global Constraints. Combinatorial Problem Solving (CPS) Enric Rodríguez-Carbonell (based on materials by Javier Larrosa) February 22, 2019 Global Constraints Combinatorial Problem Solving (CPS) Enric Rodríguez-Carbonell (based on materials by Javier Larrosa) February 22, 2019 Global Constraints Global constraints are classes o constraints

More information

Introduction to Peer-to-Peer Systems

Introduction to Peer-to-Peer Systems Introduction Introduction to Peer-to-Peer Systems Peer-to-peer (PP) systems have become extremely popular and contribute to vast amounts of Internet traffic PP basic definition: A PP system is a distributed

More information

1 The range query problem

1 The range query problem CS268: Geometric Algorithms Handout #12 Design and Analysis Original Handout #12 Stanford University Thursday, 19 May 1994 Original Lecture #12: Thursday, May 19, 1994 Topics: Range Searching with Partition

More information

Using a Projected Subgradient Method to Solve a Constrained Optimization Problem for Separating an Arbitrary Set of Points into Uniform Segments

Using a Projected Subgradient Method to Solve a Constrained Optimization Problem for Separating an Arbitrary Set of Points into Uniform Segments Using a Projected Subgradient Method to Solve a Constrained Optimization Problem or Separating an Arbitrary Set o Points into Uniorm Segments Michael Johnson May 31, 2011 1 Background Inormation The Airborne

More information

Squid: Enabling search in DHT-based systems

Squid: Enabling search in DHT-based systems J. Parallel Distrib. Comput. 68 (2008) 962 975 Contents lists available at ScienceDirect J. Parallel Distrib. Comput. journal homepage: www.elsevier.com/locate/jpdc Squid: Enabling search in DHT-based

More information

Differential Cryptanalysis

Differential Cryptanalysis Differential Cryptanalysis See: Biham and Shamir, Differential Cryptanalysis of the Data Encryption Standard, Springer Verlag, 1993. c Eli Biham - March, 28 th, 2012 1 Differential Cryptanalysis The Data

More information

Classifier Evasion: Models and Open Problems

Classifier Evasion: Models and Open Problems Classiier Evasion: Models and Open Problems Blaine Nelson 1, Benjamin I. P. Rubinstein 2, Ling Huang 3, Anthony D. Joseph 1,3, and J. D. Tygar 1 1 UC Berkeley 2 Microsot Research 3 Intel Labs Berkeley

More information

SUPER RESOLUTION IMAGE BY EDGE-CONSTRAINED CURVE FITTING IN THE THRESHOLD DECOMPOSITION DOMAIN

SUPER RESOLUTION IMAGE BY EDGE-CONSTRAINED CURVE FITTING IN THE THRESHOLD DECOMPOSITION DOMAIN SUPER RESOLUTION IMAGE BY EDGE-CONSTRAINED CURVE FITTING IN THE THRESHOLD DECOMPOSITION DOMAIN Tsz Chun Ho and Bing Zeng Department o Electronic and Computer Engineering The Hong Kong University o Science

More information

Secure Multiparty Computation

Secure Multiparty Computation CS573 Data Privacy and Security Secure Multiparty Computation Problem and security definitions Li Xiong Outline Cryptographic primitives Symmetric Encryption Public Key Encryption Secure Multiparty Computation

More information

SMD149 - Operating Systems - File systems

SMD149 - Operating Systems - File systems SMD149 - Operating Systems - File systems Roland Parviainen November 21, 2005 1 / 59 Outline Overview Files, directories Data integrity Transaction based file systems 2 / 59 Files Overview Named collection

More information

Peer-to-Peer Systems. Network Science: Introduction. P2P History: P2P History: 1999 today

Peer-to-Peer Systems. Network Science: Introduction. P2P History: P2P History: 1999 today Network Science: Peer-to-Peer Systems Ozalp Babaoglu Dipartimento di Informatica Scienza e Ingegneria Università di Bologna www.cs.unibo.it/babaoglu/ Introduction Peer-to-peer (PP) systems have become

More information

A Novel Accurate Genetic Algorithm for Multivariable Systems

A Novel Accurate Genetic Algorithm for Multivariable Systems World Applied Sciences Journal 5 (): 137-14, 008 ISSN 1818-495 IDOSI Publications, 008 A Novel Accurate Genetic Algorithm or Multivariable Systems Abdorreza Alavi Gharahbagh and Vahid Abolghasemi Department

More information

08 Distributed Hash Tables

08 Distributed Hash Tables 08 Distributed Hash Tables 2/59 Chord Lookup Algorithm Properties Interface: lookup(key) IP address Efficient: O(log N) messages per lookup N is the total number of servers Scalable: O(log N) state per

More information

Random Oracles - OAEP

Random Oracles - OAEP Random Oracles - OAEP Anatoliy Gliberman, Dmitry Zontov, Patrick Nordahl September 23, 2004 Reading Overview There are two papers presented this week. The first paper, Random Oracles are Practical: A Paradigm

More information

Peer-to-Peer Systems. Chapter General Characteristics

Peer-to-Peer Systems. Chapter General Characteristics Chapter 2 Peer-to-Peer Systems Abstract In this chapter, a basic overview is given of P2P systems, architectures, and search strategies in P2P systems. More specific concepts that are outlined include

More information

Reflection and Refraction

Reflection and Refraction Relection and Reraction Object To determine ocal lengths o lenses and mirrors and to determine the index o reraction o glass. Apparatus Lenses, optical bench, mirrors, light source, screen, plastic or

More information

Scaling Out Key-Value Storage

Scaling Out Key-Value Storage Scaling Out Key-Value Storage COS 418: Distributed Systems Logan Stafman [Adapted from K. Jamieson, M. Freedman, B. Karp] Horizontal or vertical scalability? Vertical Scaling Horizontal Scaling 2 Horizontal

More information

Lecture 6: Overlay Networks. CS 598: Advanced Internetworking Matthew Caesar February 15, 2011

Lecture 6: Overlay Networks. CS 598: Advanced Internetworking Matthew Caesar February 15, 2011 Lecture 6: Overlay Networks CS 598: Advanced Internetworking Matthew Caesar February 15, 2011 1 Overlay networks: Motivations Protocol changes in the network happen very slowly Why? Internet is shared

More information

Horizontal or vertical scalability? Horizontal scaling is challenging. Today. Scaling Out Key-Value Storage

Horizontal or vertical scalability? Horizontal scaling is challenging. Today. Scaling Out Key-Value Storage Horizontal or vertical scalability? Scaling Out Key-Value Storage COS 418: Distributed Systems Lecture 8 Kyle Jamieson Vertical Scaling Horizontal Scaling [Selected content adapted from M. Freedman, B.

More information

Database Architectures

Database Architectures Database Architectures CPS352: Database Systems Simon Miner Gordon College Last Revised: 11/15/12 Agenda Check-in Centralized and Client-Server Models Parallelism Distributed Databases Homework 6 Check-in

More information

THIN LENSES: BASICS. There are at least three commonly used symbols for object and image distances:

THIN LENSES: BASICS. There are at least three commonly used symbols for object and image distances: THN LENSES: BASCS BJECTVE: To study and veriy some o the laws o optics applicable to thin lenses by determining the ocal lengths o three such lenses ( two convex, one concave) by several methods. THERY:

More information

Introduction. Secret Key Cryptography. Outline. Secrets? (Cont d) Secret Keys or Secret Algorithms? Introductory Remarks Feistel Cipher DES AES

Introduction. Secret Key Cryptography. Outline. Secrets? (Cont d) Secret Keys or Secret Algorithms? Introductory Remarks Feistel Cipher DES AES Outline CSCI 454/554 Computer and Network Security Introductory Remarks Feistel Cipher DES AES Topic 3.1 Secret Key Cryptography Algorithms 2 Secret Keys or Secret Algorithms? Introduction Security by

More information

A STUDY ON COEXISTENCE OF WLAN AND WPAN USING A PAN COORDINATOR WITH AN ARRAY ANTENNA

A STUDY ON COEXISTENCE OF WLAN AND WPAN USING A PAN COORDINATOR WITH AN ARRAY ANTENNA A STUDY ON COEXISTENCE OF WLAN AND WPAN USING A PAN COORDINATOR WITH AN ARRAY ANTENNA Yuta NAKAO(Graduate School o Engineering, Division o Physics, Electrical and Computer Engineering, Yokohama National

More information

FARSITE: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

FARSITE: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment Appears in 5th Symposium on Operating Systems Design and Implementation (OSDI 2002), Boston, MA, December 2002 FARSITE: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

More information

Digital Image Processing. Image Enhancement in the Spatial Domain (Chapter 4)

Digital Image Processing. Image Enhancement in the Spatial Domain (Chapter 4) Digital Image Processing Image Enhancement in the Spatial Domain (Chapter 4) Objective The principal objective o enhancement is to process an images so that the result is more suitable than the original

More information

Formal Expression of BBc-1 Mechanism and Its Security Analysis

Formal Expression of BBc-1 Mechanism and Its Security Analysis Formal Expression of BBc-1 Mechanism and Its Security Analysis Jun KURIHARA and Takeshi KUBO kurihara@ieee.org t-kubo@zettant.com October 31, 2017 1 Introduction Bitcoin and its core database/ledger technology

More information

Searching for Shared Resources: DHT in General

Searching for Shared Resources: DHT in General 1 ELT-53206 Peer-to-Peer Networks Searching for Shared Resources: DHT in General Mathieu Devos Tampere University of Technology Department of Electronics and Communications Engineering Based on the original

More information

Searching for Shared Resources: DHT in General

Searching for Shared Resources: DHT in General 1 ELT-53207 P2P & IoT Systems Searching for Shared Resources: DHT in General Mathieu Devos Tampere University of Technology Department of Electronics and Communications Engineering Based on the original

More information

Chapter 12: Query Processing

Chapter 12: Query Processing Chapter 12: Query Processing Overview Catalog Information for Cost Estimation $ Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Transformation

More information

Content Overlays. Nick Feamster CS 7260 March 12, 2007

Content Overlays. Nick Feamster CS 7260 March 12, 2007 Content Overlays Nick Feamster CS 7260 March 12, 2007 Content Overlays Distributed content storage and retrieval Two primary approaches: Structured overlay Unstructured overlay Today s paper: Chord Not

More information

Pentagon contact representations

Pentagon contact representations Pentagon contact representations Stean Felsner Institut ür Mathematik Technische Universität Berlin Hendrik Schrezenmaier Institut ür Mathematik Technische Universität Berlin August, 017 Raphael Steiner

More information

Binary recursion. Unate functions. If a cover C(f) is unate in xj, x, then f is unate in xj. x

Binary recursion. Unate functions. If a cover C(f) is unate in xj, x, then f is unate in xj. x Binary recursion Unate unctions! Theorem I a cover C() is unate in,, then is unate in.! Theorem I is unate in,, then every prime implicant o is unate in. Why are unate unctions so special?! Special Boolean

More information

Question Bank Subject: Advanced Data Structures Class: SE Computer

Question Bank Subject: Advanced Data Structures Class: SE Computer Question Bank Subject: Advanced Data Structures Class: SE Computer Question1: Write a non recursive pseudo code for post order traversal of binary tree Answer: Pseudo Code: 1. Push root into Stack_One.

More information

Chapter 12: Indexing and Hashing. Basic Concepts

Chapter 12: Indexing and Hashing. Basic Concepts Chapter 12: Indexing and Hashing! Basic Concepts! Ordered Indices! B+-Tree Index Files! B-Tree Index Files! Static Hashing! Dynamic Hashing! Comparison of Ordered Indexing and Hashing! Index Definition

More information

THE FINANCIAL CALCULATOR

THE FINANCIAL CALCULATOR Starter Kit CHAPTER 3 Stalla Seminars THE FINANCIAL CALCULATOR In accordance with the AIMR calculator policy in eect at the time o this writing, CFA candidates are permitted to use one o two approved calculators

More information

The Encoding Complexity of Network Coding

The Encoding Complexity of Network Coding The Encoding Complexity of Network Coding Michael Langberg Alexander Sprintson Jehoshua Bruck California Institute of Technology Email: mikel,spalex,bruck @caltech.edu Abstract In the multicast network

More information

10. SOPC Builder Component Development Walkthrough

10. SOPC Builder Component Development Walkthrough 10. SOPC Builder Component Development Walkthrough QII54007-9.0.0 Introduction This chapter describes the parts o a custom SOPC Builder component and guides you through the process o creating an example

More information

Basic Concepts and Definitions. CSC/ECE 574 Computer and Network Security. Outline

Basic Concepts and Definitions. CSC/ECE 574 Computer and Network Security. Outline CSC/ECE 574 Computer and Network Security Topic 2. Introduction to Cryptography 1 Outline Basic Crypto Concepts and Definitions Some Early (Breakable) Cryptosystems Key Issues 2 Basic Concepts and Definitions

More information

Fig. 3.1: Interpolation schemes for forward mapping (left) and inverse mapping (right, Jähne, 1997).

Fig. 3.1: Interpolation schemes for forward mapping (left) and inverse mapping (right, Jähne, 1997). Eicken, GEOS 69 - Geoscience Image Processing Applications, Lecture Notes - 17-3. Spatial transorms 3.1. Geometric operations (Reading: Castleman, 1996, pp. 115-138) - a geometric operation is deined as

More information

Secure Data Deduplication with Dynamic Ownership Management in Cloud Storage

Secure Data Deduplication with Dynamic Ownership Management in Cloud Storage Secure Data Deduplication with Dynamic Ownership Management in Cloud Storage Dr.S.Masood Ahamed 1, N.Mounika 2, N.vasavi 3, M.Vinitha Reddy 4 HOD, Department of Computer Science & Engineering,, Guru Nanak

More information

Systems Infrastructure for Data Science. Web Science Group Uni Freiburg WS 2014/15

Systems Infrastructure for Data Science. Web Science Group Uni Freiburg WS 2014/15 Systems Infrastructure for Data Science Web Science Group Uni Freiburg WS 2014/15 Lecture II: Indexing Part I of this course Indexing 3 Database File Organization and Indexing Remember: Database tables

More information

Optics Quiz #2 April 30, 2014

Optics Quiz #2 April 30, 2014 .71 - Optics Quiz # April 3, 14 Problem 1. Billet s Split Lens Setup I the ield L(x) is placed against a lens with ocal length and pupil unction P(x), the ield (X) on the X-axis placed a distance behind

More information

Client Server & Distributed System. A Basic Introduction

Client Server & Distributed System. A Basic Introduction Client Server & Distributed System A Basic Introduction 1 Client Server Architecture A network architecture in which each computer or process on the network is either a client or a server. Source: http://webopedia.lycos.com

More information

Edge and local feature detection - 2. Importance of edge detection in computer vision

Edge and local feature detection - 2. Importance of edge detection in computer vision Edge and local feature detection Gradient based edge detection Edge detection by function fitting Second derivative edge detectors Edge linking and the construction of the chain graph Edge and local feature

More information

Chapter 12: Indexing and Hashing

Chapter 12: Indexing and Hashing Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B+-Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition in SQL

More information

Boundary control : Access Controls: An access control mechanism processes users request for resources in three steps: Identification:

Boundary control : Access Controls: An access control mechanism processes users request for resources in three steps: Identification: Application control : Boundary control : Access Controls: These controls restrict use of computer system resources to authorized users, limit the actions authorized users can taker with these resources,

More information

Scheduling in Multihop Wireless Networks without Back-pressure

Scheduling in Multihop Wireless Networks without Back-pressure Forty-Eighth Annual Allerton Conerence Allerton House, UIUC, Illinois, USA September 29 - October 1, 2010 Scheduling in Multihop Wireless Networks without Back-pressure Shihuan Liu, Eylem Ekici, and Lei

More information

TELCOM2125: Network Science and Analysis

TELCOM2125: Network Science and Analysis School of Information Sciences University of Pittsburgh TELCOM2125: Network Science and Analysis Konstantinos Pelechrinis Spring 2015 2 Part 4: Dividing Networks into Clusters The problem l Graph partitioning

More information

Distributed Systems. 17. Distributed Lookup. Paul Krzyzanowski. Rutgers University. Fall 2016

Distributed Systems. 17. Distributed Lookup. Paul Krzyzanowski. Rutgers University. Fall 2016 Distributed Systems 17. Distributed Lookup Paul Krzyzanowski Rutgers University Fall 2016 1 Distributed Lookup Look up (key, value) Cooperating set of nodes Ideally: No central coordinator Some nodes can

More information

Research on Image Splicing Based on Weighted POISSON Fusion

Research on Image Splicing Based on Weighted POISSON Fusion Research on Image Splicing Based on Weighted POISSO Fusion Dan Li, Ling Yuan*, Song Hu, Zeqi Wang School o Computer Science & Technology HuaZhong University o Science & Technology Wuhan, 430074, China

More information

Today. Why might P2P be a win? What is a Peer-to-Peer (P2P) system? Peer-to-Peer Systems and Distributed Hash Tables

Today. Why might P2P be a win? What is a Peer-to-Peer (P2P) system? Peer-to-Peer Systems and Distributed Hash Tables Peer-to-Peer Systems and Distributed Hash Tables COS 418: Distributed Systems Lecture 7 Today 1. Peer-to-Peer Systems Napster, Gnutella, BitTorrent, challenges 2. Distributed Hash Tables 3. The Chord Lookup

More information

Clustering Using Graph Connectivity

Clustering Using Graph Connectivity Clustering Using Graph Connectivity Patrick Williams June 3, 010 1 Introduction It is often desirable to group elements of a set into disjoint subsets, based on the similarity between the elements in the

More information

Encrypted Data Deduplication in Cloud Storage

Encrypted Data Deduplication in Cloud Storage Encrypted Data Deduplication in Cloud Storage Chun- I Fan, Shi- Yuan Huang, Wen- Che Hsu Department of Computer Science and Engineering Na>onal Sun Yat- sen University Kaohsiung, Taiwan AsiaJCIS 2015 Outline

More information

Peer to Peer Networks

Peer to Peer Networks Sungkyunkwan University Peer to Peer Networks Prepared by T. Le-Duc and H. Choo Copyright 2000-2017 Networking Laboratory Presentation Outline 2.1 Introduction 2.2 Client-Server Paradigm 2.3 Peer-To-Peer

More information

5.2 Properties of Rational functions

5.2 Properties of Rational functions 5. Properties o Rational unctions A rational unction is a unction o the orm n n1 polynomial p an an 1 a1 a0 k k1 polynomial q bk bk 1 b1 b0 Eample 3 5 1 The domain o a rational unction is the set o all

More information

PATH PLANNING OF UNMANNED AERIAL VEHICLE USING DUBINS GEOMETRY WITH AN OBSTACLE

PATH PLANNING OF UNMANNED AERIAL VEHICLE USING DUBINS GEOMETRY WITH AN OBSTACLE Proceeding o International Conerence On Research, Implementation And Education O Mathematics And Sciences 2015, Yogyakarta State University, 17-19 May 2015 PATH PLANNING OF UNMANNED AERIAL VEHICLE USING

More information

L3S Research Center, University of Hannover

L3S Research Center, University of Hannover , University of Hannover Dynamics of Wolf-Tilo Balke and Wolf Siberski 21.11.2007 *Original slides provided by S. Rieche, H. Niedermayer, S. Götz, K. Wehrle (University of Tübingen) and A. Datta, K. Aberer

More information

TO meet the increasing bandwidth demands and reduce

TO meet the increasing bandwidth demands and reduce Assembling TCP/IP Packets in Optical Burst Switched Networks Xiaojun Cao Jikai Li Yang Chen Chunming Qiao Department o Computer Science and Engineering State University o New York at Bualo Bualo, NY -

More information

Assignment 5. Georgia Koloniari

Assignment 5. Georgia Koloniari Assignment 5 Georgia Koloniari 2. "Peer-to-Peer Computing" 1. What is the definition of a p2p system given by the authors in sec 1? Compare it with at least one of the definitions surveyed in the last

More information

Mobile Robot Static Path Planning Based on Genetic Simulated Annealing Algorithm

Mobile Robot Static Path Planning Based on Genetic Simulated Annealing Algorithm Mobile Robot Static Path Planning Based on Genetic Simulated Annealing Algorithm Wang Yan-ping 1, Wubing 2 1. School o Electric and Electronic Engineering, Shandong University o Technology, Zibo 255049,

More information

Current Topics in OS Research. So, what s hot?

Current Topics in OS Research. So, what s hot? Current Topics in OS Research COMP7840 OSDI Current OS Research 0 So, what s hot? Operating systems have been around for a long time in many forms for different types of devices It is normally general

More information

ITU - Telecommunication Standardization Sector. G.fast: Far-end crosstalk in twisted pair cabling; measurements and modelling ABSTRACT

ITU - Telecommunication Standardization Sector. G.fast: Far-end crosstalk in twisted pair cabling; measurements and modelling ABSTRACT ITU - Telecommunication Standardization Sector STUDY GROUP 15 Temporary Document 11RV-22 Original: English Richmond, VA. - 3-1 Nov. 211 Question: 4/15 SOURCE 1 : TNO TITLE: G.ast: Far-end crosstalk in

More information

ISA 562: Information Security, Theory and Practice. Lecture 1

ISA 562: Information Security, Theory and Practice. Lecture 1 ISA 562: Information Security, Theory and Practice Lecture 1 1 Encryption schemes 1.1 The semantics of an encryption scheme. A symmetric key encryption scheme allows two parties that share a secret key

More information

Some Applications of Graph Bandwidth to Constraint Satisfaction Problems

Some Applications of Graph Bandwidth to Constraint Satisfaction Problems Some Applications of Graph Bandwidth to Constraint Satisfaction Problems Ramin Zabih Computer Science Department Stanford University Stanford, California 94305 Abstract Bandwidth is a fundamental concept

More information

Elliptic Curve Public Key Cryptography

Elliptic Curve Public Key Cryptography Why? Elliptic Curve Public Key Cryptography ECC offers greater security for a given key size. Why? Elliptic Curve Public Key Cryptography ECC offers greater security for a given key size. The smaller key

More information

Overview. Introduction. Internet Worm. Defenses. Recent Trends. Detection and Propagation Modeling of Internet Worms

Overview. Introduction. Internet Worm. Defenses. Recent Trends. Detection and Propagation Modeling of Internet Worms Overview Detection and Propagation Modeling o Internet Worms Ph.D. research proposal by: Parbati Kumar Manna Co-advised by: Dr. Sanjay Ranka and Dr. Shigang Chen Research opportunities in Internet worm

More information

LOAD BALANCING AND DEDUPLICATION

LOAD BALANCING AND DEDUPLICATION LOAD BALANCING AND DEDUPLICATION Mr.Chinmay Chikode Mr.Mehadi Badri Mr.Mohit Sarai Ms.Kshitija Ubhe ABSTRACT Load Balancing is a method of distributing workload across multiple computing resources such

More information

Binary Morphological Model in Refining Local Fitting Active Contour in Segmenting Weak/Missing Edges

Binary Morphological Model in Refining Local Fitting Active Contour in Segmenting Weak/Missing Edges 0 International Conerence on Advanced Computer Science Applications and Technologies Binary Morphological Model in Reining Local Fitting Active Contour in Segmenting Weak/Missing Edges Norshaliza Kamaruddin,

More information

Introduction to Cryptography. Lecture 3

Introduction to Cryptography. Lecture 3 Introduction to Cryptography Lecture 3 Benny Pinkas March 6, 2011 Introduction to Cryptography, Benny Pinkas page 1 Pseudo-random generator seed s (random, s =n) Pseudo-random generator G Deterministic

More information

Attribute-based encryption with encryption and decryption outsourcing

Attribute-based encryption with encryption and decryption outsourcing Edith Cowan University Research Online Australian Information Security Management Conference Conferences, Symposia and Campus Events 2014 Attribute-based encryption with encryption and decryption outsourcing

More information

: Scalable Lookup

: Scalable Lookup 6.824 2006: Scalable Lookup Prior focus has been on traditional distributed systems e.g. NFS, DSM/Hypervisor, Harp Machine room: well maintained, centrally located. Relatively stable population: can be

More information

Distributed minimum spanning tree problem

Distributed minimum spanning tree problem Distributed minimum spanning tree problem Juho-Kustaa Kangas 24th November 2012 Abstract Given a connected weighted undirected graph, the minimum spanning tree problem asks for a spanning subtree with

More information

What did we talk about last time? Public key cryptography A little number theory

What did we talk about last time? Public key cryptography A little number theory Week 4 - Friday What did we talk about last time? Public key cryptography A little number theory If p is prime and a is a positive integer not divisible by p, then: a p 1 1 (mod p) Assume a is positive

More information

13. Power Management in Stratix IV Devices

13. Power Management in Stratix IV Devices February 2011 SIV51013-3.2 13. Power Management in Stratix IV Devices SIV51013-3.2 This chapter describes power management in Stratix IV devices. Stratix IV devices oer programmable power technology options

More information

CS573 Data Privacy and Security. Cryptographic Primitives and Secure Multiparty Computation. Li Xiong

CS573 Data Privacy and Security. Cryptographic Primitives and Secure Multiparty Computation. Li Xiong CS573 Data Privacy and Security Cryptographic Primitives and Secure Multiparty Computation Li Xiong Outline Cryptographic primitives Symmetric Encryption Public Key Encryption Secure Multiparty Computation

More information