Coded Distributed Computing: Straggling Servers and Multistage Dataflows


Songze Li, Mohammad Ali Maddah-Ali, and A. Salman Avestimehr
Department of Electrical Engineering, University of Southern California, Los Angeles, CA, USA
Nokia Bell Labs, Holmdel, NJ, USA

Abstract—In this paper, we first review the Coded Distributed Computing (CDC) framework, recently proposed to significantly slash the data shuffling load of distributed computing via coding, and then discuss the extension of the CDC techniques to cope with two major challenges in general distributed computing problems, namely straggling servers and multistage computations. When faced with straggling servers in a distributed computing cluster, we describe a unified coding scheme that superimposes CDC with Maximum-Distance-Separable (MDS) coding on computation tasks, which allows a flexible tradeoff between computation latency and communication load. On the other hand, for a general multistage computation task expressed as a directed acyclic graph (DAG), we propose a coded framework that, given the load of computation on each vertex of the DAG, applies the generalized CDC scheme individually on each vertex to minimize the communication load.

I. INTRODUCTION

Recently in [1]–[3], coding was introduced into distributed computing in order to reduce the overhead of shuffling intermediate results across computing servers, hence speeding up the overall computation. In a general MapReduce-type distributed computing structure, input files are processed distributedly using designed Map functions across a server cluster, generating some intermediate values. Then the servers exchange the calculated intermediate values (a.k.a. data shuffling), in order to calculate the final output results distributedly using the designed Reduce functions. For such a structure, it was demonstrated in [3] that coding can be applied on both Map task placement and data shuffling, significantly slashing the load of communication. A tradeoff between the communication load (normalized total number of shuffled bits) and the computation load (normalized total number of computed Map functions) was formalized and exactly characterized in [3]. In particular, for a distributed computing application run on K servers and a computation load of r, r ∈ {1, ..., K}, the minimum required communication load was characterized as L*(r) = (1/r)(1 − r/K). A coded computing framework, namely Coded Distributed Computing (CDC), was proposed in [3] to achieve this tradeoff. CDC utilizes a carefully designed repetitive mapping of input files at r distinct servers, creating coded multicast messages that simultaneously satisfy the data demands of r servers. Hence, compared with an uncoded data shuffling scheme, CDC reduces the communication load by exactly a factor of the computation load r. This effect is demonstrated in a numerical evaluation in Fig. 1.

Fig. 1: Comparison of the communication load achieved by Coded Distributed Computing with that of the uncoded scheme. For r ∈ {1, ..., K}, CDC is r times better than the uncoded scheme.
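As a quick numerical check of the factor-r gain stated above, the short sketch below (illustrative only, with an arbitrary choice of K) evaluates the uncoded load 1 − r/K against the coded load (1/r)(1 − r/K):

```python
# Minimal sketch: communication load of uncoded shuffling vs. CDC,
# using the characterization L*(r) = (1/r) * (1 - r/K) quoted above.
K = 10  # number of servers (illustrative value, not from the paper)

for r in range(1, K + 1):
    uncoded = 1 - r / K          # each needed bit is unicast once
    coded = (1 - r / K) / r      # CDC: multicast gain equal to r
    print(f"r={r:2d}  uncoded={uncoded:.3f}  coded={coded:.3f}  gain={r}x")
```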
In this paper, we consider two extensions of the CDC framework, namely CDC with Straggling Servers and CDC for Multistage Dataflows, which focus on applying the principle of CDC to a broader class of distributed computing problems.

A. CDC with Straggling Servers

As mentioned before, the execution of a MapReduce-type distributed computing job consists of the Map phase, the Shuffle phase, and the Reduce phase. The CDC scheme proposed in [3] focuses on minimizing the communication load in the Shuffle phase, and we term this coding approach the Minimum Bandwidth Code. On the other hand, in a recent work [4], the authors proposed to apply Maximum-Distance-Separable (MDS) codes to create some redundant Map tasks, so that the run-time of the Map phase is not affected by up to a certain number of straggling servers. This coding scheme, which we term the Minimum Latency Code, results in a significant reduction of the Map computation latency. We proposed in [5] a unified coding framework for distributed computing with straggling servers, by introducing a tradeoff between latency of computation and load of communication for a distributed matrix multiplication problem. We show that the Minimum Bandwidth Code in [3] and the Minimum Latency Code in [4] can then be viewed as special instances of the proposed coding framework by considering two extremes of this tradeoff: minimizing either the load of communication or the latency of computation individually.

B. CDC for Multistage Dataflows

Unlike simple computation tasks like Grep, Join and Sort, many distributed computing applications contain multiple stages of MapReduce computations.

Examples of these applications include machine learning algorithms [6], SQL queries for databases [7], [8], and scientific analytics [9]. One can express the computation logic of a multistage application as a directed acyclic graph (DAG) [10], in which each vertex represents a logical step of data transformation, and each edge represents the dataflow across processing vertices.

We formalize a distributed computing model for multistage dataflow applications. We express a multistage dataflow as a layered DAG, in which the processing vertices within a particular computation stage are grouped into a single layer. Each vertex represents a MapReduce-type computation, transforming a set of input files into a set of output files. The set of edges specifies 1) the order of the computations, such that the head vertex of an edge does not start its computation until the tail vertex finishes, and 2) the input-output relationships between vertices, such that the input files of a vertex consist of the output files of all vertices connected to it through incoming edges. For a given layered DAG, we propose a coded computing scheme to achieve a set of computation-communication tuples, which characterizes the load of computation for each processing vertex and the load of communication within each layer. The proposed scheme first specifies the computation loads of the Map and Reduce functions for each vertex (i.e., how many times a Map or a Reduce function should be calculated), and then exploits the CDC scheme in [3] to perform the computation for each vertex individually.

II. OVERVIEW OF CODED DISTRIBUTED COMPUTING

In this section, we first briefly describe the problem of distributed computing, and then review our results in [3] on characterizing the tradeoff between the computation load and the communication load.

A. Distributed Computing Framework

In a distributed computing problem, the goal is to compute Q output functions from N input files. As shown in Fig. 2, the overall computation is decomposed into computing a set of Map functions, one for each input file, and a set of Reduce functions, one for each output function. In particular, each Map function computes Q intermediate values, one for each output function. Each Reduce function takes in all N intermediate values from all input files, and calculates the final output result.

Fig. 2: Illustration of a two-stage distributed computing framework.

The computation is carried out over K distributed computing servers, on which the computations of the Q output functions are uniformly distributed. Following the above decomposition, the computation proceeds in three phases: Map, Shuffle and Reduce. In the Map phase, each server computes a subset of Map functions locally. Then in the Shuffle phase, each server creates messages based on the local Map results and multicasts them to the intended servers. In the Reduce phase, each server recovers the required intermediate values from the received messages and the local Map results, and uses them to reduce the assigned output functions.

The computation load, denoted by r, 1 ≤ r ≤ K, is defined as the total number of Map functions computed across the K servers, normalized by the number of files N. The communication load, denoted by L, 0 ≤ L ≤ 1, is defined as the total number of bits communicated in the Shuffle phase, normalized by the total number of bits in all QN intermediate values.
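To make the decomposition concrete, here is a toy sketch of the two-stage structure with made-up Map and Reduce functions (the specific functions and sizes are assumptions chosen only for illustration):

```python
# Toy instance of the two-stage framework: N input files, Q output functions.
# Each Map call emits Q intermediate values (one per output function);
# each Reduce call combines the N intermediate values of one output function.
N, Q = 6, 3
files = [list(range(n, n + 4)) for n in range(N)]    # hypothetical input files

def map_fn(w):
    # one intermediate value per output function q (here: q-th power sums)
    return [sum(x ** (q + 1) for x in w) for q in range(Q)]

def reduce_fn(values):
    # combine the N intermediate values of one output function
    return sum(values)

intermediate = [map_fn(w) for w in files]            # N x Q intermediate values
outputs = [reduce_fn([intermediate[n][q] for n in range(N)]) for q in range(Q)]
print(outputs)
```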
A computation-communication pair (r, L) is feasible if there exists a placement of the Map tasks and a data shuffling scheme such that all output functions can be successfully reduced. The computation-communication function of this framework is defined as L*(r) ≜ inf{L : (r, L) is feasible}. (1)

B. Computation-Communication Tradeoff

The computation-communication function was exactly characterized in [3], as stated in the following theorem.

Theorem 1. The computation-communication function of the distributed computing framework, L*(r), is given by L*(r) = L_coded(r) ≜ (1/r)(1 − r/K), r ∈ {1, ..., K}, (2) for sufficiently large N. For general 1 ≤ r ≤ K, L*(r) is the lower convex envelope of the above points.

The tradeoff in Theorem 1 is achieved by the Coded Distributed Computing (CDC) scheme proposed in [3]. The key idea of the scheme is to repeat each Map computation across servers following a specific pattern, in order to create coded multicast messages in the Shuffle phase that are simultaneously useful for multiple servers.

In [3], the CDC scheme was also generalized to tackle a cascaded distributed computing framework, in which each output function is computed by s servers, for some s ∈ {1, ..., K}. The computation-communication function for the cascaded framework, which is achieved by a generalized CDC scheme, is stated in the following theorem.

Theorem 2. The computation-communication function of the cascaded distributed computing framework, for r ∈ {1, ..., K}, is characterized by

L*(r, s) = L_coded(r, s) ≜ Σ_{ℓ=max{r+1,s}}^{min{r+s,K}} ℓ C(ℓ−2, r−1) C(r, ℓ−s) C(K, ℓ) / (r C(K, r) C(K, s)), (3)

for sufficiently large Q and N, and s ∈ {1, ..., K}, where C(·, ·) denotes the binomial coefficient. For general 1 ≤ r ≤ K, L*(r, s) is the lower convex envelope of the above points {(r, L_coded(r, s)) : r ∈ {1, ..., K}}.
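The expression in Theorem 2 can be evaluated directly; the sketch below (illustrative parameters) computes L_coded(r, s) with Python's math.comb and checks that it collapses to the Theorem 1 expression when s = 1:

```python
from math import comb

def L_coded(r, s, K):
    """Cascaded CDC communication load, evaluating Eq. (3) of Theorem 2."""
    total = 0.0
    for l in range(max(r + 1, s), min(r + s, K) + 1):
        total += l * comb(l - 2, r - 1) * comb(r, l - s) * comb(K, l) \
                 / (r * comb(K, r) * comb(K, s))
    return total

K = 10  # illustrative cluster size
for r in range(1, K + 1):
    # with s = 1 the cascaded load reduces to Theorem 1: (1/r)(1 - r/K)
    assert abs(L_coded(r, 1, K) - (1 - r / K) / r) < 1e-12
print(L_coded(3, 2, K))   # e.g., load for r = 3, s = 2
```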

III. CDC WITH STRAGGLING SERVERS

We introduced in [5] a MapReduce-type distributed computing framework for a matrix multiplication problem. When the CDC scheme (or the Minimum Bandwidth Code) is applied to this framework, while the shuffling load is minimized, a high Map phase latency would occur since the system needs to wait for all straggling servers to finish their Map computations. In order to extend the CDC scheme to optimize the performance of systems with straggling servers, we formalized in [5] a tradeoff between the computation latency in the Map phase and the communication load in the Shuffle phase, and proposed a unified coding scheme that systematically concatenates the Minimum Bandwidth Code in [3] and the Minimum Latency Code in [4]. Next, we first describe the considered distributed matrix multiplication problem, then state our main results, and finally demonstrate the proposed unified coding scheme using an illustrative example.

A. Problem Formulation

1) System Model: We consider a matrix multiplication problem in which, given a matrix A ∈ F_{2^T}^{m×n} for some integers T, m and n, and N input vectors x_1, ..., x_N ∈ F_{2^T}^n, we want to compute the N output vectors y_1 = A x_1, ..., y_N = A x_N. We perform the computations using K distributed servers. Each server has a local memory of size µmnT bits, for some 1/K ≤ µ ≤ 1. We allow applying linear codes for storing the rows of A at each server. Specifically, Server k, k ∈ {1, ..., K}, designs an encoding matrix E_k ∈ F_{2^T}^{µm×m}, and stores U_k = E_k A. (4) The collection of the encoding matrices {E_k}_{k=1}^K is denoted as the storage design. Thus, enough information to recover the entire matrix A can be stored collectively on the K servers.

2) Distributed Computing Model: We assume that the input vectors x_1, ..., x_N are known to all the servers, that N ≥ K, and that |Q| divides N for all Q ⊆ {1, ..., K}. The computation proceeds in Map, Shuffle and Reduce phases.

Map Phase. For all j = 1, ..., N, Server k, k = 1, ..., K, computes the intermediate vectors z_{j,k} = U_k x_j = E_k A x_j = E_k y_j. (5) We denote the latency for Server k to compute z_{1,k}, ..., z_{N,k} as S_k. S_1, ..., S_K are i.i.d. random variables. We denote the qth order statistic, i.e., the qth smallest of S_1, ..., S_K, as S_(q), for all q ∈ {1, ..., K}, and focus on a class of distributions of S_k such that E{S_(q)} = µN g(K, q), (6) for some function g(K, q). The Map phase terminates when a subset of servers, denoted by Q ⊆ {1, ..., K}, have finished their Map computations in (5). A necessary condition for selecting Q is that the output vectors y_1, ..., y_N can be reconstructed by jointly utilizing the intermediate vectors calculated by the servers in Q, i.e., {z_{j,k} : j = 1, ..., N, k ∈ Q}.

Definition 1 (Computation Latency). We define the computation latency, denoted by D, as the average amount of time spent in the Map phase.

After the Map phase, the job of computing the output vectors y_1, ..., y_N continues exclusively over the servers in Q. The final computations of the output vectors are distributed uniformly across the servers in Q.

Shuffle Phase. Each server k in Q generates a message X_k from the locally computed intermediate vectors z_{1,k}, ..., z_{N,k}, through an encoding function φ_k, i.e., X_k = φ_k(z_{1,k}, ..., z_{N,k}), such that upon receiving all messages {X_k : k ∈ Q}, every server in Q can reduce the assigned output vectors. We assume that the servers are connected by a shared bus link. After generating X_k, Server k multicasts X_k to all the other servers in Q.

Definition 2 (Communication Load). We define the communication load, denoted by L, as the average total number of bits in all messages {X_k : k ∈ Q}, normalized by mT (i.e., the total number of bits in an output vector).

Reduce Phase. Server k, k ∈ Q, uses the locally computed vectors z_{1,k}, ..., z_{N,k} and the received multicast messages {X_{k'} : k' ∈ Q} to reduce the assigned N/|Q| output vectors.
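A minimal numerical sketch of the storage design in (4) and the Map computation in (5) is given below, using real-valued random encoding matrices in place of the finite-field code of the paper (an assumption made purely for readability); any set of servers whose stacked encoders have rank m can recover the output vectors:

```python
import numpy as np

# Sketch of the coded storage / Map phase over the reals (the paper works over
# a finite field; real-valued random coding is used here only for illustration).
# K servers each store U_k = E_k A with E_k of size (mu*m) x m.
rng = np.random.default_rng(0)
K, m, n, N = 4, 12, 5, 3
mu = 1 / 2
rows = int(mu * m)

A = rng.standard_normal((m, n))
X = rng.standard_normal((n, N))                      # input vectors as columns
E = [rng.standard_normal((rows, m)) for _ in range(K)]

# Map phase: server k computes z_{j,k} = E_k A x_j for every input vector x_j.
Z = [E[k] @ A @ X for k in range(K)]                 # each block is (rows x N)

# Any subset Q whose stacked encoders have full column rank can recover Y = AX.
Q = [0, 2]                                           # e.g., the two fastest servers
E_Q = np.vstack([E[k] for k in Q])
Z_Q = np.vstack([Z[k] for k in Q])
Y_hat = np.linalg.lstsq(E_Q, Z_Q, rcond=None)[0]
print(np.allclose(Y_hat, A @ X))                     # True
```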
For such a distributed computing system, we say a latency-load pair (D, L) ∈ R² is feasible if there exist a storage design {E_k}_{k=1}^K, a Map phase computation with latency D, and a shuffling scheme with communication load L, such that all output vectors can be successfully reduced.

Definition 3. We define the latency-load region as the closure of the set of all feasible (D, L) pairs.

3) Illustrating Example: In order to clarify the formulation, we use the following simple example to illustrate the latency-load pairs achieved by the two coded approaches discussed in Section I. We consider a matrix A consisting of m = 12 rows a_1, ..., a_12. We have N = 4 input vectors x_1, ..., x_4, and the computation is performed on K = 4 servers, each with a storage size µ = 1/2. We assume that the Map latency S_k, k = 1, ..., 4, has a shifted-exponential distribution function F_S(t) = 1 − e^{−(t/(µN) − 1)}, t ≥ µN, (7) and by, e.g., [11], the average latency for the fastest q, 1 ≤ q ≤ 4, servers to finish the Map computations is D(q) = E{S_(q)} = µN (1 + Σ_{j=K−q+1}^{K} 1/j). (8)

Minimum Bandwidth Code (or CDC) [3]. As shown in Fig. 3(a), a Minimum Bandwidth Code repeats the multiplication of each row of A with all input vectors x_1, ..., x_4, µK = 2 times across the 4 servers, according to the mapping strategy of CDC. The Map phase continues until all servers have finished their computations, achieving a computation latency D(4) = 2(1 + Σ_{j=1}^{4} 1/j) ≈ 6.17. For k = 1, ..., 4, Server k will be reducing the output vector y_k. In the Shuffle phase, every server multicasts 3 bit-wise XORs, each of which is simultaneously useful for two other servers. Hence, the Minimum Bandwidth Code achieves a communication load L = 3 · 4/12 = 1.

Minimum Latency Code [4]. A Minimum Latency Code first has each server k, k = 1, ..., 4, independently and randomly generate 6 random linear combinations of the rows of A, denoted by c_{6(k−1)+1}, ..., c_{6(k−1)+6} (see Fig. 3(b)), achieving a (24, 12) MDS code. Therefore, for any subset D ⊆ {1, ..., 24} of size |D| = 12, using the intermediate values {c_i x_j : i ∈ D} one can recover the output vector y_j. The Map phase terminates once the fastest 2 servers have finished their computations (e.g., Servers 1 and 3), achieving a computation latency D(2) = 2(1 + 1/3 + 1/4) ≈ 3.17. Then Server 1 continues to reduce y_1 and y_2, and Server 3 continues to reduce y_3 and y_4. As illustrated in Fig. 3(b), Servers 1 and 3 respectively unicast the intermediate values they have calculated that are needed by the other server to complete the computation, achieving a communication load L = 6 · 4/12 = 2.

Fig. 3: Illustration of the Minimum Bandwidth Code in [3] and the Minimum Latency Code in [4]. (a) Minimum Bandwidth Code: every row of A is multiplied with the input vectors twice. For k = 1, 2, 3, 4, Server k reduces the output vector y_k. In the Shuffle phase, each server multicasts 3 bit-wise XORs of the calculated intermediate values, each of which is simultaneously useful for two other servers. (b) Minimum Latency Code: A is encoded into 24 coded rows c_1, ..., c_24. Servers 1 and 3 finish their Map computations first. They then exchange enough (6 for each output vector) intermediate values to reduce y_1, y_2 at Server 1 and y_3, y_4 at Server 3. The Minimum Bandwidth Code spends about twice the time in the Map phase compared with the Minimum Latency Code, and achieves half of the communication load in the Shuffle phase. They represent the two end points of a general latency-load tradeoff characterized in the next subsection.
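Plugging the example's parameters into (8) reproduces the latencies and loads of the two codes; the short script below simply evaluates the formulas quoted above:

```python
# Worked numbers for the illustrative example (K = 4 servers, mu = 1/2, N = 4),
# using D(q) = E[S_(q)] = mu*N*(1 + sum_{j=K-q+1}^{K} 1/j) from Eq. (8).
K, N, mu = 4, 4, 1 / 2

def D(q):
    return mu * N * (1 + sum(1 / j for j in range(K - q + 1, K + 1)))

# Minimum Bandwidth Code: wait for all K servers, load L = 1.
print(f"Minimum Bandwidth Code: D(4) = {D(4):.2f}, L = 1")
# Minimum Latency Code: wait for the fastest 1/mu = 2 servers, load L = 2.
print(f"Minimum Latency Code:   D(2) = {D(2):.2f}, L = 2")
```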

B. Main Results

The main results of [5] are 1) a characterization of a set of achievable latency-load pairs, obtained by developing a unified coded framework, and 2) an outer bound of the latency-load region, which are stated in the following two theorems.

Theorem 3. For a distributed matrix multiplication problem of computing N output vectors using K servers, each with a storage size µ ≥ 1/K, the latency-load region contains the lower convex envelope of the points {(D(q), L(q)) : q = 1/µ, ..., K}, (9) in which D(q) = E{S_(q)} = µN g(K, q), (10) and L(q) is the communication load achieved by the unified coding scheme for a given q, whose closed-form expression is derived in [5]. Here S_(q) is the qth smallest latency of the K i.i.d. latencies S_1, ..., S_K with some distribution F, and g(K, q) is a function of K and q computed from F.

The latency-load pairs in Theorem 3 are achieved by a unified coding framework that organically superimposes the Minimum Bandwidth Code and the Minimum Latency Code. The key idea is to appropriately concatenate the MDS code and the repetitive computations specified by the CDC scheme for the Map computations, in order to take advantage of the redundancies to both combat the stragglers and slash the shuffling load. We demonstrate this unified scheme through an illustrative example in the next subsection.

Remark 1. The Minimum Latency Code and the Minimum Bandwidth Code correspond to q = 1/µ and q = K, and achieve the two end points (E{S_(1/µ)}, N(1 − µ)) and (E{S_(K)}, N(1 − µ)/(µK)) respectively.

Fig. 4: Comparison of the latency-load pairs achieved by the proposed scheme with the outer bound, for computing N = 180 output vectors using K = 18 servers, each with a storage size µ = 1/3, assuming the distribution function in (7).

Remark 2. As numerically evaluated in Fig. 4, the tradeoff achieved by the unified coding framework approximately exhibits an inverse-linearly proportional relationship between the latency and the load. For instance, doubling the latency from 120 to 240 results in a drop of the communication load from 43 to 23, i.e., by a factor of 1.87.
Theorem 4. The latency-load region is contained in the lower convex envelope of the points {(D(q), L̄(q)) : q = 1/µ, ..., K}, (12) in which D(q) is given by (10) and L̄(q) is a lower bound on the communication load whose closed-form expression is derived in [5].

For each q = 1/µ, ..., K, the lower bound L̄(q) was proved as a cut-set bound on multiple instances of the problem, each corresponding to a specific assignment of the output vectors. At the two end points of the tradeoff, the unified coding scheme was shown in [5] to achieve the lower bound to within a constant multiplicative gap.

C. Unified Coding Framework

In this subsection, we demonstrate the key ideas of the unified coding framework that achieves the latency-load pairs in (9), through the following example. We consider a problem of multiplying a matrix A ∈ F_{2^T}^{m×n} of m = 20 rows with N = 12 input vectors x_1, ..., x_12 to compute 12 output vectors y_1 = A x_1, ..., y_12 = A x_12, using K = 6 servers, each with a storage size µ = 1/2. We assume that we can afford to wait for q = 4 servers to finish their Map computations.

Storage Design. As illustrated in Fig. 5, we first independently generate 30 random linear combinations c_1, ..., c_30 ∈ F_{2^T}^n of the 20 rows of A. Then we partition these coded rows c_1, ..., c_30 into 15 batches, each of size 2, and store every batch of coded rows at a unique pair of servers.

Fig. 5: Storage design when the Map phase is terminated when 4 servers have finished the computations.

WLOG, due to the symmetry of the storage design, we assume that Servers 1, 2, 3 and 4 are the first 4 servers that finish their Map computations. Then we assign the Reduce tasks such that Server k reduces the output vectors y_{3(k−1)+1}, y_{3(k−1)+2} and y_{3(k−1)+3}, for all k ∈ {1, ..., 4}. Since Server 1 has computed {c_1 x_j, ..., c_10 x_j : j = 1, ..., 12}, for it to reduce y_1 = A x_1, it needs any subset of 10 intermediate values c_i x_1 with i ∈ {11, ..., 30} from Servers 2, 3 and 4 in the Shuffle phase. Similar data demands hold for all 4 servers and the output vectors they are reducing.

Coded Shuffle. We first group the 4 servers into 4 subsets of size 3 and perform coded shuffling within each subset. We illustrate the coded shuffling scheme for Servers 1, 2 and 3 in Fig. 6. Each server multicasts 3 bit-wise XORs of the locally computed intermediate values to the other two. After receiving the 2 multicast messages (each containing 3 coded values), each server recovers 6 needed intermediate values.

Fig. 6: Multicasting 9 coded intermediate values across Servers 1, 2 and 3. Similar coded multicast communications are performed for another 3 subsets of 3 servers.

Similarly, we perform the above coded shuffling for another 3 subsets of 3 servers. Each server recovers 18 needed intermediate values (6 for each output vector it is reducing). As mentioned before, since each server needs a total of 3 × (20 − 10) = 30 intermediate values to reduce the 3 assigned output vectors, it needs another 30 − 18 = 12 after decoding all multicast messages. We satisfy the residual data demands by simply having the servers unicast enough (i.e., 12 × 4 = 48) intermediate values for reduction. Overall, 36 + 48 = 84 (possibly coded) intermediate values are communicated, achieving a communication load of L = 84/20 = 4.2.
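The storage design of this example can be sketched in a few lines; random real-valued coding stands in for the MDS code, and the batch-to-pair assignment below is one arbitrary choice consistent with the description above:

```python
from itertools import combinations
import numpy as np

# Sketch of the storage design: m = 20 rows of A are encoded into 30 random
# linear combinations, partitioned into C(6,2) = 15 batches of size 2, with one
# batch stored at each pair of the K = 6 servers.
rng = np.random.default_rng(1)
K, m, coded_rows, batch = 6, 20, 30, 2

G = rng.standard_normal((coded_rows, m))             # (30, 20) random code
pairs = list(combinations(range(K), 2))              # 15 pairs of servers
batches = {pairs[i]: list(range(batch * i, batch * i + batch))
           for i in range(len(pairs))}

storage = {k: [] for k in range(K)}
for pair, rows in batches.items():
    for k in pair:
        storage[k].extend(rows)

assert all(len(storage[k]) == 10 for k in range(K))  # each server stores 10 = (1/2)*20 rows

# Suppose Servers 0..3 finish the Map phase first: together they know every
# coded row stored on at least one of them (only the batch of pair {4, 5} is missing).
finishers = {0, 1, 2, 3}
known = set().union(*(storage[k] for k in finishers))
print(len(known), "of", coded_rows, "coded rows are known by the 4 finishers")
```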
IV. CDC FOR MULTISTAGE DATAFLOWS

While the distributed computing model in [3] deals with a single pair of Map and Reduce operations, the logical dataflow of a general distributed computing application consists of multiple stages of MapReduce computations. We can express a multistage dataflow as a directed acyclic graph (DAG). The DAG of an application, denoted by G, consists of a set of vertices V and a set of directed edges A, i.e., G = (V, A). The vertices represent the user-defined operations on the data, e.g., MapReduce, and the edges represent the flow of data between operation vertices. In this section, we formalize a multistage computation task represented by a DAG, and propose a general coded scheme for DAGs as an extension of the CDC scheme in [3].

A. Problem Formulation: Layered DAG

We consider a computing task that processes N input files w_1, ..., w_N ∈ F_{2^F} to generate Q output files u_1, ..., u_Q ∈ F_{2^B}, for some parameters F, B ∈ N. The overall computation is represented by a layered DAG G = (V, A), in which the set of vertices V is composed of D layers, denoted by L_1, ..., L_D, for some D ∈ N. For each d = 1, ..., D, we label the ith vertex in Layer d as m^{d,i}, for all i = 1, ..., |L_d|. See Fig. 7 for the illustration of a 4-layer DAG.

Fig. 7: A 4-layer DAG.

Each vertex m^{d,i} processes N^{d,i} input files w^{d,i}_1, ..., w^{d,i}_{N^{d,i}} ∈ F_{2^F}, and computes Q^{d,i} output files u^{d,i}_1, ..., u^{d,i}_{Q^{d,i}} ∈ F_{2^B}, for some system parameters N^{d,i}, Q^{d,i}, F, B ∈ N. In particular, the input files of G are distributed as the inputs to the vertices in Layer 1, i.e., {w_1, ..., w_N} = ∪_{i=1,...,|L_1|} {w^{1,i}_1, ..., w^{1,i}_{N^{1,i}}}, and the output files of G are distributed as the outputs of the vertices in Layer D, i.e., {u_1, ..., u_Q} = ∪_{i=1,...,|L_D|} {u^{D,i}_1, ..., u^{D,i}_{Q^{D,i}}}. Edges in A are between vertices in consecutive layers, i.e., A ⊆ ∪_{d=1,...,D−1} {(m^{d,i}, m^{d+1,j}) : i = 1, ..., |L_d|, j = 1, ..., |L_{d+1}|}. (14)

The input files of a vertex in Layer d, d = 2, ..., D, consist of the output files of the vertices it connects to in the preceding layer. More specifically, for any d ∈ {2, ..., D} and i ∈ {1, ..., |L_d|}, N^{d,i} = Σ_{j:(m^{d−1,j}, m^{d,i}) ∈ A} Q^{d−1,j} and {w^{d,i}_1, ..., w^{d,i}_{N^{d,i}}} = ∪_{j:(m^{d−1,j}, m^{d,i}) ∈ A} {u^{d−1,j}_1, ..., u^{d−1,j}_{Q^{d−1,j}}}. (15)

For example, in Fig. 7, the input files to the vertex m^{3,1} consist of the output files of the vertices m^{2,1} and m^{2,3}. As a result, other than the number of input files for the vertices in Layer 1, we only need the number of output files at each vertex as the system parameters.
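A minimal piece of bookkeeping code for this formulation is sketched below; the edge set and output-file counts are illustrative assumptions, except that m^{3,1} takes its inputs from m^{2,1} and m^{2,3} as in the Fig. 7 example:

```python
# Minimal layered-DAG bookkeeping: vertices are (layer, index) pairs, 1-indexed.
# Q[(d, i)] is the number of output files of vertex m^{d,i}; the number of input
# files of a vertex in layer d >= 2 is the sum of Q over its in-neighbors, Eq. (15).
layers = {1: [1, 2], 2: [1, 2, 3], 3: [1, 2], 4: [1]}
Q = {(1, 1): 4, (1, 2): 4, (2, 1): 3, (2, 2): 3, (2, 3): 3,
     (3, 1): 5, (3, 2): 5, (4, 1): 2}                # hypothetical output counts
edges = {((1, 1), (2, 1)), ((1, 1), (2, 2)), ((1, 2), (2, 2)), ((1, 2), (2, 3)),
         ((2, 1), (3, 1)), ((2, 3), (3, 1)), ((2, 2), (3, 2)),
         ((3, 1), (4, 1)), ((3, 2), (4, 1))}

def num_inputs(d, i):
    # N^{d,i} = sum of Q^{d-1,j} over in-neighbors of (d, i)
    return sum(Q[tail] for tail, head in edges if head == (d, i))

for d in range(2, 5):
    for i in layers[d]:
        print(f"m^({d},{i}) has N = {num_inputs(d, i)} input files")
```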

The computation of the output file u^{d,i}_q, q = 1, ..., Q^{d,i}, of the vertex m^{d,i}, for all d = 1, ..., D, i = 1, ..., |L_d|, is decomposed as follows:

u^{d,i}_q(w^{d,i}_1, ..., w^{d,i}_{N^{d,i}}) = h^{d,i}_q(g^{d,i}_{q,1}(w^{d,i}_1), ..., g^{d,i}_{q,N^{d,i}}(w^{d,i}_{N^{d,i}})), (16)

where:
1) The Map functions g^{d,i}_n ≜ (g^{d,i}_{1,n}, ..., g^{d,i}_{Q^{d,i},n}) : F_{2^F} → (F_{2^T})^{Q^{d,i}}, n ∈ {1, ..., N^{d,i}}, map the input file w^{d,i}_n into Q^{d,i} length-T intermediate values {v^{d,i}_{q,n} = g^{d,i}_{q,n}(w^{d,i}_n) ∈ F_{2^T} : q = 1, ..., Q^{d,i}}, for some T ∈ N.
2) The Reduce functions h^{d,i}_q : (F_{2^T})^{N^{d,i}} → F_{2^B}, q ∈ {1, ..., Q^{d,i}}, map the intermediate values of the output function u^{d,i}_q in all input files into the output file u^{d,i}_q = h^{d,i}_q(v^{d,i}_{q,1}, ..., v^{d,i}_{q,N^{d,i}}).

We compute the above layered DAG using a K-server cluster, for some K ∈ N. At each time instance, the servers only perform the computations of the vertices within a single layer. Each vertex in a layer is computed by a subset of servers. We denote the set of servers computing the vertex m^{d,i} as K^{d,i} ⊆ {1, ..., K}, where the selection of K^{d,i} is a design parameter. For each k ∈ K^{d,i}, Server k computes a subset of Map functions of m^{d,i} with indices M^{d,i}_k ⊆ {1, ..., N^{d,i}}, and a subset of Reduce functions with indices W^{d,i}_k ⊆ {1, ..., Q^{d,i}}, where M^{d,i}_k and W^{d,i}_k are design parameters. We denote the placements of the Map and Reduce functions for m^{d,i} as M^{d,i} ≜ {M^{d,i}_k : k ∈ K^{d,i}} and W^{d,i} ≜ {W^{d,i}_k : k ∈ K^{d,i}} respectively.

Data Locality. We prohibit transferring input files (or output files calculated in the preceding layer) across servers, i.e., every node either stores the needed input files to compute the assigned Map functions (only to initiate the computations in Layer 1) or computes them locally from the assigned Reduce functions in the preceding layer. This implementation provides better fault-tolerance since the Reduce functions have to be calculated independently across servers.

The computation of Layer d, d = 1, ..., D, proceeds in three phases: Map, Shuffle, and Reduce.

Map phase. For each vertex m^{d,i}, i = 1, ..., |L_d|, in Layer d, each server k in K^{d,i} computes its assigned Map functions g^{d,i}_n(w^{d,i}_n) = (v^{d,i}_{1,n}, ..., v^{d,i}_{Q^{d,i},n}), for all n ∈ M^{d,i}_k.

Definition 4 (Computation Load). We define the computation load of vertex m^{d,i}, d ∈ {1, ..., D}, i ∈ {1, ..., |L_d|}, denoted by r^{d,i}, as the total number of Map functions of m^{d,i} computed across the servers in K^{d,i}, normalized by the number of input files N^{d,i}, i.e., r^{d,i} ≜ Σ_{k ∈ K^{d,i}} |M^{d,i}_k| / N^{d,i}.

Shuffle phase. Each server k, k ∈ {1, ..., K}, creates a message X^d_k as a function, denoted by ψ^d_k, of the intermediate values from all input files it has mapped in Layer d, i.e., X^d_k = ψ^d_k({v^{d,i}_{q,n} : q ∈ {1, ..., Q^{d,i}}, n ∈ M^{d,i}_k}_{i=1}^{|L_d|}), and multicasts it to a subset of the servers.

Definition 5 (Communication Load). We define the communication load of Layer d, denoted by L_d, as the total number of bits communicated in the Shuffle phase of Layer d.

By the end of the Shuffle phase, each server k, k = 1, ..., K, recovers all required intermediate values for the assigned Reduce functions in Layer d, i.e., {v^{d,i}_{q,1}, ..., v^{d,i}_{q,N^{d,i}} : q ∈ W^{d,i}_k}_{i=1}^{|L_d|}, from either the local Map computations or the multicast messages from the other servers.

Reduce phase. Each server k, k = 1, ..., K, computes the assigned Reduce functions to generate the output files of the vertices in Layer d, i.e., {u^{d,i}_q = h^{d,i}_q(v^{d,i}_{q,1}, ..., v^{d,i}_{q,N^{d,i}}) : q ∈ W^{d,i}_k}, for all i = 1, ..., |L_d|.

We say that a computation-communication tuple {(r^{d,1}, ..., r^{d,|L_d|}, L_d)}_{d=1}^{D} is achievable if there exists an assignment of the Map and Reduce computations {M^{d,1}, W^{d,1}, ..., M^{d,|L_d|}, W^{d,|L_d|}}_{d=1}^{D} and D shuffling schemes such that Server k, k = 1, ..., K, can successfully compute all the Reduce functions in W^{d,i}_k, for all d ∈ {1, ..., D} and i ∈ {1, ..., |L_d|}.

Definition 6. We define the computation-communication region of a layered DAG G = (V, A), denoted by C(G), as the closure of the set of all achievable computation-communication tuples.
B. CDC for Layered DAG

We propose a general Coded Distributed Computing (CDC) scheme for an arbitrary layered DAG, which achieves the computation-communication tuples characterized in the following theorem.

Theorem 5. For a layered DAG G = (V, A) of D layers, the following computation-communication tuples are achievable:

∪_{{r^{d,1}, ..., r^{d,|L_d|} ∈ {1,...,K}}_{d=1}^{D}} {(r^{d,1}, ..., r^{d,|L_d|}, L^u_d)}_{d=1}^{D}, where L^u_d = Σ_{i=1}^{|L_d|} L_coded(r^{d,i}, s^{d,i}, K) Q^{d,i} N^{d,i}.

Here L_coded(r, s, K) ≜ Σ_{ℓ=max{r+1,s}}^{min{r+s,K}} ℓ C(ℓ−2, r−1) C(r, ℓ−s) C(K, ℓ) / (r C(K, r) C(K, s)); s^{d,i} = max_{j:(m^{d,i}, m^{d+1,j}) ∈ A} r^{d+1,j} for d < D, and s^{D,i} = 1; and N^{d,i} = Σ_{j:(m^{d−1,j}, m^{d,i}) ∈ A} Q^{d−1,j} for d = 2, ..., D.

The above computation-communication tuples are achieved by the proposed CDC scheme for the layered DAG, which first designs the parameters {s^{d,1}, ..., s^{d,|L_d|}}_{d=1}^{D} that specify the placements of the computations of the Reduce functions, and then applies the CDC scheme for a cascaded distributed computing framework (see Theorem 2) to compute each of the vertices individually.

Remark 3. The achieved communication load for vertex m^{d,i}, L_coded(r^{d,i}, s^{d,i}, K) Q^{d,i} N^{d,i}, decreases as r^{d,i} increases (more locally available Map results) and as s^{d,i} decreases (fewer data demands). Due to the specific way the parameter s^{d,i} is chosen in Theorem 5, increasing the computation load r^{d+1,j} of some vertex m^{d+1,j} connected to m^{d,i} can cause s^{d,i} to increase. In general, while more Map computations result in a smaller communication load in the current layer, they impose a larger communication load on the preceding layer.

Next, we describe and analyze the proposed general CDC scheme to compute a layered DAG. To start, we employ a uniform resource allocation such that every vertex is computed over all K servers, i.e., K^{d,i} = {1, ..., K} for all d = 1, ..., D and i = 1, ..., |L_d|.
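The sketch below applies Theorem 5 to a small hypothetical diamond-shaped layered DAG (one vertex in the first and last layers, two in the middle): it picks the reduce factors from the successors' computation loads as prescribed above and evaluates the per-vertex shuffle loads. All parameter values are illustrative assumptions:

```python
from math import comb

def L_coded(r, s, K):
    # cascaded CDC load, as in Theorem 2 / Theorem 5
    return sum(l * comb(l - 2, r - 1) * comb(r, l - s) * comb(K, l)
               / (r * comb(K, r) * comb(K, s))
               for l in range(max(r + 1, s), min(r + s, K) + 1))

# Hypothetical diamond-shaped DAG: v1 -> {v2, v3} -> v4, computed on K servers.
K = 6
Q = {"v1": 6, "v2": 6, "v3": 6, "v4": 6}         # illustrative output-file counts
r = {"v1": 2, "v2": 2, "v3": 1, "v4": 2}         # chosen computation loads
succ = {"v1": ["v2", "v3"], "v2": ["v4"], "v3": ["v4"], "v4": []}

# Reduce factors: max computation load among successors, and 1 in the last layer.
s = {v: max((r[u] for u in succ[v]), default=1) for v in r}

N = {"v1": 6, "v2": Q["v1"], "v3": Q["v1"], "v4": Q["v2"] + Q["v3"]}
for v in ["v1", "v2", "v3", "v4"]:
    load = L_coded(r[v], s[v], K) * Q[v] * N[v]
    print(f"{v}: s = {s[v]}, per-vertex shuffle load = {load:.1f}")
```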

Remark 4. We note that the communication load L_coded(r, s, K) in Theorem 5 is a decreasing function of K. That is, for fixed r and s, performing the computation of a vertex over a smaller number of servers yields a smaller communication load. However, the disadvantages of using fewer servers are: 1) each server needs to compute more Map and Reduce functions, incurring a higher computation load, and 2) it may affect the symmetry of the data placement, increasing the communication load in the next layer (see the discussions in the next subsection).

For each vertex m^{d,i}, i = 1, ..., |L_d|, in Layer d, we specify a computation load r^{d,i} ∈ {1, ..., K}, such that the computation of each Map function of m^{d,i} is placed on r^{d,i} servers. We also define the reduce factor of m^{d,i}, denoted by s^{d,i} ∈ {1, ..., K}, as the number of servers that compute each Reduce function of m^{d,i}. To satisfy the data locality requirements (explained later), we select the reduce factor s^{d,i} equal to the largest computation load of the vertices connected to m^{d,i} in Layer d + 1, i.e., s^{d,i} = max_{j:(m^{d,i}, m^{d+1,j}) ∈ A} r^{d+1,j} for d < D, and s^{D,i} = 1 for d = D. (17)

As an example, for the diamond DAG in Fig. 8, since the output files of m_1 will be used as the inputs for both m_2 and m_3, we should compute each Reduce function of m_1 at s_1 = max{r_2, r_3} servers. Also, since m_2 and m_3 both only connect to m_4, we shall choose s_2 = s_3 = r_4.

Fig. 8: A diamond DAG. The reduce factors s_1, ..., s_4 are determined by the computation loads r_2, r_3, r_4.

Having selected the computation load r^{d,i} and the reduce factor s^{d,i}, we employ the CDC scheme in [3] to compute the vertex m^{d,i} over all K servers. We next briefly describe the CDC computation for m^{d,i}.

Map Phase Design. The N^{d,i} input files are evenly partitioned into C(K, r^{d,i}) disjoint batches of size N^{d,i}/C(K, r^{d,i}), each of which is labelled by a subset T ⊆ {1, ..., K} of size r^{d,i}: {1, ..., N^{d,i}} = {B^{d,i}_T : T ⊆ {1, ..., K}, |T| = r^{d,i}}, (18) where B^{d,i}_T denotes the batch corresponding to the subset T. Given this partition, Server k, k ∈ {1, ..., K}, maps the files in B^{d,i}_T if k ∈ T.

Reduce Functions Assignment. The Q^{d,i} Reduce functions are evenly partitioned into C(K, s^{d,i}) disjoint batches of size Q^{d,i}/C(K, s^{d,i}), each of which is labelled by a subset P of s^{d,i} nodes: {1, ..., Q^{d,i}} = {D^{d,i}_P : P ⊆ {1, ..., K}, |P| = s^{d,i}}, (19) where D^{d,i}_P denotes the batch corresponding to the subset P. Given this partition, Server k, k ∈ {1, ..., K}, computes the Reduce functions whose indices are in D^{d,i}_P if k ∈ P.

Coded Data Shuffling. In the Shuffle phase, within a subset of ℓ servers, max{r^{d,i} + 1, s^{d,i}} ≤ ℓ ≤ min{r^{d,i} + s^{d,i}, K}, every r^{d,i} of them share some intermediate values that are simultaneously needed by the remaining ℓ − r^{d,i} servers. Each server multicasts enough linear combinations of the segments of these intermediate values until they can be decoded by all the intended servers. This achieves a communication load L_coded(r^{d,i}, s^{d,i}, K) for vertex m^{d,i}, where L_coded(r, s, K) is given in Theorem 5.

Next we demonstrate that the above CDC scheme can be applied to compute every vertex subject to the data locality constraint, using the reduce factors s^{d,i} specified in (17). To do that, we focus on the computation of a vertex m^{d,i} in Layer d. WLOG, we assume that m^{d,i} only connects to a single vertex m^{d−1,1} in Layer d − 1, hence the input files of m^{d,i} are the output files of m^{d−1,1} and N^{d,i} = Q^{d−1,1}. Out of all vertices in Layer d connected to m^{d−1,1}, say vertex m^{d,j} has the largest computation load, such that by (17), s^{d−1,1} = r^{d,j}, and each of the output files of m^{d−1,1} is available on r^{d,j} servers after the computation of Layer d − 1.
By the above assignment of the Reduce functions, a batch of Q^{d−1,1}/C(K, r^{d,j}) output files of m^{d−1,1} (or input files of m^{d,i}), denoted by D^{d−1,1}_P, are available at all r^{d,j} servers in a subset P. To execute the Map phase of m^{d,i}, we first evenly partition D^{d−1,1}_P into C(r^{d,j}, r^{d,i}) sub-batches of size Q^{d−1,1}/(C(K, r^{d,j}) C(r^{d,j}, r^{d,i})), each of which is sub-labelled by a subset T of P of r^{d,i} nodes: D^{d−1,1}_P = {D^{d−1,1}_{P,T} : T ⊆ P, |T| = r^{d,i}}, (20) where D^{d−1,1}_{P,T} denotes the sub-batch corresponding to T. Then, each server k maps all files in D^{d−1,1}_{P,T} if k ∈ T. Finally, we repeat this Map process for all subsets P of size r^{d,j}. Since every subset T of r^{d,i} servers is contained in C(K − r^{d,i}, r^{d,j} − r^{d,i}) subsets of size r^{d,j}, the servers in T map a total of [Q^{d−1,1}/(C(K, r^{d,j}) C(r^{d,j}, r^{d,i}))] · C(K − r^{d,i}, r^{d,j} − r^{d,i}) = Q^{d−1,1}/C(K, r^{d,i}) input files of m^{d,i}. This is consistent with the above Map phase design for m^{d,i}, i.e., for all T ⊆ {1, ..., K} of size r^{d,i}, B^{d,i}_T = ∪_{P ⊆ {1,...,K}: |P| = r^{d,j}, T ⊆ P} D^{d−1,1}_{P,T}, (21) where B^{d,i}_T, as defined in (18), is the batch of input files of m^{d,i} mapped by the servers in T.

We demonstrate in Fig. 9 the Map computations of the vertices m_2 and m_3 of the diamond DAG in Fig. 8, with Q_1 = 6 output files of m_1, computation loads r_2 = 2 and r_3 = 1, using K = 3 servers. First we select the reduce factor of m_1, s_1 = max{r_2, r_3} = 2, such that every output file of m_1, u_1, ..., u_6, is reduced on two servers. Having computed the output files of m_1, which are also the input files of m_2 and m_3, each server computes the Map functions of m_2 on all locally available files. However, since m_3 has a computation load r_3 = 1, each file is only mapped once on one server, e.g., u_3 and u_4 are both available on Servers 1 and 2 after computing m_1, but u_3 is mapped only on Server 1 and u_4 is mapped only on Server 2 in the Map phase of m_3.

Fig. 9: Illustration of the mapped files in the Map phases of the vertices m_2 and m_3 in the diamond DAG, for the case Q_1 = N_2 = N_3 = 6, r_2 = 2, and r_3 = 1.
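The subset-labelled placements in (18) and (19) can be generated mechanically; the sketch below (illustrative parameters) builds both partitions for a single vertex and checks the per-server Map and Reduce counts:

```python
from itertools import combinations

# Placement of Map and Reduce computations for one vertex, following the
# subset-labelled partitions of Eqs. (18)-(19): files go to r-subsets of servers,
# reduce functions to s-subsets. Parameter values are illustrative.
K, r, s = 4, 2, 2
N = 2 * len(list(combinations(range(K), r)))     # 2 files per r-subset
Q = 3 * len(list(combinations(range(K), s)))     # 3 functions per s-subset

file_batches = {T: list(range(2 * i, 2 * i + 2))
                for i, T in enumerate(combinations(range(K), r))}
func_batches = {P: list(range(3 * i, 3 * i + 3))
                for i, P in enumerate(combinations(range(K), s))}

maps_of = {k: [n for T, batch in file_batches.items() if k in T for n in batch]
           for k in range(K)}
reduces_of = {k: [q for P, batch in func_batches.items() if k in P for q in batch]
              for k in range(K)}

assert all(len(maps_of[k]) == r * N // K for k in range(K))      # r*N/K files each
assert all(len(reduces_of[k]) == s * Q // K for k in range(K))   # s*Q/K functions each
print(len(maps_of[0]), "files mapped and", len(reduces_of[0]), "functions reduced per server")
```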

Using the above CDC scheme for each vertex, we can achieve a communication load L^u_d in Layer d, d = 1, ..., D: L^u_d = Σ_{i=1}^{|L_d|} L_coded(r^{d,i}, s^{d,i}, K) Q^{d,i} N^{d,i}. (22) Taking the union over all combinations of the computation loads achieves the computation-communication tuples in Theorem 5.

Remark 5. Having characterized a set of computation-communication tuples using CDC, one can optimize the overall job execution time over the computation loads. Varying the computation loads affects the Map time, the Shuffle time and the Reduce time in each layer in different ways. For example, a smaller computation load can lead to a shorter Map time in the current layer and also a shorter Reduce time in the preceding layer, but may cause a long Shuffle phase in the current layer. In general, the design of the optimum computation loads depends on the system parameters, including the input/output sizes, the sizes of the intermediate values, the server processing speeds and the network bandwidth.

C. Is the Uniform Resource Allocation Optimal?

In the above proposed CDC scheme for layered DAGs, we allocate all processing resources to compute each vertex in the DAG. However, this uniform resource allocation strategy does not always lead to a better performance. We demonstrate this phenomenon through the following example.

Consider again the diamond DAG in Fig. 8, in which the vertices m_2 and m_3 have the same number of output functions, i.e., Q_2 = Q_3. When computing m_2 and m_3 in Layer 2, we split the computation resources such that half of the servers exclusively compute m_2 and the remaining half exclusively compute m_3. That is, we select K_2 and K_3 such that K_2 ∩ K_3 = ∅ and |K_2| = |K_3| = K/2. We choose the reduce factors s_2 = s_3 = r_4, and then apply the CDC scheme on K_2 and K_3 to compute m_2 and m_3 respectively. This achieves a total communication load in Layer 2 of L^s_2 = (L_coded(r_2, r_4, K/2) + L_coded(r_3, r_4, K/2)) Q_1 Q_2. (23) The above communication load is less than the load L^u_2 = (L_coded(r_2, r_4, K) + L_coded(r_3, r_4, K)) Q_1 Q_2 achieved in Layer 2 using the uniform resource allocation. This is because when using a smaller number of servers to compute a vertex, each server will compute more Map functions and obtain more useful local information (i.e., more of the needed intermediate values are available locally), and thus less information needs to be transferred over the network.

However, when computing the vertex m_4 in Layer 3, since the output results of the preceding vertices m_2 and m_3 reside on completely separate sets of servers, no coding can be applied for the communication between K_2 and K_3. We can use CDC within K_2 and K_3 respectively to achieve a communication load of (1/r_4)(1 − 2r_4/K) Q_2 Q_4, and uncoded communication between K_2 and K_3 that incurs another communication load of Q_2 Q_4. Hence, the total communication load achieved in Layer 3 when splitting the computation resources is L^s_3 = ((1/r_4)(1 − 2r_4/K) + 1) Q_2 Q_4. (24) On the other hand, computing m_2 and m_3 using all servers in Layer 2 induces a more symmetric placement of the input files of m_4 over all K servers. Therefore, it can take better advantage of the coding opportunities in the data shuffling of Layer 3, achieving a smaller communication load L^u_3 = (2/r_4)(1 − r_4/K) Q_2 Q_4 ≤ L^s_3.
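For concreteness, the sketch below evaluates the Layer-2 and Layer-3 loads above for one illustrative choice of parameters, showing that splitting helps in Layer 2 but hurts in Layer 3:

```python
from math import comb

def L_coded(r, s, K):
    # cascaded CDC load, as in Theorem 5
    return sum(l * comb(l - 2, r - 1) * comb(r, l - s) * comb(K, l)
               / (r * comb(K, r) * comb(K, s))
               for l in range(max(r + 1, s), min(r + s, K) + 1))

# Diamond DAG, split vs. uniform allocation in Layer 2 (illustrative values).
K, Q1, Q2, Q4, r2, r3, r4 = 8, 10, 10, 10, 2, 2, 2

L2_split = (L_coded(r2, r4, K // 2) + L_coded(r3, r4, K // 2)) * Q1 * Q2
L2_unif  = (L_coded(r2, r4, K) + L_coded(r3, r4, K)) * Q1 * Q2

L3_split = (1 / r4) * (1 - 2 * r4 / K) * Q2 * Q4 + Q2 * Q4
L3_unif  = (2 / r4) * (1 - r4 / K) * Q2 * Q4

print(f"Layer 2: split {L2_split:.1f} < uniform {L2_unif:.1f}")
print(f"Layer 3: split {L3_split:.1f} > uniform {L3_unif:.1f}")
```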
To summarize, for a different resource allocation strategy in which the computation resources are split to compute m_2 and m_3 in the second layer, we can achieve a smaller communication load in Layer 2 at the cost of a higher communication load in Layer 3.

V. CONCLUSION

We described two extensions of the CDC scheme in [3] to solve distributed computing problems with straggling servers and multistage computations, respectively. In particular, when faced with straggling servers, we presented a unified coding scheme that superimposes the CDC scheme on top of the MDS code, achieving a flexible tradeoff between computation latency and communication load. On the other hand, for a multistage computation expressed as a DAG, we proposed a general coded scheme that first specifies the computation load for each processing vertex of the DAG, and then applies the CDC scheme to each vertex individually.

REFERENCES
[1] S. Li, M. A. Maddah-Ali, and A. S. Avestimehr, "Coded MapReduce," 53rd Allerton Conference, Sept. 2015.
[2] ——, "Fundamental tradeoff between computation and communication in distributed computing," IEEE ISIT, July 2016.
[3] S. Li, M. A. Maddah-Ali, Q. Yu, and A. S. Avestimehr, "A fundamental tradeoff between computation and communication in distributed computing," arXiv e-print, 2016, submitted to IEEE Trans. Inf. Theory.
[4] K. Lee, M. Lam, R. Pedarsani, D. Papailiopoulos, and K. Ramchandran, "Speeding up distributed machine learning using codes," arXiv e-print, Dec. 2015.
[5] S. Li, M. A. Maddah-Ali, and A. S. Avestimehr, "A unified coding framework for distributed computing with straggling servers," arXiv e-print, Sept. 2016; a shorter version to appear in IEEE NetCod 2016.
[6] C. Chu, S. K. Kim, Y.-A. Lin, Y. Yu, G. Bradski, A. Y. Ng, and K. Olukotun, "Map-Reduce for machine learning on multicore," Advances in Neural Information Processing Systems, vol. 19.
[7] M. Isard, M. Budiu, Y. Yu, A. Birrell, and D. Fetterly, "Dryad: distributed data-parallel programs from sequential building blocks," ACM SIGOPS Operating Systems Review, vol. 41, no. 3, June 2007.
[8] A. Abouzeid, K. Bajda-Pawlikowski, D. Abadi, A. Silberschatz, and A. Rasin, "HadoopDB: an architectural hybrid of MapReduce and DBMS technologies for analytical workloads," Proceedings of the VLDB Endowment, vol. 2, no. 1, 2009.
[9] J. Ekanayake, T. Gunarathne, G. Fox, A. S. Balkir, C. Poulain, N. Araujo, and R. Barga, "DryadLINQ for scientific analyses," in Fifth IEEE International Conference on e-Science, 2009.
[10] B. Saha, H. Shah, S. Seth, G. Vijayaraghavan, A. Murthy, and C. Curino, "Apache Tez: A unifying framework for modeling and building data processing applications," in Proceedings of the 2015 ACM SIGMOD, 2015.
[11] B. C. Arnold, N. Balakrishnan, and H. N. Nagaraja, A First Course in Order Statistics. SIAM, 1992, vol. 54.


More information

6. Lecture notes on matroid intersection

6. Lecture notes on matroid intersection Massachusetts Institute of Technology 18.453: Combinatorial Optimization Michel X. Goemans May 2, 2017 6. Lecture notes on matroid intersection One nice feature about matroids is that a simple greedy algorithm

More information

Network Coding for Distributed Storage Systems* Presented by Jayant Apte ASPITRG 7/9/13 & 7/11/13

Network Coding for Distributed Storage Systems* Presented by Jayant Apte ASPITRG 7/9/13 & 7/11/13 Network Coding for Distributed Storage Systems* Presented by Jayant Apte ASPITRG 7/9/13 & 7/11/13 *Dimakis, A.G.; Godfrey, P.B.; Wu, Y.; Wainwright, M.J.; Ramchandran, K. "Network Coding for Distributed

More information

THE EFFECT OF JOIN SELECTIVITIES ON OPTIMAL NESTING ORDER

THE EFFECT OF JOIN SELECTIVITIES ON OPTIMAL NESTING ORDER THE EFFECT OF JOIN SELECTIVITIES ON OPTIMAL NESTING ORDER Akhil Kumar and Michael Stonebraker EECS Department University of California Berkeley, Ca., 94720 Abstract A heuristic query optimizer must choose

More information

Observations on Client-Server and Mobile Agent Paradigms for Resource Allocation

Observations on Client-Server and Mobile Agent Paradigms for Resource Allocation Observations on Client-Server and Mobile Agent Paradigms for Resource Allocation M. Bahouya, J. Gaber and A. Kouam Laboratoire SeT Université de Technologie de Belfort-Montbéliard 90000 Belfort, France

More information

Max-Flow Protection using Network Coding

Max-Flow Protection using Network Coding Max-Flow Protection using Network Coding Osameh M. Al-Kofahi Department of Computer Engineering Yarmouk University, Irbid, Jordan Ahmed E. Kamal Department of Electrical and Computer Engineering Iowa State

More information

Joint Entity Resolution

Joint Entity Resolution Joint Entity Resolution Steven Euijong Whang, Hector Garcia-Molina Computer Science Department, Stanford University 353 Serra Mall, Stanford, CA 94305, USA {swhang, hector}@cs.stanford.edu No Institute

More information

Mathematical and Algorithmic Foundations Linear Programming and Matchings

Mathematical and Algorithmic Foundations Linear Programming and Matchings Adavnced Algorithms Lectures Mathematical and Algorithmic Foundations Linear Programming and Matchings Paul G. Spirakis Department of Computer Science University of Patras and Liverpool Paul G. Spirakis

More information

Graph based codes for distributed storage systems

Graph based codes for distributed storage systems /23 Graph based codes for distributed storage systems July 2, 25 Christine Kelley University of Nebraska-Lincoln Joint work with Allison Beemer and Carolyn Mayer Combinatorics and Computer Algebra, COCOA

More information

On the Complexity of Multi-Dimensional Interval Routing Schemes

On the Complexity of Multi-Dimensional Interval Routing Schemes On the Complexity of Multi-Dimensional Interval Routing Schemes Abstract Multi-dimensional interval routing schemes (MIRS) introduced in [4] are an extension of interval routing schemes (IRS). We give

More information

Section 3.1: Nonseparable Graphs Cut vertex of a connected graph G: A vertex x G such that G x is not connected. Theorem 3.1, p. 57: Every connected

Section 3.1: Nonseparable Graphs Cut vertex of a connected graph G: A vertex x G such that G x is not connected. Theorem 3.1, p. 57: Every connected Section 3.1: Nonseparable Graphs Cut vertex of a connected graph G: A vertex x G such that G x is not connected. Theorem 3.1, p. 57: Every connected graph G with at least 2 vertices contains at least 2

More information

HadoopDB: An open source hybrid of MapReduce

HadoopDB: An open source hybrid of MapReduce HadoopDB: An open source hybrid of MapReduce and DBMS technologies Azza Abouzeid, Kamil Bajda-Pawlikowski Daniel J. Abadi, Avi Silberschatz Yale University http://hadoopdb.sourceforge.net October 2, 2009

More information

MapReduce in Streaming Data

MapReduce in Streaming Data MapReduce in Streaming Data October 21, 2013 Instructor : Dr. Barna Saha Course : CSCI 8980 : Algorithmic Techniques for Big Data Analysis Scribe by : Neelabjo Shubhashis Choudhury Introduction to MapReduce,

More information

Realizing Common Communication Patterns in Partitioned Optical Passive Stars (POPS) Networks

Realizing Common Communication Patterns in Partitioned Optical Passive Stars (POPS) Networks 998 IEEE TRANSACTIONS ON COMPUTERS, VOL. 47, NO. 9, SEPTEMBER 998 Realizing Common Communication Patterns in Partitioned Optical Passive Stars (POPS) Networks Greg Gravenstreter and Rami G. Melhem, Senior

More information

Definition: A graph G = (V, E) is called a tree if G is connected and acyclic. The following theorem captures many important facts about trees.

Definition: A graph G = (V, E) is called a tree if G is connected and acyclic. The following theorem captures many important facts about trees. Tree 1. Trees and their Properties. Spanning trees 3. Minimum Spanning Trees 4. Applications of Minimum Spanning Trees 5. Minimum Spanning Tree Algorithms 1.1 Properties of Trees: Definition: A graph G

More information

Improving VoD System Efficiency with Multicast and Caching

Improving VoD System Efficiency with Multicast and Caching Improving VoD System Efficiency with Multicast and Caching Jack Yiu-bun Lee Department of Information Engineering The Chinese University of Hong Kong Contents 1. Introduction 2. Previous Works 3. UVoD

More information

2.3 Algorithms Using Map-Reduce

2.3 Algorithms Using Map-Reduce 28 CHAPTER 2. MAP-REDUCE AND THE NEW SOFTWARE STACK one becomes available. The Master must also inform each Reduce task that the location of its input from that Map task has changed. Dealing with a failure

More information

CHARACTERIZING the capacity region of wireless

CHARACTERIZING the capacity region of wireless IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 56, NO. 5, MAY 2010 2249 The Balanced Unicast and Multicast Capacity Regions of Large Wireless Networks Abstract We consider the question of determining the

More information

ARELAY network consists of a pair of source and destination

ARELAY network consists of a pair of source and destination 158 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL 55, NO 1, JANUARY 2009 Parity Forwarding for Multiple-Relay Networks Peyman Razaghi, Student Member, IEEE, Wei Yu, Senior Member, IEEE Abstract This paper

More information

Performance Evaluation of a Novel Direct Table Lookup Method and Architecture With Application to 16-bit Integer Functions

Performance Evaluation of a Novel Direct Table Lookup Method and Architecture With Application to 16-bit Integer Functions Performance Evaluation of a Novel Direct Table Lookup Method and Architecture With Application to 16-bit nteger Functions L. Li, Alex Fit-Florea, M. A. Thornton, D. W. Matula Southern Methodist University,

More information

Parallelizing Structural Joins to Process Queries over Big XML Data Using MapReduce

Parallelizing Structural Joins to Process Queries over Big XML Data Using MapReduce Parallelizing Structural Joins to Process Queries over Big XML Data Using MapReduce Huayu Wu Institute for Infocomm Research, A*STAR, Singapore huwu@i2r.a-star.edu.sg Abstract. Processing XML queries over

More information

An Optimal Disk Allocation Strategy for Partial Match Queries on Non-Uniform. 1.1 Cartesian Product Files

An Optimal Disk Allocation Strategy for Partial Match Queries on Non-Uniform. 1.1 Cartesian Product Files An Optimal Disk Allocation Strategy for Partial Match Queries on Non-Uniform Cartesian Product Files Sajal K. Das Department of Computer Science University of North Texas Denton, TX 76203-1366 E-mail:

More information

V10 Metabolic networks - Graph connectivity

V10 Metabolic networks - Graph connectivity V10 Metabolic networks - Graph connectivity Graph connectivity is related to analyzing biological networks for - finding cliques - edge betweenness - modular decomposition that have been or will be covered

More information

B.H.GARDI COLLEGE OF ENGINEERING & TECHNOLOGY (MCA Dept.) Parallel Database Database Management System - 2

B.H.GARDI COLLEGE OF ENGINEERING & TECHNOLOGY (MCA Dept.) Parallel Database Database Management System - 2 Introduction :- Today single CPU based architecture is not capable enough for the modern database that are required to handle more demanding and complex requirements of the users, for example, high performance,

More information

A Level-wise Priority Based Task Scheduling for Heterogeneous Systems

A Level-wise Priority Based Task Scheduling for Heterogeneous Systems International Journal of Information and Education Technology, Vol., No. 5, December A Level-wise Priority Based Task Scheduling for Heterogeneous Systems R. Eswari and S. Nickolas, Member IACSIT Abstract

More information

PACKING DIGRAPHS WITH DIRECTED CLOSED TRAILS

PACKING DIGRAPHS WITH DIRECTED CLOSED TRAILS PACKING DIGRAPHS WITH DIRECTED CLOSED TRAILS PAUL BALISTER Abstract It has been shown [Balister, 2001] that if n is odd and m 1,, m t are integers with m i 3 and t i=1 m i = E(K n) then K n can be decomposed

More information

Degrees of Freedom in Cached Interference Networks with Limited Backhaul

Degrees of Freedom in Cached Interference Networks with Limited Backhaul Degrees of Freedom in Cached Interference Networks with Limited Backhaul Vincent LAU, Department of ECE, Hong Kong University of Science and Technology (A) Motivation Interference Channels 3 No side information

More information

Notes for Lecture 24

Notes for Lecture 24 U.C. Berkeley CS170: Intro to CS Theory Handout N24 Professor Luca Trevisan December 4, 2001 Notes for Lecture 24 1 Some NP-complete Numerical Problems 1.1 Subset Sum The Subset Sum problem is defined

More information

Matching and Planarity

Matching and Planarity Matching and Planarity Po-Shen Loh June 010 1 Warm-up 1. (Bondy 1.5.9.) There are n points in the plane such that every pair of points has distance 1. Show that there are at most n (unordered) pairs of

More information

BELOW, we consider decoding algorithms for Reed Muller

BELOW, we consider decoding algorithms for Reed Muller 4880 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 11, NOVEMBER 2006 Error Exponents for Recursive Decoding of Reed Muller Codes on a Binary-Symmetric Channel Marat Burnashev and Ilya Dumer, Senior

More information

Randomized rounding of semidefinite programs and primal-dual method for integer linear programming. Reza Moosavi Dr. Saeedeh Parsaeefard Dec.

Randomized rounding of semidefinite programs and primal-dual method for integer linear programming. Reza Moosavi Dr. Saeedeh Parsaeefard Dec. Randomized rounding of semidefinite programs and primal-dual method for integer linear programming Dr. Saeedeh Parsaeefard 1 2 3 4 Semidefinite Programming () 1 Integer Programming integer programming

More information

On the Robustness of Distributed Computing Networks

On the Robustness of Distributed Computing Networks 1 On the Robustness of Distributed Computing Networks Jianan Zhang, Hyang-Won Lee, and Eytan Modiano Lab for Information and Decision Systems, Massachusetts Institute of Technology, USA Dept. of Software,

More information

Apache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context

Apache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context 1 Apache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context Generality: diverse workloads, operators, job sizes

More information

SPARSE COMPONENT ANALYSIS FOR BLIND SOURCE SEPARATION WITH LESS SENSORS THAN SOURCES. Yuanqing Li, Andrzej Cichocki and Shun-ichi Amari

SPARSE COMPONENT ANALYSIS FOR BLIND SOURCE SEPARATION WITH LESS SENSORS THAN SOURCES. Yuanqing Li, Andrzej Cichocki and Shun-ichi Amari SPARSE COMPONENT ANALYSIS FOR BLIND SOURCE SEPARATION WITH LESS SENSORS THAN SOURCES Yuanqing Li, Andrzej Cichocki and Shun-ichi Amari Laboratory for Advanced Brain Signal Processing Laboratory for Mathematical

More information

Lecture Notes 2: The Simplex Algorithm

Lecture Notes 2: The Simplex Algorithm Algorithmic Methods 25/10/2010 Lecture Notes 2: The Simplex Algorithm Professor: Yossi Azar Scribe:Kiril Solovey 1 Introduction In this lecture we will present the Simplex algorithm, finish some unresolved

More information

Optimal Algorithms for Cross-Rack Communication Optimization in MapReduce Framework

Optimal Algorithms for Cross-Rack Communication Optimization in MapReduce Framework Optimal Algorithms for Cross-Rack Communication Optimization in MapReduce Framework Li-Yung Ho Institute of Information Science Academia Sinica, Department of Computer Science and Information Engineering

More information

INTERLEAVING codewords is an important method for

INTERLEAVING codewords is an important method for IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 51, NO. 2, FEBRUARY 2005 597 Multicluster Interleaving on Paths Cycles Anxiao (Andrew) Jiang, Member, IEEE, Jehoshua Bruck, Fellow, IEEE Abstract Interleaving

More information

Module 7. Independent sets, coverings. and matchings. Contents

Module 7. Independent sets, coverings. and matchings. Contents Module 7 Independent sets, coverings Contents and matchings 7.1 Introduction.......................... 152 7.2 Independent sets and coverings: basic equations..... 152 7.3 Matchings in bipartite graphs................

More information

Part II. Graph Theory. Year

Part II. Graph Theory. Year Part II Year 2017 2016 2015 2014 2013 2012 2011 2010 2009 2008 2007 2006 2005 2017 53 Paper 3, Section II 15H Define the Ramsey numbers R(s, t) for integers s, t 2. Show that R(s, t) exists for all s,

More information

Availability of Coding Based Replication Schemes. Gagan Agrawal. University of Maryland. College Park, MD 20742

Availability of Coding Based Replication Schemes. Gagan Agrawal. University of Maryland. College Park, MD 20742 Availability of Coding Based Replication Schemes Gagan Agrawal Department of Computer Science University of Maryland College Park, MD 20742 Abstract Data is often replicated in distributed systems to improve

More information

New Constructions of Non-Adaptive and Error-Tolerance Pooling Designs

New Constructions of Non-Adaptive and Error-Tolerance Pooling Designs New Constructions of Non-Adaptive and Error-Tolerance Pooling Designs Hung Q Ngo Ding-Zhu Du Abstract We propose two new classes of non-adaptive pooling designs The first one is guaranteed to be -error-detecting

More information

arxiv: v2 [cs.ds] 18 May 2015

arxiv: v2 [cs.ds] 18 May 2015 Optimal Shuffle Code with Permutation Instructions Sebastian Buchwald, Manuel Mohr, and Ignaz Rutter Karlsruhe Institute of Technology {sebastian.buchwald, manuel.mohr, rutter}@kit.edu arxiv:1504.07073v2

More information

SCALING UP OF E-MSR CODES BASED DISTRIBUTED STORAGE SYSTEMS WITH FIXED NUMBER OF REDUNDANCY NODES

SCALING UP OF E-MSR CODES BASED DISTRIBUTED STORAGE SYSTEMS WITH FIXED NUMBER OF REDUNDANCY NODES SCALING UP OF E-MSR CODES BASED DISTRIBUTED STORAGE SYSTEMS WITH FIXED NUMBER OF REDUNDANCY NODES Haotian Zhao, Yinlong Xu and Liping Xiang School of Computer Science and Technology, University of Science

More information

Pebble Sets in Convex Polygons

Pebble Sets in Convex Polygons 2 1 Pebble Sets in Convex Polygons Kevin Iga, Randall Maddox June 15, 2005 Abstract Lukács and András posed the problem of showing the existence of a set of n 2 points in the interior of a convex n-gon

More information

Mobility-Aware Coded Storage and Delivery

Mobility-Aware Coded Storage and Delivery Mobility-Aware Coded Storage and Delivery Emre Ozfatura and Deniz Gündüz Information Processing and Communications Lab Department of Electrical and Electronic Engineering Imperial College London Email:

More information

6 Randomized rounding of semidefinite programs

6 Randomized rounding of semidefinite programs 6 Randomized rounding of semidefinite programs We now turn to a new tool which gives substantially improved performance guarantees for some problems We now show how nonlinear programming relaxations can

More information