The Half Cleaner Lemma: Constructing Efficient Interconnection Networks from Sorting Networks

Size: px
Start display at page:

Download "The Half Cleaner Lemma: Constructing Efficient Interconnection Networks from Sorting Networks"

Transcription

1 Parallel Processing Letters c World Scientific Publishing Company The Half Cleaner Lemma: Constructing Efficient Interconnection Networks from Sorting Networks Tripti Jain and Klaus Schneider Department of Computer Science, TU Kaiserslautern, Germany Received (received date) Revised (revised date) Communicated by (Name of Editor) ABSTRACT In general, efficient non-blocking interconnection networks can be derived from sorting networks, and to this end, one may either follow the merge-based or the radix-based sorting paradigm. Both paradigms require special modifications to handle partial permutations. In this article, we present a general lemma about half cleaner modules that were introduced as building blocks in Batcher s bitonic sorting network. This lemma is the key to prove the correctness of many known optimizations of interconnection networks. In particular, we first show how to use any ternary sorter and a half cleaner for implementing an efficient split module as required for radix-based sorting networks for partial permutations. Second, our lemma formally proves the correctness of another known optimization of the Batcher-Banyan network. Keywords: interconnection networks, sorting networks, concentrators, half cleaners 1. Introduction Non-blocking interconnection networks were initially introduced for telecommunication networks and distributed multiprocessor systems [14, 17] and are nowadays also discussed as on-chip interconnection networks [1, 4]. A network with inputs x 0,..., x n 1 and outputs y 0,..., y n 1 is thereby called non-blocking if every possible permutation π : {0,..., n 1} {0,..., n 1} can be established in the sense that all inputs x i are routed in parallel to outputs y π(i) for i {0,..., n 1}. There is a plethora of non-blocking interconnection networks that cannot be discussed here (see e.g. [1, 4, 14, 17]), so we just consider two well-known examples of non-blocking networks: Crossbars have an optimal depth of O(log(n)), but a size of O(n 2 ) that becomes unacceptable for large n. Beneš networks improve the size of their switching network to O(n log(n)) while keeping the depth in O(log(n)), but require difficult algorithms to determine their configurations [24, 27] that have a depth larger than the network itself, i.e., O(log(n) 2 ). As an alternative, non-blocking interconnection networks can also be derived from sorting networks in that the input messages x i are sorted with respect to their 1

2 2 Parallel Processing Letters MBS(n) := Merge(n) RBS(n) := Split(n) RBS(n/2) RBS(n/2) Fig. 1. Merge-based (MBS) versus radix-based sorting (RBS) paradigms: MBS merges already sorted halves with Merge modules into a sorted output sequence while RBS splits input sequences by Split modules so that the inputs are already in the right halves where they are sorted recursively. target addresses. In general, there are merge-based and radix-based sorting networks as sketched in Figure 1: In the merge-based approach, a sorting network MBS(n) for n inputs is recursively constructed by splitting the given sequence into two halves, recursively sorting these by sorting networks MBS( n 2 ), and then merging the sorted halves by a merge module Merge(n) into a single sorted output sequence. Wellknown sorting networks following this paradigm are Batcher s famous bitonic and oddeven sorters [3] and related networks [28]. The second approach follows the (binary) radix sorting paradigm: The given input sequence is first split into two parts by a Split module according to the most significant bits of the target addresses. Thus, after the Split module, the entries have already been routed to the right halves, so that the remaining problems can be dealt with recursively in the same way (ignoring now the most significant bits of the target addresses to generate local addresses). Both paradigms have already been used for the construction of sorting and interconnection networks. For n = 2 p inputs, both paradigms lead to fat binary trees of depth O(log(n)). For example, Figure 2 shows Batcher s bitonic sorting network a for n = 8 inputs where the three levels of Merge modules (bitonic sorters) where highlighted with brown color. Moreover, each Merge module consists again of a binary tree of half cleaners (pink color; see Definition 2.1), and each half cleaner consists of one column of compare-and-swap modules (blue color, the arrows show where the smaller input is moved to). To be precise, a half cleaner with 2n inputs/outputs consists of n rows of 2 2 crossbar switches where the switch in row i is connected with inputs i and i + n and also with outputs i and i + n. In general, sorting networks can be used as non-blocking interconnection networks in that the routes are simply determined by sorting the messages according to their target addresses. This works well as long as all inputs shall be connected a It has been observed in [25, 30, 31] that the Merge modules can be arranged as reverse banyan butterfly networks [19] as shown in Figure 2 with an ingoing perfect shuffle permutation.

3 Instructions for Typesetting Camera-Ready Manuscripts 3 Fig. 2. Batcher s bitonic sorting network as an example of a merge-based sorting network: Its Merge modules (bitonic sorters) are again tree structures built by half cleaners (columns of compare-and-swap modules). to some output, i.e., as long as every input has a valid target address. However, if a partial permutation shall be established, i.e., if some inputs do not have to be connected to some outputs, the merge-based sorters do not work anymore, while the radix-based sorters still work correctly b. For that reason, one has to either generate the missing addresses to complete the partial permutation or one has to use an additional Banyan network as suggested in [18, 25]. In [25], Narasimha has proven the correctness of the Batcher-Banyan network by first observing that the Ω-network [23], the merge modules of Batcher s bitonic sorting network [3], and the generalized cube network [30] are all topologically isomorphic [19, 32]. It follows that the configurations of these networks can be mapped to each other. In addition to this observation that has been used in [25] to prove the correctness of the Batcher-Banyan network, two optimizations of interconnection networks have been suggested in the same paper. In this paper, we review these optimizations, and prove their correctness (which has not been done in [25]). To that end, we found a simple and general lemma about half cleaner modules that turned out to be the key to all of these optimizations. The outline of the rest of the article is as follows: In the next section, we define half cleaner modules, list Batcher s lemma about half cleaners, and prove then our new lemma. Even though our lemma can be derived from Batcher s original lemma, we give an independent proof, and show then that it is sufficient to prove all the optimizations mentioned above: Section 3 shows an optimization for the implementation of RBS networks using half cleaners, and Section 4 shows another optimization for the implementation of Batcher-Banyan networks using our half cleaner lemma. b The Split modules must become ternary sorters in that case: The inputs are again sorted by the most significant bits of their target addresses, where we use a third value for those inputs not having a target address with the ordering 0 1 (see Lemma 2.2).

4 4 Parallel Processing Letters 2. The Half Cleaner Lemma In this section, we first define half cleaners and then prove our technical lemma that is used in the following two sections to prove the correctness of two known optimizations of interconnection networks. Sorting networks are usually constructed by compare-and-swap switches which have two inputs x 0 and x 1 that are routed to the two outputs y 0 and y 1 such that y 0 := min{x 0, x 1 } and y 1 := max{x 0, x 1 } holds. Since the ordering of the inputs does not matter, we do not label them in the figures, and use an arrow to denote the minimum output y 0. Definition 2.1 (Half Cleaner) A half cleaner with 2n inputs x 0,..., x 2n 1 of some totally ordered set and outputs y 0,..., y 2n 1 consists of n compare-and-swap switches such that switch i has inputs x i and x i+n and outputs y i and y i+n that are computed as y i := min{x i, x n+i } and y n+i := max{x i, x n+i } for i = 0,..., n 1. x7 x6 x5 x4 x3 x2 x1 x0 y7 y6 y5 y4 y3 y2 y1 y0 Fig. 3. A half cleaner module with 8 inputs and 8 outputs consisting of four 2 2 compare-andswap switches (the arrows of compare-and-swap switches point towards the minimum output). Half cleaner modules were introduced in [3] as building blocks for the Merge modules used in Batcher s bitonic sorting network (Figure 2). To prove its correctness, Batcher proved the following Lemma in [3]: Lemma 2.1 (Half Cleaner Lemma (I)) Given a bitonic c sequence x 0,..., x 2n 1 with elements x i of some totally ordered set as input to a half cleaner, the following holds for its outputs y 0,..., y 2n 1 : y i y n+j for all i, j {0,..., n 1} and both halves y 0,..., y n 1 and y n,..., y 2n 1 are bitonic sequences. c The sequence x 0,..., x 2n 1 is (strongly) bitonic if there is some k such that x 0... x k... x 2n 1 holds, and it is bitonic if it is a rotation of a strongly bitonic sequence.

5 Instructions for Typesetting Camera-Ready Manuscripts 5 Batcher used the above lemma for the recursive construction of Merge modules as shown in Figure 9 so that the MBS paradigm shown in Figure 1 yields the sorting network shown in Figure 2: Note that every Merge module itself is recursively constructed by one half cleaner module and two Merge modules of half the size. We prove a simpler lemma about the half cleaner module which will be useful for optimizations of interconnection networks that we discuss in the next two sections. Our lemma is the following one: Lemma 2.2 (Half Cleaner Lemma (II)) Given sorted lists a 0... a n 1 and b 0... b n 1 with elements a i, b i {0,, 1} with the total order 0 1 such that the following holds (i.e., both the numbers of 0s and 1s contained are n): {i {0,..., n 1} a i = 0} + {i {0,..., n 1} b i = 0} n {i {0,..., n 1} a i = 1} + {i {0,..., n 1} b i = 1} n If the input sequence a 0,..., a n 1, b 0,..., b n 1 is given as input to a half cleaner, we obviously obtain the following outputs y 0,..., y 2n 1 for i = 0,..., n 1: y i := min{a i, b i } and y n+i := max{a i, b i } It then follows that we have y i {0, } and y n+i {, 1} for i = 0,..., n 1, so that the left half y 0,..., y n 1 contains all values 0 while the right half y n,..., y 2n 1 contains all values 1 of the input lists. Exchanging input sequences a 0... a n 1 and b 0... b n 1 with each other yields the same results. Proof: We first prove that the two values a i and b i that arrive at a switch of the half cleaner can be neither both 0 nor can they be both 1: Assuming that both a i and b i would be 0, it would follow that at least all a 0,..., a i and also all b i,..., b n 1 would have to be 0 since the input lists a and b are sorted. However, these are (i + 1) + (n i) = n + 1 values, which is in contradiction to the assumption that at most n values are 0. Assuming that both a i and b i would be 1, it would follow that at least all a i,..., a n 1 and also all b 0,..., b i would have to be 1 since the input lists a and b are sorted. However, these are (n i) + (i + 1) = n + 1 values, which is in contradiction to the assumption that at most n values are 1. Hence, it is impossible that two input values a i, b i of switch i in the half cleaner would be both 0 or both 1 (but both could be ). Therefore, y i := min{a i, b i } {0, } and y n+i := max{a i, b i } {, 1} for i = 0,..., n 1 as can be seen by the table in Figure 4. We will show in the next two sections how the above lemma can be used to implement a Split module for the construction of an RBS network, and how we can prove the correctness of Narasimha s optimization of the Batcher-Banyan network.

6 6 Parallel Processing Letters a i b i y i := min{a i, b i } y n+i := max{a i, b i } Fig. 4. Possible/impossible inputs and outputs of compare-and-swap switches of the half cleaner under the assumptions given in Lemma 2.2: The first and the last input/output rows cannot occur, and therefore y i 1 and y i+n 0 holds for i = 0,..., n From Sorters to Splitters As can be seen in Figure 1, MBS and RBS sorting networks are completely determined by their Merge and Split modules, respectively. For RBS networks, the construction of Split modules can be reduced to the construction of so-called concentrators [11, 29]. A (n, m)-concentrator is thereby a circuit with n inputs and m n outputs that can route any given subset of k m valid inputs to k of its m outputs. To construct a Split module by concentrators, we make use of two (n, n 2 )- concentrators: The first one routes inputs with most significant target address bit 0 to its n 2 outputs, while the other one routes the inputs with most significant target address bit 1 to its n 2 outputs, so that combining the outputs, we obtain a Split module. This construction of Split modules by concentrators can also be used for partial permutations, where a Split module has to consider ternary inputs {0,, 1}, so that RBS networks can also be used for partial permutations. However, the construction of efficient concentrators is a very difficult problem that has been considered in many previous research papers (see [19] for further references). While theoretical results indicate that O(log(n)) depth and O(n) size concentrators exist [11, 29], the practical concentrators have a depth of O(log(n) a ) and a size of O(n log(n) b ) for small constants a, b in terms of circuit gates [9, 20 22]. It is also possible to use sorting networks for binary sequences to implement Split modules. In fact, for total permutations, any binary sorter could be used to sort the inputs by the most significant bits of their target addresses. Hence, binary sorters as described in [9, 10, 21, 22, 26] could be used for that purpose, since exactly half of the inputs have a target address with most significant bit 0 (same for 1, of course). For partial permutations, we have to replace the binary sorters by ternary sorters, since we then find target addresses with most significant bits 0,, 1 where we use as most significant bit of those inputs that have no associated target address.

7 Instructions for Typesetting Camera-Ready Manuscripts 7 The size and depth of ternary sorters obviously grows with the number of inputs n. For example, the bitonic sorting network for n inputs has depth 1 2 log(n)(log(n)+ 1) and uses n 4 log(n)(log(n) + 1) compare-swap switches, while the Batcher oddeven sorting network for n inputs has the same depth 1 2 log(n)(log(n) + 1), but uses only log(n)(log(n) 1) + n 1 compare-swap switches. n 4 Split(n) := Sort(n/2) Sort(n/2) HC(n) Fig. 5. Implementing Split modules by two ternary sorting networks and a half cleaner module. Fortunately, Narasimha observed in [25] (Section V) that one can construct Split modules with n inputs/outputs using two sorting networks with n 2 inputs/outputs and a half cleaner with n inputs/outputs as shown in Figure 5. The same construction has been implicitly used in [22], but no correctness proofs were reported so far. Definition 3.1 (Half Cleaner Optimization of Split Modules) Given two sorting networks with n inputs/outputs and a half cleaner with 2n inputs/outputs, one can construct a Split module with 2n inputs/outputs as shown in Figure 5. Clearly, we assume that the constructed Split module should work as expected so that we can use it in RBS networks even for partial permutations. Narasimha did not prove the correctness of his construction, but it can be almost immediately derived from our half cleaner lemma: Theorem 1 (Half Cleaner Optimization of Split Modules) Given an input sequence x 0,..., x 2n 1 where x i {0,, 1} and where at most n of the inputs x i are 0 and also at most n of the inputs x i are 1, then the outputs y 0,..., y 2n 1 of the Split module shown in Figure 5 will satisfy the following: y i {0, } for i = 0,..., n 1 y n+i {, 1} for i = 0,..., n 1 Hence, all x i = 0 are routed to the lower half y 0,..., y n 1 and all x i = 1 are routed to the upper half y n,..., y 2n 1. Proof: Given a ternary input sequence x 0,..., x 2n 1, we can sort its lower half x 0,..., x n 1 and its upper half x n,..., x 2n 1 separately using the sorting networks shown in Figure 5. The outputs of the lower and upper sorting networks are the sequences b 0... b n 1 and a 0... a n 1 mentioned in Lemma 2.2, respectively. Thus, the above theorem follows now directly from Lemma 2.2.

8 8 Parallel Processing Letters According to the above theorem, the output sequence y 0,..., y 2n 1 is partitioned such that the 0s are in the lower half, the 1s are in the upper half, and inputs may occur in both halves. Hence, the half cleaner and the two sorting networks implement a Split module as required for the recursive construction of a RBS network even in case of partial permutations. Sort(4) HC(4) HC(8) Sort(4) HC(4) Fig. 6. A RBS network for 8 inputs constructed by Split modules according to Figure 5. Figure 6 shows a general RBS network for 8 inputs constructed by Split modules according to Figure 5. Size and depth of these Split modules are mainly determined by the used sorting networks, and while [25] used bitonic sorters, there are better ones in terms of the size like the Batcher oddeven network: Using oddeven sorters, we obtain RBS networks of depth 1 6 log(n)(log(n)2 + 5) and exactly n 12 log(n)(log(n)2 3 log(n)+20) 2n+2 compare-swap switches, while the use of bitonic sorters would require n 12 log(n)(log(n)2 + 5) switches (having the same depth). Fig. 7. A RBS network for 8 inputs constructed by Split modules according to Figure 5 where bitonic sorters were used as sorting networks.

9 Instructions for Typesetting Camera-Ready Manuscripts 9 Figure 7 expands the sorting networks in Figure 6 by bitonic sorting networks, and Figure 8 does the same using Batcher s oddeven sorters instead. Note that the networks are simply ternary comparators. Using these sorting networks, we finally obtain RBS networks of depth O(log(n) 3 ) and size O(n log(n) 3 ). Fig. 8. A RBS network for 8 inputs constructed by Split modules according to Figure 5 where oddeven sorters were used as sorting networks. While Batcher s sorting networks are still the best practical networks that can be recursively constructed for any power of 2, it is known that sorting networks of depth O(log(n)) and size O(n log(n)) exist [2]. While the latter are practically useless due to the large constants of their asympotic complexity, they still prove that there are Split modules of the mentioned complexities. We further remark that for particular numbers n, optimal-depth and optimal-size sorting networks are known: Currently, optimal-depth networks are known up to size n = 17 and optimal-size networks are known up to size n = 10 [5 8, 12, 12, 13, 15, 16]. They can obviously be used to construct efficient Split modules with the help of half cleaners. Note that in RBS networks, we only need ternary sorters to compare the most significant bits of the target addresses with each other. Above, we reduced Batcher s sorting networks to the ternary case, but we can also use special generalizations of binary sorters. This way, one can also implement RBS networks of depth O(log(n) 3 ) and size O(n log(n) 3 ) which are even better than those obtained with Batcher s sorters, since these binary sorters exploit the fact that their inputs are binary sequences. A ternary sorter can be easily implemented by two binary sorters that consider as 0 and 1, respectively, and then taking the 1s from the first, and the 0s from the second, declaring all remaining outputs as invalid ones. In [20], we have implemented different Split modules for the binary case according to the optimization described in this section. The added value of this section is to prove its correctness also for partial permutations which is required in practice.

10 10 Parallel Processing Letters 4. Optimizing Merge-Based Interconnection Networks Using sorting networks and half cleaners to implement Split modules according to Figure 5, we finally obtain RBS networks of depth O(log(n) 3 ) and size O(n log(n) 3 ). MBS networks have only a depth of O(log(n) 2 ) and size O(n log(n) 2 ) in terms of compare-and-swap switches. However, their comparators have to compare the full target addresses in each compare-and-swap switch. Since a network for n inputs has addresses with log(n) bits that can be compared with circuits of size O(log(n)) and depth O(log(log(n))) (in terms of circuit gates), these MBS networks yield interconnection networks of depth O(log(n) 2 log(log(n))) and size O(n log(n) 3 ) (in terms of circuit gates). Hence, we obtain the same circuit size, but a better depth. For this reason, MBS networks are also of interest, even though they alone cannot route partial permutations. To generalize them to partial permutations, a further Banyan network must be added to construct a Batcher-Banyan network [18, 25]. In [25], an optimization of the Batcher-Banyan has been suggested whose correctness can be again easily proved by our Lemma 2.2 as we will see in this section. Merge(n) := HC(n) Fig. 9. Recursive construction of Merge networks by means of half cleaner modules due to [3]. These Merge networks are actually bitonic sorters, i.e., able to sort all bitonic sequences [3]. According to Batcher [3], bitonic merge modules can be constructed by half cleaners as shown in Figure 9. Their correctness can be proved by induction using Lemma 2.1: If a given sequence is bitonic, then the half cleaner will generate two sequences according to Lemma 2.1 which are both bitonic and where all elements of one of the two sequences are less than or equal to all elements in the other sequence. For this reason, all elements are already in the right halves, and since the sequences are also bitonic, one can recursively sort them using bitonic sorters as shown in Figure 9 where in this case Merge modules are bitonic sorters. MBS(n) Merge(n) HC(n) := := Fig. 10. Expanding the construction of MBS networks by Figure 9.

11 Instructions for Typesetting Camera-Ready Manuscripts 11 For this reason, the recursive definition of the MBS network expands as shown in Figure 10 to Batcher s bitonic sorting network. Batcher s bitonic sorting network can sort all input sequences, and can therefore be used as a non-blocking interconnection network for total permutations. However, it cannot directly be used to route partial permutations which usually occur in practice. To that end, [18] extended the sorting network by an additional Banyan network which is then called the Batcher-Banyan network. The Batcher-Banyan network consists of a MBS sorting network like Batcher s bitonic sorting network, and a further Banyan network. As discussed in [25], the Banyan network can be any permutation network that must be able to route at least all bitonic sequences. In particular, we might directly use a Merge module as shown in Figure 9. Therefore, an instance of the Batcher-Banyan network can be implemented by instantiating the MBS sorting network on the left hand side of Figure 11 by Batcher s bitonic sorting network, and by instantiating the Banyan network (drawn as Merge module in Figure 11) using Batcher s bitonic sorter, i.e., the Merge modules given in Figure 9. If we expand both modules, we then obtain the structure shown on the right hand side of Figure 11 that can be improved as follows according to [25]: MBS(n) Merge(n) := HC(n) 1 : : 0 HC(n) sorting routing sorting routing Fig. 11. Optimization of the Batcher-Banyan network due to Narasimha [25] where Batcher s bitonic sorting network is used as MBS sorting network and Batcher s bitonic sorter is used as Banyan network. Theorem 2 (Half Cleaner Optimization of Batcher-Banyan Networks) The second half cleaner module of the Batcher-Banyan network shown in Figure 11 can be removed without changing the behavior of the network. Proof: Consider an input sequence x 0,..., x 2n 1 where each input x i consists of a message and a potential target address (in case there is no target address, we use again the symbol as target address). The MBS sorting networks shown on the right hand side in Figure 11 first sort the halves x 0,..., x n 1 and x n,..., x 2n 1 independently using the ordering 0... n 1 n... 2n 1. For this reason, the most significant bits of the outputs of the MBS sorting networks are of the form a 0... a n 1 and b 0... b n 1, respectively, with elements a i, b i {0,, 1} using the total order 0 1. By Lemma 2.2, it then follows that the first half cleaner routes the values already in their right

12 12 Parallel Processing Letters halves, however, not yet as sorted sequences. This is the important observation to understand that the second half cleaner is not required since the inputs are already in the right halves after the first half cleaner. By Lemma 2.1, the sequences after the first half cleaner are bitonic so that the two Merge modules in the next column will sort them with respect to the ordering 0... n 1 n... 2n 1. The combined output sequence of both Merge modules is therefore a sorted sequence and since the input sequence was a partial permutation, it follows that the sequences of most significant bits of these lower and upper halves are of the form 0 and 1, respectively (note that the number of inputs with most significant bits 0 and 1 can be at most half of the number of inputs). Now, consider the routing part where the switches are no longer configured by comparing the full target addresses of their inputs using the ordering. Instead, each one of the log(n) columns of switches in that part only considers one address bit of the target addresses as if it would be a RBS network, and to that end, the ordering 0 1 is used. By our half cleaner lemma, i.e., Lemma 2.2, it therefore follows that the values still remain in the halves where they came from. Since its inputs are already sorted sequences with respect to and since this is consistent with the ordering 0 1 of their most significant bits (but not of other bits), we can safely remove the second half cleaner. 5. Conclusions This paper proves a lemma about half cleaner modules that can also be derived from Batcher s original lemma about the half cleaner. However, our lemma focuses directly on ternary input sequences, i.e., sequences with elements taken from the set {0,, 1}. The proof of the lemma can therefore be done in a simpler way, and the lemma itself can be directly applied to interconnection networks for partial permutations that are constructed by means of half cleaners. In particular, this is the case for the construction of Split modules for interconnection networks based on radix-sorting: Using our lemma, we can construct a Split module for n inputs/outputs using two ternary sorters of size n 2 and a half cleaner with n inputs/outputs. Our lemma can be used to prove the correctness of RBS networks if these ternary Split modules are used. Second, our lemma has been applied to prove the correctness of an optimization of the Batcher-Banyan network [18] as suggested by Narasimha [25]. Even though Narasimha proved the correctness of the Batcher-Banyan network in [25], he did not prove the correctness of his optimization. Our lemma simplifies the correctness proof of his optimization and makes it more comprehensive.

13 REFERENCES 13 References [1] A. Agarwal, C. Iskander, and R. Shankar. Survey of network on chip (NoC) architectures and contributions. Journal of Engineering, Computing and Architecture, 3(1), [2] M. Ajtai, J. Komlos, and E. Szemeredi. An O(n log(n)) sorting network. In Symposium on Theory of Computing (STOC), pages 1 9. ACM, [3] K. Batcher. Sorting networks and their applications. In AFIPS Spring Joint Computer Conference, volume 32, pages , [4] T. Bjerregaard and S. Mahadevan. A survey of research and practices of network-on-chip. ACM Computing Surveys (CSUR), 38(1):1 51, March [5] D. Bundala, M. Codish, L. Cruz-Filipe, P. Schneider-Kamp, and J. Závodný. Optimal-depth sorting networks. arxiv Report arxiv: , Cornell University Library, [6] D. Bundala, M. Codish, L. Cruz-Filipe, P. Schneider-Kamp, and J. Závodný. Optimal-depth sorting networks. Journal of Computer and System Sciences, 84: , [7] D. Bundala and J. Zavodny. Optimal sorting networks. In A.-H. Dediu, C. Martin-Vide, J.-L. Sierra-Rodríguez, and B. Truthe, editors, Language and Automata Theory and Applications (LATA), volume 8370 of LNCS, pages , Madrid, Spain, Springer. [8] D. Bundala and J. Závodný. Optimal sorting networks. arxiv Report arxiv: , Cornell University Library, [9] W.-J. Cheng and W.-T. Chen. A new self-routing permutation network. IEEE Transactions on Computers, 45(5): , May [10] M. Chien and A. Oruç. High performance concentrators and superconcentrators using multiplexing schemes. IEEE Transactions on Communications, 42(11): , November [11] F. Chung. On concentrators, superconcentrators, generalizers, and nonblocking networks. The Bell Systems Technical Journal, 58(8): , October [12] M. Codish, L. Cruz-Filipe, T. Ehlers, M. Müller, and P. Schneider-Kamp. Sorting networks: to the end and back again. arxiv Report arxiv: v1, Cornell University Library, July [13] M. Codish, L. Cruz-Filipe, M. Frank, and P. Schneider-Kamp. Twenty-five comparators is optimal when sorting nine inputs (and twenty-nine for ten). arxiv.org Report arxiv: v3, Cornell University Library, [14] W. Dally and B. Towles. Principles and Practices of Interconnection Networks. Morgan Kaufmann, [15] T. Ehlers and M. Müller. Faster sorting networks for 17, 19 and 20 inputs. arxiv Report arxiv: , Cornell University Library, October [16] T. Ehlers and M. Müller. New bounds on optimal sorting networks. arxiv Report arxiv: v1, Cornell University Library, January 2015.

14 14 REFERENCES [17] T. Feng. A survey of interconnection networks. IEEE Computer, 14(12):12 27, December [18] A. Huang and S. Knauer. Starlite: A wideband digital switch. In Global Telecommunications Conference (GLOBECOM), pages , [19] T. Jain and K. Schneider. Verifying the concentration property of permutation networks by BDDs. In E. Leonard and K. Schneider, editors, Formal Methods and Models for Codesign (MEMOCODE), pages 43 53, Kanpur, India, IEEE Computer Society. [20] T. Jain, K. Schneider, and A. Jain. Deriving concentrators from binary sorters using half cleaners. In Reconfigurable Computing and FPGAs (ReConFig), Cancun, Mexico, IEEE Computer Society. [21] T. Jain, K. Schneider, and A. Jain. An efficient self-routing and non-blocking interconnection network on chip. In Network on Chip Architectures (NoCArc), pages 4:1 4:6, Boston, MA, USA, ACM. [22] D. Koppelman and A. Oruç. A self-routing permutation network. Journal of Parallel and Distributed Computing, 10(2): , [23] D. Lawrie. Access and alignment of data in an array processor. IEEE Transactions on Computers (T-C), 24: , [24] C.-Y. Lee and A. Oruç. A fast parallel algorithm for routing unicast assignments in Beneš networks. IEEE Transactions on Parallel and Distributed Systems, 6(3): , March [25] M. Narasimha. The Batcher-Banyan self-routing network: universality and simplification. IEEE Transactions on Communications, 36(10): , October [26] M. Narasimha. A recursive concentrator structure with applications to selfrouting switching networks. IEEE Transactions on Communications, 42(2-4): , February/March/April [27] D. Nassimi and S. Sahni. Parallel algorithms to set up the Beneš permutation network. IEEE Transactions on Computers (T-C), 31(2): , February [28] I. Parberry. The pairwise sorting network. Parallel Processing Letters (PPL), 2(2-3): , [29] M. Pinsker. On the complexity of a concentrator. In International Teletraffic Conference (ITC), pages 318:1 318:4, Stockholm, Sweden, [30] H. Siegel and S. Smith. Study of multistage SIMD interconnection networks. In International Symposium on Computer Architecture (ISCA), pages ACM, [31] H. Stone. Parallel processing with the perfect shuffle. IEEE Transactions on Computers (T-C), 20: , [32] C. Wu and T. Feng. On a class of multistage interconnection networks. IEEE Transactions on Computers (T-C), 29(8): , August 1980.

Sorting Networks. March 2, How much time does it take to compute the value of this circuit?

Sorting Networks. March 2, How much time does it take to compute the value of this circuit? Sorting Networks March 2, 2005 1 Model of Computation Can we do better by using different computes? Example: Macaroni sort. Can we do better by using the same computers? How much time does it take to compute

More information

arxiv: v1 [cs.ds] 10 Oct 2014

arxiv: v1 [cs.ds] 10 Oct 2014 Faster Sorting Networks for 17 19 and 20 Inputs Thorsten Ehlers and Mike Müller Institut für Informatik Christian-Albrechts-Universität zu Kiel D-24098 Kiel Germany. {themimu}@informatik.uni-kiel.de arxiv:1410.2736v1

More information

Chapter 19. Sorting Networks Model of Computation. By Sariel Har-Peled, December 17,

Chapter 19. Sorting Networks Model of Computation. By Sariel Har-Peled, December 17, Chapter 19 Sorting Networks By Sariel Har-Peled, December 17, 2012 1 19.1 Model of Computation It is natural to ask if one can perform a computational task considerably faster by using a different architecture

More information

Distributed Sorting. Chapter Array & Mesh

Distributed Sorting. Chapter Array & Mesh Chapter 9 Distributed Sorting Indeed, I believe that virtually every important aspect of programming arises somewhere in the context of sorting [and searching]! Donald E. Knuth, The Art of Computer Programming

More information

Design and Implementation of Multistage Interconnection Networks for SoC Networks

Design and Implementation of Multistage Interconnection Networks for SoC Networks International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.2, No.5, October 212 Design and Implementation of Multistage Interconnection Networks for SoC Networks Mahsa

More information

Space-division switch fabrics. Copyright 2003, Tim Moors

Space-division switch fabrics. Copyright 2003, Tim Moors 1 Space-division switch fabrics 2 Outline: Space-division switches Single-stage Crossbar, Knockout Staged switches: Multiple switching elements between input and output Networks of basic elements Clos

More information

Network Algorithms. Distributed Sorting

Network Algorithms. Distributed Sorting Network Algorithms Distributed Sorting Stefan Schmid @ T-Labs, 2 Sorting Distributed Sorting Graph with n nodes {v,,v n } and n values. Goal: node vi should store i-th smallest value. v 6 v 2 2 v 3 7 v

More information

Prefix Computation and Sorting in Dual-Cube

Prefix Computation and Sorting in Dual-Cube Prefix Computation and Sorting in Dual-Cube Yamin Li and Shietung Peng Department of Computer Science Hosei University Tokyo - Japan {yamin, speng}@k.hosei.ac.jp Wanming Chu Department of Computer Hardware

More information

Efficient Bitonic Communication for the Parallel Data Exchange

Efficient Bitonic Communication for the Parallel Data Exchange , October 19-21, 2011, San Francisco, USA Efficient Bitonic Communication for the Parallel Data Exchange Ralf-Peter Mundani, Ernst Rank Abstract Efficient communication is still a severe problem in many

More information

DAVID M. KOPPELMAN 2735 East 65th Street Brooklyn, NY 11234

DAVID M. KOPPELMAN 2735 East 65th Street Brooklyn, NY 11234 A SELF ROUTING PERMUTATION NETWORK DAVID M. KOPPELMAN 735 East 65th Street Brooklyn, NY 34 A. YAVUZ ORUÇ Electrical Engineering Department University of Maryland and Institute for Advanced Computer Studies

More information

A COMPARISON OF MESHES WITH STATIC BUSES AND HALF-DUPLEX WRAP-AROUNDS. and. and

A COMPARISON OF MESHES WITH STATIC BUSES AND HALF-DUPLEX WRAP-AROUNDS. and. and Parallel Processing Letters c World Scientific Publishing Company A COMPARISON OF MESHES WITH STATIC BUSES AND HALF-DUPLEX WRAP-AROUNDS DANNY KRIZANC Department of Computer Science, University of Rochester

More information

Efficient Prefix Computation on Faulty Hypercubes

Efficient Prefix Computation on Faulty Hypercubes JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 17, 1-21 (21) Efficient Prefix Computation on Faulty Hypercubes YU-WEI CHEN AND KUO-LIANG CHUNG + Department of Computer and Information Science Aletheia

More information

Maximal Monochromatic Geodesics in an Antipodal Coloring of Hypercube

Maximal Monochromatic Geodesics in an Antipodal Coloring of Hypercube Maximal Monochromatic Geodesics in an Antipodal Coloring of Hypercube Kavish Gandhi April 4, 2015 Abstract A geodesic in the hypercube is the shortest possible path between two vertices. Leader and Long

More information

A 12-STEP SORTING NETWORK FOR 22 ELEMENTS

A 12-STEP SORTING NETWORK FOR 22 ELEMENTS A 12-STEP SORTING NETWORK FOR 22 ELEMENTS SHERENAZ W. AL-HAJ BADDAR Department of Computer Science, Kent State University Kent, Ohio 44240, USA KENNETH E. BATCHER Department of Computer Science, Kent State

More information

Efficient Algorithms for Checking the Equivalence of Multistage Interconnection Networks

Efficient Algorithms for Checking the Equivalence of Multistage Interconnection Networks Efficient Algorithms for Checking the Equivalence of Multistage Interconnection Networks Tiziana Calamoneri Dip. di Informatica Annalisa Massini Università di Roma La Sapienza, Italy. via Salaria 113-00198

More information

Lecture 4: September 11, 2003

Lecture 4: September 11, 2003 Algorithmic Modeling and Complexity Fall 2003 Lecturer: J. van Leeuwen Lecture 4: September 11, 2003 Scribe: B. de Boer 4.1 Overview This lecture introduced Fixed Parameter Tractable (FPT) problems. An

More information

arxiv: v3 [cs.dm] 24 Jun 2014

arxiv: v3 [cs.dm] 24 Jun 2014 Twenty-Five Comparators is Optimal when Sorting Nine Inputs (and Twenty-Nine for Ten) Michael Codish 1, Luís Cruz-Filipe 2, Michael Frank 1, and Peter Schneider-Kamp 2 1 Department of Computer Science,

More information

Question 7.11 Show how heapsort processes the input:

Question 7.11 Show how heapsort processes the input: Question 7.11 Show how heapsort processes the input: 142, 543, 123, 65, 453, 879, 572, 434, 111, 242, 811, 102. Solution. Step 1 Build the heap. 1.1 Place all the data into a complete binary tree in the

More information

Clustering Using Graph Connectivity

Clustering Using Graph Connectivity Clustering Using Graph Connectivity Patrick Williams June 3, 010 1 Introduction It is often desirable to group elements of a set into disjoint subsets, based on the similarity between the elements in the

More information

A quasi-nonblocking self-routing network which routes packets in log 2 N time.

A quasi-nonblocking self-routing network which routes packets in log 2 N time. A quasi-nonblocking self-routing network which routes packets in log 2 N time. Giuseppe A. De Biase Claudia Ferrone Annalisa Massini Dipartimento di Scienze dell Informazione, Università di Roma la Sapienza

More information

Balance of Processing and Communication using Sparse Networks

Balance of Processing and Communication using Sparse Networks Balance of essing and Communication using Sparse Networks Ville Leppänen and Martti Penttonen Department of Computer Science University of Turku Lemminkäisenkatu 14a, 20520 Turku, Finland and Department

More information

Introduction to Parallel Computing

Introduction to Parallel Computing Introduction to Parallel Computing George Karypis Sorting Outline Background Sorting Networks Quicksort Bucket-Sort & Sample-Sort Background Input Specification Each processor has n/p elements A ordering

More information

On Convex Polygons in Cartesian Products

On Convex Polygons in Cartesian Products On Convex Polygons in Cartesian Products Jean-Lou De Carufel 1, Adrian Dumitrescu 2, Wouter Meulemans 3, Tim Ophelders 3, Claire Pennarun 4, Csaba D. Tóth 5, and Sander Verdonschot 6 1 University of Ottawa,

More information

Design and Analysis of Algorithms

Design and Analysis of Algorithms Design and Analysis of Algorithms CSE 5311 Lecture 8 Sorting in Linear Time Junzhou Huang, Ph.D. Department of Computer Science and Engineering CSE5311 Design and Analysis of Algorithms 1 Sorting So Far

More information

The strong chromatic number of a graph

The strong chromatic number of a graph The strong chromatic number of a graph Noga Alon Abstract It is shown that there is an absolute constant c with the following property: For any two graphs G 1 = (V, E 1 ) and G 2 = (V, E 2 ) on the same

More information

John Mellor-Crummey Department of Computer Science Rice University

John Mellor-Crummey Department of Computer Science Rice University Parallel Sorting John Mellor-Crummey Department of Computer Science Rice University johnmc@rice.edu COMP 422/534 Lecture 23 6 April 2017 Topics for Today Introduction Sorting networks and Batcher s bitonic

More information

Lecture 8 Parallel Algorithms II

Lecture 8 Parallel Algorithms II Lecture 8 Parallel Algorithms II Dr. Wilson Rivera ICOM 6025: High Performance Computing Electrical and Computer Engineering Department University of Puerto Rico Original slides from Introduction to Parallel

More information

Constructing arbitrarily large graphs with a specified number of Hamiltonian cycles

Constructing arbitrarily large graphs with a specified number of Hamiltonian cycles Electronic Journal of Graph Theory and Applications 4 (1) (2016), 18 25 Constructing arbitrarily large graphs with a specified number of Hamiltonian cycles Michael School of Computer Science, Engineering

More information

CSC 447: Parallel Programming for Multi- Core and Cluster Systems

CSC 447: Parallel Programming for Multi- Core and Cluster Systems CSC 447: Parallel Programming for Multi- Core and Cluster Systems Parallel Sorting Algorithms Instructor: Haidar M. Harmanani Spring 2016 Topic Overview Issues in Sorting on Parallel Computers Sorting

More information

Sorting Algorithms. Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar

Sorting Algorithms. Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar Sorting Algorithms Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar To accompany the text ``Introduction to Parallel Computing'', Addison Wesley, 2003. Topic Overview Issues in Sorting on Parallel

More information

An Improved Upper Bound for the Sum-free Subset Constant

An Improved Upper Bound for the Sum-free Subset Constant 1 2 3 47 6 23 11 Journal of Integer Sequences, Vol. 13 (2010), Article 10.8.3 An Improved Upper Bound for the Sum-free Subset Constant Mark Lewko Department of Mathematics University of Texas at Austin

More information

ARITHMETIC operations based on residue number systems

ARITHMETIC operations based on residue number systems IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 53, NO. 2, FEBRUARY 2006 133 Improved Memoryless RNS Forward Converter Based on the Periodicity of Residues A. B. Premkumar, Senior Member,

More information

An NC Algorithm for Sorting Real Numbers

An NC Algorithm for Sorting Real Numbers EPiC Series in Computing Volume 58, 2019, Pages 93 98 Proceedings of 34th International Conference on Computers and Their Applications An NC Algorithm for Sorting Real Numbers in O( nlogn loglogn ) Operations

More information

Parallel Systems Course: Chapter VIII. Sorting Algorithms. Kumar Chapter 9. Jan Lemeire ETRO Dept. Fall Parallel Sorting

Parallel Systems Course: Chapter VIII. Sorting Algorithms. Kumar Chapter 9. Jan Lemeire ETRO Dept. Fall Parallel Sorting Parallel Systems Course: Chapter VIII Sorting Algorithms Kumar Chapter 9 Jan Lemeire ETRO Dept. Fall 2017 Overview 1. Parallel sort distributed memory 2. Parallel sort shared memory 3. Sorting Networks

More information

Discrete Applied Mathematics. A revision and extension of results on 4-regular, 4-connected, claw-free graphs

Discrete Applied Mathematics. A revision and extension of results on 4-regular, 4-connected, claw-free graphs Discrete Applied Mathematics 159 (2011) 1225 1230 Contents lists available at ScienceDirect Discrete Applied Mathematics journal homepage: www.elsevier.com/locate/dam A revision and extension of results

More information

An Optimal Parallel Algorithm for Merging using Multiselection

An Optimal Parallel Algorithm for Merging using Multiselection An Optimal Parallel Algorithm for Merging using Multiselection Narsingh Deo Amit Jain Muralidhar Medidi Department of Computer Science, University of Central Florida, Orlando, FL 32816 Keywords: selection,

More information

An 11-Step Sorting Network for 18 Elements. Sherenaz W. Al-Haj Baddar, Kenneth E. Batcher

An 11-Step Sorting Network for 18 Elements. Sherenaz W. Al-Haj Baddar, Kenneth E. Batcher An -Step Sorting Network for 8 Elements Sherenaz W. Al-Haj Baddar, Kenneth E. Batcher Kent State University Department of Computer Science Kent, Ohio 444 salhajba@cs.kent.edu batcher@cs.kent.edu Abstract

More information

An Energy-Delay Efficient Subword Permutation Unit

An Energy-Delay Efficient Subword Permutation Unit An Energy-Delay Efficient Subword Permutation Unit Giorgos Dimitrakopoulos, Christos Mavrokefalidis, Costas Galanopoulos, and Dimitris Nikolos Technology and Computer Architecture ab Computer Engineering

More information

6.856 Randomized Algorithms

6.856 Randomized Algorithms 6.856 Randomized Algorithms David Karger Handout #4, September 21, 2002 Homework 1 Solutions Problem 1 MR 1.8. (a) The min-cut algorithm given in class works because at each step it is very unlikely (probability

More information

Triangle Graphs and Simple Trapezoid Graphs

Triangle Graphs and Simple Trapezoid Graphs JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 18, 467-473 (2002) Short Paper Triangle Graphs and Simple Trapezoid Graphs Department of Computer Science and Information Management Providence University

More information

Path Delay Fault Testing of a Class of Circuit-Switched Multistage Interconnection Networks

Path Delay Fault Testing of a Class of Circuit-Switched Multistage Interconnection Networks Path Delay Fault Testing of a Class of Circuit-Switched Multistage Interconnection Networks M. Bellos 1, D. Nikolos 1,2 & H. T. Vergos 1,2 1 Dept. of Computer Engineering and Informatics, University of

More information

Note Improved fault-tolerant sorting algorithm in hypercubes

Note Improved fault-tolerant sorting algorithm in hypercubes Theoretical Computer Science 255 (2001) 649 658 www.elsevier.com/locate/tcs Note Improved fault-tolerant sorting algorithm in hypercubes Yu-Wei Chen a, Kuo-Liang Chung b; ;1 a Department of Computer and

More information

These notes present some properties of chordal graphs, a set of undirected graphs that are important for undirected graphical models.

These notes present some properties of chordal graphs, a set of undirected graphs that are important for undirected graphical models. Undirected Graphical Models: Chordal Graphs, Decomposable Graphs, Junction Trees, and Factorizations Peter Bartlett. October 2003. These notes present some properties of chordal graphs, a set of undirected

More information

ON MERGING NETWORKS. A Technical Report By Levy Tamir Litman Ami

ON MERGING NETWORKS. A Technical Report By Levy Tamir Litman Ami ON MERGING NETWORKS A Technical Report By Levy Tamir Litman Ami Faculty of computer science, Technion, Haifa, Israel {levyt,litman @ cs.technion.ac.il } Abstract Many algorithms for oblivious merging have

More information

Chapter 2. Splitting Operation and n-connected Matroids. 2.1 Introduction

Chapter 2. Splitting Operation and n-connected Matroids. 2.1 Introduction Chapter 2 Splitting Operation and n-connected Matroids The splitting operation on an n-connected binary matroid may not yield an n-connected binary matroid. In this chapter, we provide a necessary and

More information

Parallel Systems Course: Chapter VIII. Sorting Algorithms. Kumar Chapter 9. Jan Lemeire ETRO Dept. November Parallel Sorting

Parallel Systems Course: Chapter VIII. Sorting Algorithms. Kumar Chapter 9. Jan Lemeire ETRO Dept. November Parallel Sorting Parallel Systems Course: Chapter VIII Sorting Algorithms Kumar Chapter 9 Jan Lemeire ETRO Dept. November 2014 Overview 1. Parallel sort distributed memory 2. Parallel sort shared memory 3. Sorting Networks

More information

MC 302 GRAPH THEORY 10/1/13 Solutions to HW #2 50 points + 6 XC points

MC 302 GRAPH THEORY 10/1/13 Solutions to HW #2 50 points + 6 XC points MC 0 GRAPH THEORY 0// Solutions to HW # 0 points + XC points ) [CH] p.,..7. This problem introduces an important class of graphs called the hypercubes or k-cubes, Q, Q, Q, etc. I suggest that before you

More information

Equivalent Permutation Capabilities Between Time-Division Optical Omega Networks and Non-Optical Extra-Stage Omega Networks

Equivalent Permutation Capabilities Between Time-Division Optical Omega Networks and Non-Optical Extra-Stage Omega Networks 518 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 9, NO. 4, AUGUST 2001 Equivalent Permutation Capabilities Between Time-Division Optical Omega Networks and Non-Optical Extra-Stage Omega Networks Xiaojun Shen,

More information

Connected Components of Underlying Graphs of Halving Lines

Connected Components of Underlying Graphs of Halving Lines arxiv:1304.5658v1 [math.co] 20 Apr 2013 Connected Components of Underlying Graphs of Halving Lines Tanya Khovanova MIT November 5, 2018 Abstract Dai Yang MIT In this paper we discuss the connected components

More information

/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Sorting lower bound and Linear-time sorting Date: 9/19/17

/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Sorting lower bound and Linear-time sorting Date: 9/19/17 601.433/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Sorting lower bound and Linear-time sorting Date: 9/19/17 5.1 Introduction You should all know a few ways of sorting in O(n log n)

More information

2386 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 6, JUNE 2006

2386 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 6, JUNE 2006 2386 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 6, JUNE 2006 The Encoding Complexity of Network Coding Michael Langberg, Member, IEEE, Alexander Sprintson, Member, IEEE, and Jehoshua Bruck,

More information

Virtual Circuit Blocking Probabilities in an ATM Banyan Network with b b Switching Elements

Virtual Circuit Blocking Probabilities in an ATM Banyan Network with b b Switching Elements Proceedings of the Applied Telecommunication Symposium (part of Advanced Simulation Technologies Conference) Seattle, Washington, USA, April 22 26, 21 Virtual Circuit Blocking Probabilities in an ATM Banyan

More information

Computation with No Memory, and Rearrangeable Multicast Networks

Computation with No Memory, and Rearrangeable Multicast Networks Discrete Mathematics and Theoretical Computer Science DMTCS vol. 16:1, 2014, 121 142 Computation with No Memory, and Rearrangeable Multicast Networks Serge Burckel 1 Emeric Gioan 2 Emmanuel Thomé 3 1 ERMIT,

More information

FUTURE communication networks are expected to support

FUTURE communication networks are expected to support 1146 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL 13, NO 5, OCTOBER 2005 A Scalable Approach to the Partition of QoS Requirements in Unicast and Multicast Ariel Orda, Senior Member, IEEE, and Alexander Sprintson,

More information

Improved upper and lower bounds on the feedback vertex numbers of grids and butterflies

Improved upper and lower bounds on the feedback vertex numbers of grids and butterflies Improved upper and lower bounds on the feedback vertex numbers of grids and butterflies Florent R. Madelaine and Iain A. Stewart Department of Computer Science, Durham University, Science Labs, South Road,

More information

1 Introduction and Results

1 Introduction and Results On the Structure of Graphs with Large Minimum Bisection Cristina G. Fernandes 1,, Tina Janne Schmidt,, and Anusch Taraz, 1 Instituto de Matemática e Estatística, Universidade de São Paulo, Brazil, cris@ime.usp.br

More information

Superconcentrators of depth 2 and 3; odd levels help (rarely)

Superconcentrators of depth 2 and 3; odd levels help (rarely) Superconcentrators of depth 2 and 3; odd levels help (rarely) Noga Alon Bellcore, Morristown, NJ, 07960, USA and Department of Mathematics Raymond and Beverly Sackler Faculty of Exact Sciences Tel Aviv

More information

On the interconnection of message passing systems

On the interconnection of message passing systems Information Processing Letters 105 (2008) 249 254 www.elsevier.com/locate/ipl On the interconnection of message passing systems A. Álvarez a,s.arévalo b, V. Cholvi c,, A. Fernández b,e.jiménez a a Polytechnic

More information

V1.0: Seth Gilbert, V1.1: Steven Halim August 30, Abstract. d(e), and we assume that the distance function is non-negative (i.e., d(x, y) 0).

V1.0: Seth Gilbert, V1.1: Steven Halim August 30, Abstract. d(e), and we assume that the distance function is non-negative (i.e., d(x, y) 0). CS4234: Optimisation Algorithms Lecture 4 TRAVELLING-SALESMAN-PROBLEM (4 variants) V1.0: Seth Gilbert, V1.1: Steven Halim August 30, 2016 Abstract The goal of the TRAVELLING-SALESMAN-PROBLEM is to find

More information

Lecture 3: Sorting 1

Lecture 3: Sorting 1 Lecture 3: Sorting 1 Sorting Arranging an unordered collection of elements into monotonically increasing (or decreasing) order. S = a sequence of n elements in arbitrary order After sorting:

More information

HMMT February 2018 February 10, 2018

HMMT February 2018 February 10, 2018 HMMT February 2018 February 10, 2018 Combinatorics 1. Consider a 2 3 grid where each entry is one of 0, 1, and 2. For how many such grids is the sum of the numbers in every row and in every column a multiple

More information

MT5821 Advanced Combinatorics

MT5821 Advanced Combinatorics MT5821 Advanced Combinatorics 4 Graph colouring and symmetry There are two colourings of a 4-cycle with two colours (red and blue): one pair of opposite vertices should be red, the other pair blue. There

More information

Storage Coding for Wear Leveling in Flash Memories

Storage Coding for Wear Leveling in Flash Memories Storage Coding for Wear Leveling in Flash Memories Anxiao (Andrew) Jiang Robert Mateescu Eitan Yaakobi Jehoshua Bruck Paul H Siegel Alexander Vardy Jack K Wolf Department of Computer Science California

More information

BIN PACKING PROBLEM: A LINEAR CONSTANT- SPACE -APPROXIMATION ALGORITHM

BIN PACKING PROBLEM: A LINEAR CONSTANT- SPACE -APPROXIMATION ALGORITHM International Journal on Computational Science & Applications (IJCSA) Vol.5,No.6, December 2015 BIN PACKING PROBLEM: A LINEAR CONSTANT- SPACE -APPROXIMATION ALGORITHM Abdolahad Noori Zehmakan Department

More information

Treaps. 1 Binary Search Trees (BSTs) CSE341T/CSE549T 11/05/2014. Lecture 19

Treaps. 1 Binary Search Trees (BSTs) CSE341T/CSE549T 11/05/2014. Lecture 19 CSE34T/CSE549T /05/04 Lecture 9 Treaps Binary Search Trees (BSTs) Search trees are tree-based data structures that can be used to store and search for items that satisfy a total order. There are many types

More information

Distributed Recursion

Distributed Recursion Distributed Recursion March 2011 Sergio Rajsbaum Instituto de Matemáticas, UNAM joint work with Eli Gafni, UCLA Introduction Though it is a new field, computer science already touches virtually every aspect

More information

The hierarchical model for load balancing on two machines

The hierarchical model for load balancing on two machines The hierarchical model for load balancing on two machines Orion Chassid Leah Epstein Abstract Following previous work, we consider the hierarchical load balancing model on two machines of possibly different

More information

Run Times. Efficiency Issues. Run Times cont d. More on O( ) notation

Run Times. Efficiency Issues. Run Times cont d. More on O( ) notation Comp2711 S1 2006 Correctness Oheads 1 Efficiency Issues Comp2711 S1 2006 Correctness Oheads 2 Run Times An implementation may be correct with respect to the Specification Pre- and Post-condition, but nevertheless

More information

Finding a winning strategy in variations of Kayles

Finding a winning strategy in variations of Kayles Finding a winning strategy in variations of Kayles Simon Prins ICA-3582809 Utrecht University, The Netherlands July 15, 2015 Abstract Kayles is a two player game played on a graph. The game can be dened

More information

Parameterized graph separation problems

Parameterized graph separation problems Parameterized graph separation problems Dániel Marx Department of Computer Science and Information Theory, Budapest University of Technology and Economics Budapest, H-1521, Hungary, dmarx@cs.bme.hu Abstract.

More information

Joint Entity Resolution

Joint Entity Resolution Joint Entity Resolution Steven Euijong Whang, Hector Garcia-Molina Computer Science Department, Stanford University 353 Serra Mall, Stanford, CA 94305, USA {swhang, hector}@cs.stanford.edu No Institute

More information

A 2k-Kernelization Algorithm for Vertex Cover Based on Crown Decomposition

A 2k-Kernelization Algorithm for Vertex Cover Based on Crown Decomposition A 2k-Kernelization Algorithm for Vertex Cover Based on Crown Decomposition Wenjun Li a, Binhai Zhu b, a Hunan Provincial Key Laboratory of Intelligent Processing of Big Data on Transportation, Changsha

More information

Are Hopfield Networks Faster Than Conventional Computers?

Are Hopfield Networks Faster Than Conventional Computers? Are Hopfield Networks Faster Than Conventional Computers? Ian Parberry* and Hung-Li Tsengt Department of Computer Sciences University of North Texas P.O. Box 13886 Denton, TX 76203-6886 Abstract It is

More information

group 0 group 1 group 2 group 3 (1,0) (1,1) (0,0) (0,1) (1,2) (1,3) (3,0) (3,1) (3,2) (3,3) (2,2) (2,3)

group 0 group 1 group 2 group 3 (1,0) (1,1) (0,0) (0,1) (1,2) (1,3) (3,0) (3,1) (3,2) (3,3) (2,2) (2,3) BPC Permutations n The TIS-Hypercube ptoelectronic Computer Sartaj Sahni and Chih-fang Wang Department of Computer and Information Science and ngineering University of Florida Gainesville, FL 32611 fsahni,wangg@cise.u.edu

More information

Matching Algorithms. Proof. If a bipartite graph has a perfect matching, then it is easy to see that the right hand side is a necessary condition.

Matching Algorithms. Proof. If a bipartite graph has a perfect matching, then it is easy to see that the right hand side is a necessary condition. 18.433 Combinatorial Optimization Matching Algorithms September 9,14,16 Lecturer: Santosh Vempala Given a graph G = (V, E), a matching M is a set of edges with the property that no two of the edges have

More information

Binary Search to find item in sorted array

Binary Search to find item in sorted array Binary Search to find item in sorted array January 15, 2008 QUESTION: Suppose we are given a sorted list A[1..n] (as an array), of n real numbers: A[1] A[2] A[n]. Given a real number x, decide whether

More information

Lecture 2 February 4, 2010

Lecture 2 February 4, 2010 6.897: Advanced Data Structures Spring 010 Prof. Erik Demaine Lecture February 4, 010 1 Overview In the last lecture we discussed Binary Search Trees(BST) and introduced them as a model of computation.

More information

A Reduction of Conway s Thrackle Conjecture

A Reduction of Conway s Thrackle Conjecture A Reduction of Conway s Thrackle Conjecture Wei Li, Karen Daniels, and Konstantin Rybnikov Department of Computer Science and Department of Mathematical Sciences University of Massachusetts, Lowell 01854

More information

BOOLEAN ALGEBRA AND CIRCUITS

BOOLEAN ALGEBRA AND CIRCUITS UNIT 3 Structure BOOLEAN ALGEBRA AND CIRCUITS Boolean Algebra and 3. Introduction 3. Objectives 3.2 Boolean Algebras 3.3 Logic 3.4 Boolean Functions 3.5 Summary 3.6 Solutions/ Answers 3. INTRODUCTION This

More information

KRUSKALIAN GRAPHS k-cographs

KRUSKALIAN GRAPHS k-cographs KRUSKALIAN GRAPHS k-cographs Ling-Ju Hung Ton Kloks Department of Computer Science and Information Engineering National Chung Cheng University, Min-Hsiung, Chia-Yi 62102, Taiwan Email: hunglc@cs.ccu.edu.tw

More information

Lecture 9: Zero-Knowledge Proofs

Lecture 9: Zero-Knowledge Proofs Great Ideas in Theoretical Computer Science Summer 2013 Lecture 9: Zero-Knowledge Proofs Lecturer: Kurt Mehlhorn & He Sun A zero-knowledge proof is an interactive protocol (game) between two parties, a

More information

Optimal Subcube Fault Tolerance in a Circuit-Switched Hypercube

Optimal Subcube Fault Tolerance in a Circuit-Switched Hypercube Optimal Subcube Fault Tolerance in a Circuit-Switched Hypercube Baback A. Izadi Dept. of Elect. Eng. Tech. DeV ry Institute of Technology Columbus, OH 43209 bai @devrycols.edu Fiisun ozgiiner Department

More information

Sorting (Chapter 9) Alexandre David B2-206

Sorting (Chapter 9) Alexandre David B2-206 Sorting (Chapter 9) Alexandre David B2-206 1 Sorting Problem Arrange an unordered collection of elements into monotonically increasing (or decreasing) order. Let S = . Sort S into S =

More information

On a Fast Interconnections

On a Fast Interconnections IJCSNS International Journal of Computer Science and Network Security, VOL.10 No.8, August 2010 75 On a Fast Interconnections Ravi Rastogi and Nitin* Department of Computer Science & Engineering and Information

More information

Algorithms and Applications

Algorithms and Applications Algorithms and Applications 1 Areas done in textbook: Sorting Algorithms Numerical Algorithms Image Processing Searching and Optimization 2 Chapter 10 Sorting Algorithms - rearranging a list of numbers

More information

A Level-wise Priority Based Task Scheduling for Heterogeneous Systems

A Level-wise Priority Based Task Scheduling for Heterogeneous Systems International Journal of Information and Education Technology, Vol., No. 5, December A Level-wise Priority Based Task Scheduling for Heterogeneous Systems R. Eswari and S. Nickolas, Member IACSIT Abstract

More information

Lecture 19 Thursday, March 29. Examples of isomorphic, and non-isomorphic graphs will be given in class.

Lecture 19 Thursday, March 29. Examples of isomorphic, and non-isomorphic graphs will be given in class. CIS 160 - Spring 2018 (instructor Val Tannen) Lecture 19 Thursday, March 29 GRAPH THEORY Graph isomorphism Definition 19.1 Two graphs G 1 = (V 1, E 1 ) and G 2 = (V 2, E 2 ) are isomorphic, write G 1 G

More information

An Investigation of the Planarity Condition of Grötzsch s Theorem

An Investigation of the Planarity Condition of Grötzsch s Theorem Le Chen An Investigation of the Planarity Condition of Grötzsch s Theorem The University of Chicago: VIGRE REU 2007 July 16, 2007 Abstract The idea for this paper originated from Professor László Babai

More information

Design of memory efficient FIFO-based merge sorter

Design of memory efficient FIFO-based merge sorter LETTER IEICE Electronics Express, Vol.15, No.5, 1 11 Design of memory efficient FIFO-based merge sorter Youngil Kim a), Seungdo Choi, and Yong Ho Song Department of Electronics and Computer Engineering,

More information

DISCRETE MATHEMATICS

DISCRETE MATHEMATICS DISCRETE MATHEMATICS WITH APPLICATIONS THIRD EDITION SUSANNA S. EPP DePaul University THOIVISON * BROOKS/COLE Australia Canada Mexico Singapore Spain United Kingdom United States CONTENTS Chapter 1 The

More information

Testing Isomorphism of Strongly Regular Graphs

Testing Isomorphism of Strongly Regular Graphs Spectral Graph Theory Lecture 9 Testing Isomorphism of Strongly Regular Graphs Daniel A. Spielman September 26, 2018 9.1 Introduction In the last lecture we saw how to test isomorphism of graphs in which

More information

arxiv: v2 [cs.ds] 18 May 2015

arxiv: v2 [cs.ds] 18 May 2015 Optimal Shuffle Code with Permutation Instructions Sebastian Buchwald, Manuel Mohr, and Ignaz Rutter Karlsruhe Institute of Technology {sebastian.buchwald, manuel.mohr, rutter}@kit.edu arxiv:1504.07073v2

More information

Jana Kosecka. Linear Time Sorting, Median, Order Statistics. Many slides here are based on E. Demaine, D. Luebke slides

Jana Kosecka. Linear Time Sorting, Median, Order Statistics. Many slides here are based on E. Demaine, D. Luebke slides Jana Kosecka Linear Time Sorting, Median, Order Statistics Many slides here are based on E. Demaine, D. Luebke slides Insertion sort: Easy to code Fast on small inputs (less than ~50 elements) Fast on

More information

Homework Assignment #1: Topology Kelly Shaw

Homework Assignment #1: Topology Kelly Shaw EE482 Advanced Computer Organization Spring 2001 Professor W. J. Dally Homework Assignment #1: Topology Kelly Shaw As we have not discussed routing or flow control yet, throughout this problem set assume

More information

Optimum Alphabetic Binary Trees T. C. Hu and J. D. Morgenthaler Department of Computer Science and Engineering, School of Engineering, University of C

Optimum Alphabetic Binary Trees T. C. Hu and J. D. Morgenthaler Department of Computer Science and Engineering, School of Engineering, University of C Optimum Alphabetic Binary Trees T. C. Hu and J. D. Morgenthaler Department of Computer Science and Engineering, School of Engineering, University of California, San Diego CA 92093{0114, USA Abstract. We

More information

PAPER Constructing the Suffix Tree of a Tree with a Large Alphabet

PAPER Constructing the Suffix Tree of a Tree with a Large Alphabet IEICE TRANS. FUNDAMENTALS, VOL.E8??, NO. JANUARY 999 PAPER Constructing the Suffix Tree of a Tree with a Large Alphabet Tetsuo SHIBUYA, SUMMARY The problem of constructing the suffix tree of a tree is

More information

Discrete Mathematics

Discrete Mathematics Discrete Mathematics 310 (2010) 2769 2775 Contents lists available at ScienceDirect Discrete Mathematics journal homepage: www.elsevier.com/locate/disc Optimal acyclic edge colouring of grid like graphs

More information

Improved Bounds for Intersecting Triangles and Halving Planes

Improved Bounds for Intersecting Triangles and Halving Planes Improved Bounds for Intersecting Triangles and Halving Planes David Eppstein Department of Information and Computer Science University of California, Irvine, CA 92717 Tech. Report 91-60 July 15, 1991 Abstract

More information

PERFORMANCE AND IMPLEMENTATION OF 4x4 SWITCHING NODES IN AN INTERCONNECTION NETWORK FOR PASM

PERFORMANCE AND IMPLEMENTATION OF 4x4 SWITCHING NODES IN AN INTERCONNECTION NETWORK FOR PASM PERFORMANCE AND IMPLEMENTATION OF 4x4 SWITCHING NODES IN AN INTERCONNECTION NETWORK FOR PASM Robert J. McMillen, George B. Adams III, and Howard Jay Siegel School of Electrical Engineering, Purdue University

More information

Consistency and Set Intersection

Consistency and Set Intersection Consistency and Set Intersection Yuanlin Zhang and Roland H.C. Yap National University of Singapore 3 Science Drive 2, Singapore {zhangyl,ryap}@comp.nus.edu.sg Abstract We propose a new framework to study

More information