Sparse Implementation of Revised Simplex Algorithms on Parallel Computers

Wei Shu and Min-You Wu

Abstract. Parallelizing sparse simplex algorithms is one of the most challenging problems. Because of very sparse matrices and very heavy communication, the ratio of computation to communication is extremely low. It becomes necessary to carefully select parallel algorithms, partitioning patterns, and communication optimizations to achieve a speedup. Two implementations on Intel hypercubes are presented in this paper. The experimental results show that a nearly linear speedup can be obtained with the basic revised simplex algorithm. However, the basic revised simplex algorithm produces many fill-ins. We also implement a revised simplex algorithm with LU decomposition. It is a very sparse algorithm, and it is very difficult to achieve a speedup with it.

1 Introduction

Linear programming is a fundamental problem in the field of operations research. Many methods are available for solving linear programming problems, among which the simplex method is the most widely used for its simplicity and speed. Although it is well known that the computational complexity of the simplex method is not polynomial in the number of equations, in practice it can quickly solve many linear programming problems. Interior-point algorithms are polynomial and potentially easy to parallelize [7, 1]. However, since interior-point algorithms are not yet well developed, the simplex algorithm is still applied in most real applications. Given a linear programming problem, the simplex method starts from an initial feasible solution and moves toward the optimal solution. The execution is carried out iteratively; at each iteration, it improves the value of the objective function. This procedure terminates after a finite number of iterations. The simplex method was first designed in 1947 by Dantzig [3].
The revised simplex method is a modification of the original one which significantly reduces the total number of calculations to be performed at each iteration [4]. In this paper, we will use the revised simplex algorithm to solve the linear programming problem. (This work appeared in the proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing, March 22-24, 1993, Norfolk. This research was partially supported by NSF grants CCR and CCR.)

Parallel implementations of linear programming algorithms have been studied on different machine architectures, including both distributed- and shared-memory parallel machines. Sheu et al. presented a mapping technique for linear programming problems on the BBN Butterfly parallel computer [10]. The performance of parallel simplex algorithms was studied on the Sequent Balance shared-memory machine and a 16-processor Transputer system from INMOS Corporation [12]. Parallel simplex algorithms for loosely coupled message-passing parallel systems were presented in [5]. The implementation
of the simplex algorithm on fixed-size hypercubes was proposed by Ho et al. [6]. The performance of the simplex and revised simplex methods on the Intel iPSC/2 hypercube was examined in [11]. All existing works are for dense algorithms, and there is no sparse implementation. Parallel computers are used to solve large application problems. Large-scale linear programming problems are very sparse; therefore, dense algorithms are not of practical value. We implemented two sparse algorithms. The first one, the revised simplex algorithm, can be a good parallel algorithm but has many fill-ins. It is easy to parallelize and can yield good speedup. The second one, the revised simplex algorithm with LU decomposition [2], is one of the best sequential simplex algorithms. It is a sophisticated algorithm with few fill-ins; however, it is extremely difficult to parallelize and to obtain a speedup with, especially on distributed-memory computers. It is, in fact, one of the most difficult algorithms we have ever implemented.

2 Revised Simplex Algorithm

The linear programming problem can be formulated as follows: solve Ax = b while minimizing z = c^T x, where A is an m × n matrix representing m constraints, b is the right-hand-side vector, c is the coefficient vector that defines the objective function z, and x is a column vector of n variables. A fundamental theorem of linear programming states that an optimal solution, if it exists, occurs when (n − m) components of x are set to zero. In most practical problems, some or all constraints may be specified by linear inequalities, which can easily be converted to linear equality constraints by adding nonnegative slack variables. Similarly, artificial variables may be added to keep x nonnegative [8]. We use the two-phase revised simplex method, in which the artificial variables are driven to zero in the first phase, and an optimal solution is obtained in the second phase.
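To make the role of basic solutions concrete, the fundamental theorem can be checked by brute force on a tiny instance. This is only an illustrative sketch with made-up data, not the paper's code: every choice of m linearly independent columns gives a candidate basic solution, and the optimum lies among the feasible ones.

```python
import itertools
import numpy as np

# Brute-force illustration of the fundamental theorem: an optimum of
#   minimize z = c^T x  subject to  A x = b, x >= 0
# occurs at a basic solution, i.e. with (n - m) components of x set to zero.
# Made-up instance: m = 2 constraints, n = 4 variables (columns 2, 3 are slacks).
c = np.array([-3.0, -2.0, 0.0, 0.0])
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [2.0, 1.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])
m, n = A.shape

best = (np.inf, None)
for basis in itertools.combinations(range(n), m):   # choose m basic columns
    B = A[:, basis]
    if abs(np.linalg.det(B)) < 1e-12:
        continue                                    # columns not independent
    xb = np.linalg.solve(B, b)                      # solve the m x m system
    if np.any(xb < -1e-9):
        continue                                    # basic solution infeasible
    x = np.zeros(n)
    x[list(basis)] = xb
    z = float(c @ x)
    if z < best[0]:
        best = (z, x)

z_opt, x_opt = best
print(z_opt, x_opt)  # -10.0 at x = [2, 2, 0, 0]: (n - m) = 2 components are zero
```

The simplex method reaches the same optimum without enumerating all bases, by moving from one feasible basis to a better adjacent one.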
A basis consists of m linearly independent columns of matrix A, represented as B = [A_j1, A_j2, ..., A_jm]. A basic solution can be obtained by setting (n − m) components of x to zero and solving the resulting m × m system of linear equations. The m components solved for are called the basic variables; the remaining (n − m) components are called the nonbasic variables. Each set of basic variables represents a solution, but not necessarily the optimal one. To approach the optimum, we vary the basis by exchanging one basic variable with one nonbasic variable at a time, such that the new basis represents a better solution than the previous one. The variable that enters the basis is called the entering variable, and the variable that leaves the basis is called the leaving variable. This process forms the kernel of the revised simplex algorithm, as shown in Fig. 1, and is repeated iteratively in both phases. The slack and artificial variables are commonly chosen as the initial basis, which always yields a basic feasible solution.

3 Parallelization of the Revised Simplex Algorithm

As shown in Fig. 1, the principal cost of determining the entering variable and that of determining the leaving variable are the same: a vector-matrix multiplication followed by a search for the minimum value. These two steps contribute the major part of the computation, more than 98% of the total for large test data. The third step updates the basic variables and the basis inverse. Parallelizing the revised simplex algorithm therefore amounts to applying parallel vector-matrix multiplication and minimum reduction repeatedly at each iteration.
1. Determine the entering variable using the optimality condition:
   c'_s = min{ c'_j : j = 1, ..., n and x_j is nonbasic }, where c'_j = c_j + Σ_{i=1}^m π_i a_ij.
   If c'_s ≥ 0, the current value is optimal and the algorithm stops.
2. Determine the leaving variable using the feasibility condition:
   b_r / y_rs = min{ b_i / y_is : i = 1, ..., m and y_is > 0 }, where Y_s = (y_is, i = 1, ..., m) = B^{-1} A_s.
   If y_is ≤ 0 for all i, the problem is unbounded and the algorithm stops.
3. The basic variable in row r is replaced by variable x_s, giving the new basis
   B = [A_j1, ..., A_j(r-1), A_s, A_j(r+1), ..., A_jm].
   Update the basis inverse B^{-1}, the vector of simplex multipliers π, and the right-hand-side vector b:
   B^{-1}(i,:) := B^{-1}(i,:) − (y_is / y_rs) B^{-1}(r,:) for i ≠ r, and B^{-1}(r,:) := B^{-1}(r,:) / y_rs;
   π := π − (c'_s / y_rs) B^{-1}(r,:);
   b_i := b_i − (y_is / y_rs) b_r for i ≠ r, and b_r := b_r / y_rs.

Fig. 1. The Revised Simplex Algorithm

A is a sparse matrix with two properties: (1) it is accessed exclusively by column, so a column-by-column storage scheme is used; and (2) it is read-only during the entire computation. The nonzero elements of matrix A are packed into a one-dimensional array aa, where aa(k) is the kth nonzero element of A in column-major order. Another array ia, of the same size as aa, contains the row indices corresponding to the nonzero elements of A. In addition, a one-dimensional array ja points to the index in aa where each column of A begins. Matrix A is partitioned column-wise, owing to the nature of vector-matrix multiplication; if row partitioning were chosen, a row-wise reduction would introduce heavy communication. Notice that the sparsity of matrix A is not evenly distributed. For example, the right portion of matrix A, consisting of the slack and artificial variable columns with only one nonzero element each, is usually sparser than the left portion. To distribute the computation density evenly, a scatter column partitioning scheme is used [9].
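The packed-column storage and the pricing step (step 1 of Fig. 1) can be sketched as follows. The array names aa, ia, ja follow the paper; the matrix, cost vector, and multiplier vector π are made up for illustration, and the loop shows one node's share of the work (in the parallel code each node prices only its scattered subset of columns and a min-reduction then finds the global minimum):

```python
import numpy as np

# Column-major packed storage of a made-up 3 x 4 sparse matrix
#   A = [[5, 0, 0, 1],
#        [0, 8, 0, 0],
#        [0, 0, 3, 6]]
aa = np.array([5.0, 8.0, 3.0, 1.0, 6.0])  # nonzeros, column by column
ia = np.array([0, 1, 2, 0, 2])            # row index of each nonzero
ja = np.array([0, 1, 2, 3, 5])            # start of each column within aa

c  = np.array([1.0, -2.0, 0.5, -1.0])     # cost coefficients (made up)
pi = np.array([0.0, 1.0, -1.0])           # simplex multipliers, duplicated
                                          # in full length on every node

# Pricing: c'_j = c_j + sum_i pi_i * a_ij, touching only the nonzeros of A,
# so no searching for pi_i is needed and the work depends only on the
# nonzero count of A.
n = len(ja) - 1
cprime = np.empty(n)
for j in range(n):
    lo, hi = ja[j], ja[j + 1]
    cprime[j] = c[j] + pi[ia[lo:hi]] @ aa[lo:hi]

s = int(np.argmin(cprime))                # entering-variable candidate
print(cprime, s)  # [ 1.   6.  -2.5 -7. ] 3
```

This is the same layout as the standard compressed sparse column (CSC) format used by modern sparse libraries.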
The vector π is duplicated on all nodes to eliminate communication during the multiplication. Moreover, π is stored as a full-length (dense) vector. The multiplication of the vector π with matrix A can thus be performed efficiently because (1) no searching for π_i is needed, and (2) the amount of work performed depends only on the number of nonzero elements of matrix A. The entire multiplication is performed in parallel without any communication, leaving the result vector c' distributed. A reduction is then performed to find the minimum value of c'. This completes the first multiplication and reduction. Compared with the first multiplication, the second one is a multiplication of a matrix by
a vector rather than a vector by a matrix. Matrix B^{-1} is therefore partitioned by row to avoid communication. Vector A_s is duplicated by broadcasting column s of A once the entering variable is selected. This broadcast is the only communication interface between the two steps; the rest proceeds as in the first step. In the third step, the basic variable in row r is replaced by variable x_s, and the new basis B is given by B = [A_j1, ..., A_j(r-1), A_s, A_j(r+1), ..., A_jm]. The basis B is not physically represented; instead, the basis inverse B^{-1} is maintained and must be updated. This update may generate many fill-ins, so a linked-list scheme is used for storing B^{-1}. With the scatter partitioning technique, the load can be well balanced. The performance in Table 1 shows that a nearly linear speedup can be obtained with this algorithm.

Table 1. Execution time (in seconds) of the revised simplex algorithm versus number of processors, for matrices SHARE2B, SC, SC, BRANDY, and BANDM. [Numeric entries not recoverable from extraction.]

4 Revised Simplex Algorithm with LU Decomposition

The standard revised simplex algorithm described above is based on constructing the basis inverse B^{-1} and updating the inverse after every iteration. While this standard algorithm has the advantages of simplicity and ease of parallelization, it is not the best in terms of sparsity or round-off error behavior: it produces many fill-ins in B^{-1}. A large number of fill-ins not only increases the execution time but also occupies too much memory space; consequently, large test data cannot fit in memory. An alternative is the method of Bartels and Golub, based on an LU decomposition of the basis matrix B with row exchanges [2].
Compared with the standard algorithm, the new one differs in the way it solves the two linear systems

π^T B = −c_B^T   and   B Y_s = A_s.

With the original Bartels-Golub algorithm, the LU decomposition must be applied to the newly constructed basis matrix B in each iteration, which is computationally more expensive than the update of the basis inverse B^{-1} in the standard case. A better implementation takes advantage of the fact that the basis B_k constructed after k iterations differs from the preceding B_{k-1} in only one column. We can therefore construct an eta matrix E_k, which differs from the identity matrix in only one column (its eta column), satisfying

B_k = B_{k-1} E_k = B_{k-2} E_{k-1} E_k = ... = B_0 E_1 E_2 ... E_k.

This eta factorization suggests another way of solving the two linear systems at iteration k:

((((π^T B_0) E_1) E_2) ...) E_k = −c_B^T   and   B_0 (E_1 (E_2 (... (E_k Y_s)))) = A_s,
where B_0 is decomposed into L and U. The E solver is a special case of a linear system solver, consisting of a sequence of k eta-column modifications of the solution vector. Assume an array p records the position of each eta column within its original eta matrix. To solve the eta matrix system, the lth modification can be described as follows: the system E_l(E_{l+1}(...(E_k X))) = Z_{l-1} is reduced to E_{l+1}(...(E_k X)) = Z_l, where

z_{l,p(l)} = z_{l-1,p(l)} / e_{l,p(l)}   and   z_{l,i} = z_{l-1,i} − e_{l,i} z_{l,p(l)},  i ≠ p(l).

After k modifications, Z_k is the resulting vector. However, the number of stored nonzero elements grows with the number of iterations: as k increases, the storage for all the eta matrices becomes large, and the time spent in the E solver grows as well. Therefore, after a certain number of iterations, say r, the basis B_0 needs to be refactorized [2]: all the eta matrices are discarded, and the current B_k is treated as a new B_0 to be decomposed. Such an LU decomposition is performed every r iterations to keep the time and storage of the E solver within an acceptable range. The improved algorithm is summarized in Fig. 2. In our implementation, r is set to a fixed value.

1. Determine the entering variable using the optimality condition:
   c'_s = min{ c'_j : j = 1, ..., n and x_j is nonbasic }, where c'_j = c_j + Σ_{i=1}^m π_i a_ij.
   If c'_s ≥ 0, the current value is optimal and the algorithm stops.
2. Solve B_0 (E_1 (E_2 (... (E_k Y_s)))) = A_s.
   If y_is ≤ 0 for i = 1, ..., m, the problem is unbounded and the algorithm stops.
3. Determine the leaving variable using the feasibility condition:
   b_r / y_rs = min{ b_i / y_is : y_is > 0 and i = 1, ..., m }.
4. The basic variable in row r is replaced by variable x_s, and the new basis is
   B = [A_j1, ..., A_j(r-1), A_s, A_j(r+1), ..., A_jm].
5.
If k ≥ r, treat B_k as the new B_0, decompose B_0 to obtain the new matrices L and U, and reset k to 1; otherwise store E_{k+1} = Y_s and increment k.
6. Solve ((((π^T B_0) E_1) E_2) ...) E_k = −c_B^T.

Fig. 2. The Revised Simplex Algorithm with LU Decomposition

5 Parallelization of the Revised Simplex Algorithm with LU Decomposition

Parallelization of the simplex algorithm with LU decomposition is more difficult. As our experiments indicated, the major computation cost is distributed among the matrix multiplication in step 1 of Fig. 2; the linear system solvers and E solvers, one for Y_s in step 2 and one for π in step 6; and the LU decomposition in step 5 when it is applied. The parallelization of the matrix multiplication in step 1 is the same as in the standard algorithm, as are the storage scheme and the data partitioning strategy. For the two linear system solvers, the basis matrix B_0 serves as the coefficient matrix for r iterations, until the next refactorization takes place. We do not expect a good
speedup from the triangular solver, because the matrices L and U are very sparse and the computation time spent in the triangular solvers is small compared with the other steps. For scalability, we distribute the factorized triangular matrices L and U so as to minimize the overhead of the 2r triangular solves. For either L or U, partitioning by row is a simple way to reduce overhead. Furthermore, block partitioning is used to increase the grain size between two consecutive communications. Although a scatter partitioning scheme could improve load balance in theory, the large number of communications it requires, proportional to the number of rows m, would overwhelm this potential benefit, especially in the sparse case. With a block-row partitioning scheme, the total number of communications per triangular solve equals the number of physical processors used, independent of m. It is not feasible to parallelize the E solver, because (i) the lth eta-column modification depends on the results of the (l−1)th one; (ii) the modification of each element depends on the value of the diagonal element of its eta column; and (iii) only a small amount of computation is involved in each modification. Because of these heavy data dependences and small grain sizes, the E solver would not benefit from parallel execution; it is therefore currently executed sequentially, with all the eta columns stored on a single physical processor. Two observations support this choice: if the time spent in this sequential part grows too long, we can adjust the refactorization frequency to keep it under control; and if scalability demands it, we can always distribute the k eta columns and have the column modifications executed in turn by each processor. The basis refactorization is performed every r iterations. A new algorithm is used to factorize the basis matrix efficiently [13].
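The sequential E solver described above can be sketched densely in Python for clarity. The function name and the small example are ours, not the paper's; B_0 is assumed to have been handled already by the L and U triangular solves, so only the eta-column modifications are shown:

```python
import numpy as np

def eta_solve(etas, positions, z):
    """Apply k eta-column modifications to solve E_1 (E_2 (... (E_k x))) = z.

    etas[l] is the eta column of E_{l+1}; positions[l] = p(l) is the column
    of the identity that it replaces.  Each pass implements
        z_p(l) := z_p(l) / e_p(l)   and   z_i := z_i - e_i * z_p(l), i != p(l).
    """
    z = z.astype(float).copy()
    for e, p in zip(etas, positions):
        z[p] /= e[p]                      # divide by the eta column's pivot
        for i in range(len(z)):
            if i != p:
                z[i] -= e[i] * z[p]       # eliminate the off-pivot entries
    return z

# Example: E_1 has eta column [2, 3] in position 0 and E_2 has [1, 4] in
# position 1, so E_1 E_2 = [[2, 2], [3, 7]] and (E_1 E_2) @ [1, 1] = [4, 10].
x = eta_solve([np.array([2.0, 3.0]), np.array([1.0, 4.0])], [0, 1],
              np.array([4.0, 10.0]))
print(x)  # [1. 1.]
```

Refactorizing B_0 and discarding the eta file every r iterations (step 5 of Fig. 2) bounds both the length of this loop and the eta storage.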
The idea is to exploit the fact that the basis matrix B is extremely sparse and has many column singletons and row singletons. As parts of the matrix are phased out by rearranging singletons, the remaining matrix to be factorized may acquire new singletons. Therefore, the column and row exchanges of singletons are reapplied continuously until no singleton remains, at which point the rest of the matrix is factorized in the regular manner. By these column and row exchanges we can reduce the size of the submatrix that is actually factorized, and sometimes eliminate the factorization completely. During the singleton-exchange phase, first, no fill-ins are introduced, since there is no updating; and second, the search for singletons can easily be conducted in parallel, since without updating there are no dependencies. In general, a simple application algorithm has a single computation task on a single data domain. For this class of problems, the major data domain should be distributed so that the computation task can execute most effectively in parallel; LU decomposition, matrix multiplication, triangular solvers, and the like belong to this class. A complex application problem, however, such as the revised simplex method with LU decomposition, consists of many computation steps and many underlying data domains, and these computation steps usually depend on each other. If we simply distribute the underlying data domains to parallelize each individual computation step, we may end up with a mismatch between two underlying data domains. A global optimum may not be reached simply by using the locally optimal data distribution for each step; sometimes a sub-optimal local data distribution must be used to reduce interface overhead. There are two major interfaces in this algorithm. The first is the interface between matrix A and basis B: each new basis is constructed by collecting m columns from matrix A.
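The singleton peeling used by this factorization can be sketched on a dense 0/1 sparsity pattern. This is an illustrative sketch only; the function name and the example matrix are ours, not the paper's:

```python
import numpy as np

def peel_singletons(B):
    """Repeatedly remove row and column singletons from a sparsity pattern.

    A row singleton pivots on its only active column; a column singleton
    pivots on its only active row.  No numerical updating is performed, so
    no fill-ins are introduced.  Returns the row and column sets of the
    remaining submatrix that would still need regular LU factorization.
    """
    pattern = (np.asarray(B) != 0)
    rows = set(range(pattern.shape[0]))
    cols = set(range(pattern.shape[1]))
    changed = True
    while changed:
        changed = False
        for r in list(rows):
            active = [c for c in cols if pattern[r, c]]
            if len(active) == 1:          # row singleton: pivot at (r, active[0])
                rows.discard(r); cols.discard(active[0]); changed = True
        for c in list(cols):
            active = [r for r in rows if pattern[r, c]]
            if len(active) == 1:          # column singleton: pivot at (active[0], c)
                rows.discard(active[0]); cols.discard(c); changed = True
    return rows, cols

# A made-up basis pattern that peels away completely, so the regular
# factorization is eliminated entirely:
B = np.array([[1, 0, 0],
              [1, 1, 0],
              [0, 1, 1]])
rem_rows, rem_cols = peel_singletons(B)
print(rem_rows, rem_cols)  # set() set() -- nothing left to factorize
```

Because the scan only reads the pattern and never updates values, independent portions of the search can run in parallel, as noted above.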
Remember that matrix A has been distributed with scatter-column partitioning.
One way to construct matrix B is to extract the columns of the basic variables from A and redistribute them to obtain a locally optimal data distribution. However, data redistribution between different computation steps is beneficial only when it is crucial for the performance of the next step. Here, the refactorization is not expected to achieve a perfect speedup, and the benefit of redistribution is less than its cost. Therefore, we leave the extracted columns in the physical processors where they reside. With the distribution inherited from matrix A, the basis matrix B is not necessarily evenly distributed among processors, leading to a sub-optimal data distribution. The second interface is between basis B and the matrices L and U after refactorization. Since L and U are used in the subsequent triangular solves 2r times, the redistribution is worthwhile: the unevenly distributed scatter-column basis B is redistributed into the block-row scheme for L and U, which is the optimal scheme for the triangular solver. Without redistribution, L and U would remain in the column-scatter scheme, and the communication cost of the subsequent triangular solves would be even worse. This algorithm is very sparse and therefore difficult to parallelize, owing to its small granularity: the triangular solvers and E solvers are hard to speed up, and redistribution involves large communication overhead. The basic strategy of this implementation is to apply the best possible parallelization techniques to the most computationally dense part, the matrix multiplication, to obtain maximum speedup. This part accounts for more than 70% of the total execution time and achieves a speedup of around 7.5 on 8 processors. The LU decomposition part takes less than 10% of the total time and rarely shows a speedup.
The triangular solvers and the E solvers take about 10% of the total time, and their parallel execution time increases with the number of processors, dominating the overall performance. The performance is shown in Table 2. Compared with Table 1, BRANDY and BANDM execute much faster on a single processor but show almost no speedup; when the number of processors exceeds 8, these two data sets run slower. For large data sets, such as SHIP08L, SCSD8, and SHIP12L, up to two-fold speedup can be obtained. More importantly, these data sets cannot be run with the basic revised simplex algorithm at all, since they do not fit in memory.

Table 2. Execution time (in seconds) of the revised simplex algorithm with LU decomposition versus number of processors, for matrices BRANDY, BANDM, SHIP08L, SCSD8, and SHIP12L. [Numeric entries not recoverable from extraction.]

6 Conclusions

Two revised simplex algorithms, with and without LU decomposition, have been implemented on the Intel iPSC/2 hypercube. The test data sets are from netlib. These data sets represent realistic problems from industrial applications, ranging from small scale to large scale; execution times range from a few seconds to thousands of seconds. Although the basic revised simplex algorithm is relatively easier to implement, both algorithms had to be carefully tuned to achieve the best possible performance. We emphasized sophisticated parallelization techniques, such as partitioning patterns, data distribution, communication reduction, and the interfaces between computation steps. Our experience shows that for a complicated problem, each computation step requires a different data distribution; the interfaces between steps affect the choice of data distribution and, consequently, the communications. In summary, parallelizing sparse simplex algorithms is not easy. The experimental results show that a nearly linear speedup can be obtained with the basic revised simplex algorithm. On the other hand, because of very sparse matrices and very heavy communication, it is hard to achieve a good speedup with the revised simplex algorithm with LU decomposition, even with carefully selected partitioning patterns and communication optimizations.

Acknowledgments

The authors wish to thank Yong Li for his contribution to the linear programming algorithms and the sequential codes.

References

[1] G. Astfalk, I. Lustig, R. Marsten, and D. Shanno, The Interior-Point Method for Linear Programming, IEEE Software, July 1992, pages 61-67.
[2] R. H. Bartels and G. H. Golub, The Simplex Method of Linear Programming Using LU Decomposition, Comm. ACM, 12.
[3] G. B. Dantzig, Linear Programming and Extensions, Princeton University Press, New Jersey.
[4] G. B. Dantzig and W. Orchard-Hays, The product form for the inverse in the simplex method, Mathematical Tables and Other Aids to Computation, 8, pages 64-67.
[5] R. A. Finkel, Large-grain Parallelism: Three Case Studies, in The Characteristics of Parallel Algorithms, L. H. Jamieson, ed., The MIT Press.
[6] H. F. Ho, G. H. Chen, S. H. Lin, and J. P. Sheu, Solving Linear Programming on Fixed-Size Hypercubes, ICPP'88.
[7] N. Karmarkar, A New Polynomial-Time Algorithm for Linear Programming, Combinatorica, Vol. 4, No. 8, 1984.
[8] Y. Li, M. Wu, W. Shu, and G.
Fox, Linear Programming Algorithms and Parallel Implementations, SCCS Report 288, Syracuse University, May.
[9] D. M. Nicol and J. H. Saltz, An analysis of scatter decomposition, IEEE Trans. Computers, C-39(11):1337-1345, November.
[10] T. Sheu and W. Lin, Mapping Linear Programming Algorithms onto the Butterfly Parallel Processor.
[11] C. B. Stunkel and D. C. Reed, Hypercube Implementation of the Simplex Algorithm, Proc. 4th Conf. on Hypercube Concurrent Computers and Applications, March.
[12] Y. Wu and T. G. Lewis, Performance of Parallel Simplex Algorithms, Department of Computer Science, Oregon State University.
[13] M. Wu and Y. Li, Fast LU Decomposition for Sparse Simplex Method, SIAM Conference on Parallel Processing for Scientific Computing, March 1993.
More informationLecture 12 (Last): Parallel Algorithms for Solving a System of Linear Equations. Reference: Introduction to Parallel Computing Chapter 8.
CZ4102 High Performance Computing Lecture 12 (Last): Parallel Algorithms for Solving a System of Linear Equations - Dr Tay Seng Chuan Reference: Introduction to Parallel Computing Chapter 8. 1 Topic Overview
More informationImplementations of Dijkstra's Algorithm. Based on Multi-Level Buckets. November Abstract
Implementations of Dijkstra's Algorithm Based on Multi-Level Buckets Andrew V. Goldberg NEC Research Institute 4 Independence Way Princeton, NJ 08540 avg@research.nj.nec.com Craig Silverstein Computer
More informationChapter 15 Introduction to Linear Programming
Chapter 15 Introduction to Linear Programming An Introduction to Optimization Spring, 2015 Wei-Ta Chu 1 Brief History of Linear Programming The goal of linear programming is to determine the values of
More informationDiscrete Optimization. Lecture Notes 2
Discrete Optimization. Lecture Notes 2 Disjunctive Constraints Defining variables and formulating linear constraints can be straightforward or more sophisticated, depending on the problem structure. The
More informationIntroduction. Linear because it requires linear functions. Programming as synonymous of planning.
LINEAR PROGRAMMING Introduction Development of linear programming was among the most important scientific advances of mid-20th cent. Most common type of applications: allocate limited resources to competing
More informationHeap-on-Top Priority Queues. March Abstract. We introduce the heap-on-top (hot) priority queue data structure that combines the
Heap-on-Top Priority Queues Boris V. Cherkassky Central Economics and Mathematics Institute Krasikova St. 32 117418, Moscow, Russia cher@cemi.msk.su Andrew V. Goldberg NEC Research Institute 4 Independence
More informationAMATH 383 Lecture Notes Linear Programming
AMATH 8 Lecture Notes Linear Programming Jakob Kotas (jkotas@uw.edu) University of Washington February 4, 014 Based on lecture notes for IND E 51 by Zelda Zabinsky, available from http://courses.washington.edu/inde51/notesindex.htm.
More information16.410/413 Principles of Autonomy and Decision Making
16.410/413 Principles of Autonomy and Decision Making Lecture 16: Mathematical Programming I Emilio Frazzoli Aeronautics and Astronautics Massachusetts Institute of Technology November 8, 2010 E. Frazzoli
More information2 The Service Provision Problem The formulation given here can also be found in Tomasgard et al. [6]. That paper also details the background of the mo
Two-Stage Service Provision by Branch and Bound Shane Dye Department ofmanagement University of Canterbury Christchurch, New Zealand s.dye@mang.canterbury.ac.nz Asgeir Tomasgard SINTEF, Trondheim, Norway
More informationY. Han* B. Narahari** H-A. Choi** University of Kentucky. The George Washington University
Mapping a Chain Task to Chained Processors Y. Han* B. Narahari** H-A. Choi** *Department of Computer Science University of Kentucky Lexington, KY 40506 **Department of Electrical Engineering and Computer
More informationMatrix Multiplication on an Experimental Parallel System With Hybrid Architecture
Matrix Multiplication on an Experimental Parallel System With Hybrid Architecture SOTIRIOS G. ZIAVRAS and CONSTANTINE N. MANIKOPOULOS Department of Electrical and Computer Engineering New Jersey Institute
More informationy(b)-- Y[a,b]y(a). EQUATIONS ON AN INTEL HYPERCUBE*
SIAM J. ScI. STAT. COMPUT. Vol. 12, No. 6, pp. 1480-1485, November 1991 ()1991 Society for Industrial and Applied Mathematics 015 SOLUTION OF LINEAR SYSTEMS OF ORDINARY DIFFERENTIAL EQUATIONS ON AN INTEL
More informationThe Simplex Algorithm with a New. Primal and Dual Pivot Rule. Hsin-Der CHEN 3, Panos M. PARDALOS 3 and Michael A. SAUNDERS y. June 14, 1993.
The Simplex Algorithm with a New rimal and Dual ivot Rule Hsin-Der CHEN 3, anos M. ARDALOS 3 and Michael A. SAUNDERS y June 14, 1993 Abstract We present a simplex-type algorithm for linear programming
More informationA Fast Recursive Mapping Algorithm. Department of Computer and Information Science. New Jersey Institute of Technology.
A Fast Recursive Mapping Algorithm Song Chen and Mary M. Eshaghian Department of Computer and Information Science New Jersey Institute of Technology Newark, NJ 7 Abstract This paper presents a generic
More informationGeneralized Network Flow Programming
Appendix C Page Generalized Network Flow Programming This chapter adapts the bounded variable primal simplex method to the generalized minimum cost flow problem. Generalized networks are far more useful
More informationLecture 27: Fast Laplacian Solvers
Lecture 27: Fast Laplacian Solvers Scribed by Eric Lee, Eston Schweickart, Chengrun Yang November 21, 2017 1 How Fast Laplacian Solvers Work We want to solve Lx = b with L being a Laplacian matrix. Recall
More informationEcient Implementation of Sorting Algorithms on Asynchronous Distributed-Memory Machines
Ecient Implementation of Sorting Algorithms on Asynchronous Distributed-Memory Machines Zhou B. B., Brent R. P. and Tridgell A. y Computer Sciences Laboratory The Australian National University Canberra,
More informationTHE simplex algorithm [1] has been popularly used
Proceedings of the International MultiConference of Engineers and Computer Scientists 207 Vol II, IMECS 207, March 5-7, 207, Hong Kong An Improvement in the Artificial-free Technique along the Objective
More informationExplore Co-clustering on Job Applications. Qingyun Wan SUNet ID:qywan
Explore Co-clustering on Job Applications Qingyun Wan SUNet ID:qywan 1 Introduction In the job marketplace, the supply side represents the job postings posted by job posters and the demand side presents
More informationSeminar on. A Coarse-Grain Parallel Formulation of Multilevel k-way Graph Partitioning Algorithm
Seminar on A Coarse-Grain Parallel Formulation of Multilevel k-way Graph Partitioning Algorithm Mohammad Iftakher Uddin & Mohammad Mahfuzur Rahman Matrikel Nr: 9003357 Matrikel Nr : 9003358 Masters of
More informationHomework # 2 Due: October 6. Programming Multiprocessors: Parallelism, Communication, and Synchronization
ECE669: Parallel Computer Architecture Fall 2 Handout #2 Homework # 2 Due: October 6 Programming Multiprocessors: Parallelism, Communication, and Synchronization 1 Introduction When developing multiprocessor
More informationDistributed Execution of Actor Programs. Gul Agha, Chris Houck and Rajendra Panwar W. Springeld Avenue. Urbana, IL 61801, USA
Distributed Execution of Actor Programs Gul Agha, Chris Houck and Rajendra Panwar Department of Computer Science 1304 W. Springeld Avenue University of Illinois at Urbana-Champaign Urbana, IL 61801, USA
More informationOutline. CS38 Introduction to Algorithms. Linear programming 5/21/2014. Linear programming. Lecture 15 May 20, 2014
5/2/24 Outline CS38 Introduction to Algorithms Lecture 5 May 2, 24 Linear programming simplex algorithm LP duality ellipsoid algorithm * slides from Kevin Wayne May 2, 24 CS38 Lecture 5 May 2, 24 CS38
More informationParallel Numerical Algorithms
Parallel Numerical Algorithms Chapter 3 Dense Linear Systems Section 3.3 Triangular Linear Systems Michael T. Heath and Edgar Solomonik Department of Computer Science University of Illinois at Urbana-Champaign
More informationreasonable to store in a software implementation, it is likely to be a signicant burden in a low-cost hardware implementation. We describe in this pap
Storage-Ecient Finite Field Basis Conversion Burton S. Kaliski Jr. 1 and Yiqun Lisa Yin 2 RSA Laboratories 1 20 Crosby Drive, Bedford, MA 01730. burt@rsa.com 2 2955 Campus Drive, San Mateo, CA 94402. yiqun@rsa.com
More informationChapter II. Linear Programming
1 Chapter II Linear Programming 1. Introduction 2. Simplex Method 3. Duality Theory 4. Optimality Conditions 5. Applications (QP & SLP) 6. Sensitivity Analysis 7. Interior Point Methods 1 INTRODUCTION
More information1 Introduction Complex decision problems related to economy, environment, business and engineering are multidimensional and have multiple and conictin
A Scalable Parallel Algorithm for Multiple Objective Linear Programs Malgorzata M. Wiecek Hong Zhang y Abstract This paper presents an ADBASE-based parallel algorithm for solving multiple objective linear
More informationSensor-Target and Weapon-Target Pairings Based on Auction Algorithm
Proceedings of the 11th WSEAS International Conference on APPLIED MATHEMATICS, Dallas, Texas, USA, March 22-24, 2007 92 Sensor-Target and Weapon-Target Pairings Based on Auction Algorithm Z. R. BOGDANOWICZ,
More information2 ATTILA FAZEKAS The tracking model of the robot car The schematic picture of the robot car can be seen on Fig.1. Figure 1. The main controlling task
NEW OPTICAL TRACKING METHODS FOR ROBOT CARS Attila Fazekas Debrecen Abstract. In this paper new methods are proposed for intelligent optical tracking of robot cars the important tools of CIM (Computer
More informationA. Atamturk. G.L. Nemhauser. M.W.P. Savelsbergh. Georgia Institute of Technology. School of Industrial and Systems Engineering.
A Combined Lagrangian, Linear Programming and Implication Heuristic for Large-Scale Set Partitioning Problems 1 A. Atamturk G.L. Nemhauser M.W.P. Savelsbergh Georgia Institute of Technology School of Industrial
More informationWhat is the Worst Case Behavior of the Simplex Algorithm?
Centre de Recherches Mathématiques CRM Proceedings and Lecture Notes Volume, 28 What is the Worst Case Behavior of the Simplex Algorithm? Norman Zadeh Abstract. The examples published by Klee and Minty
More informationAccelerating the Iterative Linear Solver for Reservoir Simulation
Accelerating the Iterative Linear Solver for Reservoir Simulation Wei Wu 1, Xiang Li 2, Lei He 1, Dongxiao Zhang 2 1 Electrical Engineering Department, UCLA 2 Department of Energy and Resources Engineering,
More informationSome Advanced Topics in Linear Programming
Some Advanced Topics in Linear Programming Matthew J. Saltzman July 2, 995 Connections with Algebra and Geometry In this section, we will explore how some of the ideas in linear programming, duality theory,
More information5.4 Pure Minimal Cost Flow
Pure Minimal Cost Flow Problem. Pure Minimal Cost Flow Networks are especially convenient for modeling because of their simple nonmathematical structure that can be easily portrayed with a graph. This
More informationPerformance Analysis of Distributed Iterative Linear Solvers
Performance Analysis of Distributed Iterative Linear Solvers W.M. ZUBEREK and T.D.P. PERERA Department of Computer Science Memorial University St.John s, Canada A1B 3X5 Abstract: The solution of large,
More informationStable sets, corner polyhedra and the Chvátal closure
Stable sets, corner polyhedra and the Chvátal closure Manoel Campêlo Departamento de Estatística e Matemática Aplicada, Universidade Federal do Ceará, Brazil, mcampelo@lia.ufc.br. Gérard Cornuéjols Tepper
More informationCivil Engineering Systems Analysis Lecture XV. Instructor: Prof. Naveen Eluru Department of Civil Engineering and Applied Mechanics
Civil Engineering Systems Analysis Lecture XV Instructor: Prof. Naveen Eluru Department of Civil Engineering and Applied Mechanics Today s Learning Objectives Sensitivity Analysis Dual Simplex Method 2
More informationParallel Algorithm for Multilevel Graph Partitioning and Sparse Matrix Ordering
Parallel Algorithm for Multilevel Graph Partitioning and Sparse Matrix Ordering George Karypis and Vipin Kumar Brian Shi CSci 8314 03/09/2017 Outline Introduction Graph Partitioning Problem Multilevel
More informationMemory Hierarchy Management for Iterative Graph Structures
Memory Hierarchy Management for Iterative Graph Structures Ibraheem Al-Furaih y Syracuse University Sanjay Ranka University of Florida Abstract The increasing gap in processor and memory speeds has forced
More informationMathematics and Computer Science
Technical Report TR-2006-010 Revisiting hypergraph models for sparse matrix decomposition by Cevdet Aykanat, Bora Ucar Mathematics and Computer Science EMORY UNIVERSITY REVISITING HYPERGRAPH MODELS FOR
More informationChapter 3: Towards the Simplex Method for Efficient Solution of Linear Programs
Chapter 3: Towards the Simplex Method for Efficient Solution of Linear Programs The simplex method, invented by George Dantzig in 1947, is the basic workhorse for solving linear programs, even today. While
More informationParallelizing the dual revised simplex method
Parallelizing the dual revised simplex method Qi Huangfu 1 Julian Hall 2 1 FICO 2 School of Mathematics, University of Edinburgh Birmingham 9 September 2016 Overview Background Two parallel schemes Single
More information1 2 (3 + x 3) x 2 = 1 3 (3 + x 1 2x 3 ) 1. 3 ( 1 x 2) (3 + x(0) 3 ) = 1 2 (3 + 0) = 3. 2 (3 + x(0) 1 2x (0) ( ) = 1 ( 1 x(0) 2 ) = 1 3 ) = 1 3
6 Iterative Solvers Lab Objective: Many real-world problems of the form Ax = b have tens of thousands of parameters Solving such systems with Gaussian elimination or matrix factorizations could require
More informationECEN 615 Methods of Electric Power Systems Analysis Lecture 11: Sparse Systems
ECEN 615 Methods of Electric Power Systems Analysis Lecture 11: Sparse Systems Prof. Tom Overbye Dept. of Electrical and Computer Engineering Texas A&M University overbye@tamu.edu Announcements Homework
More informationLinear Programming Problems
Linear Programming Problems Two common formulations of linear programming (LP) problems are: min Subject to: 1,,, 1,2,,;, max Subject to: 1,,, 1,2,,;, Linear Programming Problems The standard LP problem
More informationAdvanced Operations Research Techniques IE316. Quiz 1 Review. Dr. Ted Ralphs
Advanced Operations Research Techniques IE316 Quiz 1 Review Dr. Ted Ralphs IE316 Quiz 1 Review 1 Reading for The Quiz Material covered in detail in lecture. 1.1, 1.4, 2.1-2.6, 3.1-3.3, 3.5 Background material
More informationAvailability of Coding Based Replication Schemes. Gagan Agrawal. University of Maryland. College Park, MD 20742
Availability of Coding Based Replication Schemes Gagan Agrawal Department of Computer Science University of Maryland College Park, MD 20742 Abstract Data is often replicated in distributed systems to improve
More informationA Novel Method for Power-Flow Solution of Radial Distribution Networks
A Novel Method for Power-Flow Solution of Radial Distribution Networks 1 Narinder Singh, 2 Prof. Rajni Bala 1 Student-M.Tech(Power System), 2 Professor(Power System) BBSBEC, Fatehgarh Sahib, Punjab Abstract
More informationMathematical and Algorithmic Foundations Linear Programming and Matchings
Adavnced Algorithms Lectures Mathematical and Algorithmic Foundations Linear Programming and Matchings Paul G. Spirakis Department of Computer Science University of Patras and Liverpool Paul G. Spirakis
More informationGraphs that have the feasible bases of a given linear
Algorithmic Operations Research Vol.1 (2006) 46 51 Simplex Adjacency Graphs in Linear Optimization Gerard Sierksma and Gert A. Tijssen University of Groningen, Faculty of Economics, P.O. Box 800, 9700
More informationDistance (*40 ft) Depth (*40 ft) Profile A-A from SEG-EAEG salt model
Proposal for a WTOPI Research Consortium Wavelet Transform On Propagation and Imaging for seismic exploration Ru-Shan Wu Modeling and Imaging Project, University of California, Santa Cruz August 27, 1996
More informationNATCOR Convex Optimization Linear Programming 1
NATCOR Convex Optimization Linear Programming 1 Julian Hall School of Mathematics University of Edinburgh jajhall@ed.ac.uk 5 June 2018 What is linear programming (LP)? The most important model used in
More informationDesign and Analysis of Algorithms (V)
Design and Analysis of Algorithms (V) An Introduction to Linear Programming Guoqiang Li School of Software, Shanghai Jiao Tong University Homework Assignment 2 is announced! (deadline Apr. 10) Linear Programming
More information2386 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 6, JUNE 2006
2386 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 6, JUNE 2006 The Encoding Complexity of Network Coding Michael Langberg, Member, IEEE, Alexander Sprintson, Member, IEEE, and Jehoshua Bruck,
More information(Sparse) Linear Solvers
(Sparse) Linear Solvers Ax = B Why? Many geometry processing applications boil down to: solve one or more linear systems Parameterization Editing Reconstruction Fairing Morphing 2 Don t you just invert
More informationOptimistic Message Logging for Independent Checkpointing. in Message-Passing Systems. Yi-Min Wang and W. Kent Fuchs. Coordinated Science Laboratory
Optimistic Message Logging for Independent Checkpointing in Message-Passing Systems Yi-Min Wang and W. Kent Fuchs Coordinated Science Laboratory University of Illinois at Urbana-Champaign Abstract Message-passing
More informationLecture 13: March 25
CISC 879 Software Support for Multicore Architectures Spring 2007 Lecture 13: March 25 Lecturer: John Cavazos Scribe: Ying Yu 13.1. Bryan Youse-Optimization of Sparse Matrix-Vector Multiplication on Emerging
More informationTechnical Report TR , Computer and Information Sciences Department, University. Abstract
An Approach for Parallelizing any General Unsymmetric Sparse Matrix Algorithm Tariq Rashid y Timothy A.Davis z Technical Report TR-94-036, Computer and Information Sciences Department, University of Florida,
More informationCMPSCI611: The Simplex Algorithm Lecture 24
CMPSCI611: The Simplex Algorithm Lecture 24 Let s first review the general situation for linear programming problems. Our problem in standard form is to choose a vector x R n, such that x 0 and Ax = b,
More informationLinear and Integer Programming :Algorithms in the Real World. Related Optimization Problems. How important is optimization?
Linear and Integer Programming 15-853:Algorithms in the Real World Linear and Integer Programming I Introduction Geometric Interpretation Simplex Method Linear or Integer programming maximize z = c T x
More informationPARALLEL COMPUTATION OF THE SINGULAR VALUE DECOMPOSITION ON TREE ARCHITECTURES
PARALLEL COMPUTATION OF THE SINGULAR VALUE DECOMPOSITION ON TREE ARCHITECTURES Zhou B. B. and Brent R. P. Computer Sciences Laboratory Australian National University Canberra, ACT 000 Abstract We describe
More informationLecture 9 - Matrix Multiplication Equivalences and Spectral Graph Theory 1
CME 305: Discrete Mathematics and Algorithms Instructor: Professor Aaron Sidford (sidford@stanfordedu) February 6, 2018 Lecture 9 - Matrix Multiplication Equivalences and Spectral Graph Theory 1 In the
More informationThe driving motivation behind the design of the Janus framework is to provide application-oriented, easy-to-use and ecient abstractions for the above
Janus a C++ Template Library for Parallel Dynamic Mesh Applications Jens Gerlach, Mitsuhisa Sato, and Yutaka Ishikawa fjens,msato,ishikawag@trc.rwcp.or.jp Tsukuba Research Center of the Real World Computing
More informationTASK FLOW GRAPH MAPPING TO "ABUNDANT" CLIQUE PARALLEL EXECUTION GRAPH CLUSTERING PARALLEL EXECUTION GRAPH MAPPING TO MAPPING HEURISTIC "LIMITED"
Parallel Processing Letters c World Scientic Publishing Company FUNCTIONAL ALGORITHM SIMULATION OF THE FAST MULTIPOLE METHOD: ARCHITECTURAL IMPLICATIONS MARIOS D. DIKAIAKOS Departments of Astronomy and
More informationWhat is linear programming (LP)? NATCOR Convex Optimization Linear Programming 1. Solving LP problems: The standard simplex method
NATCOR Convex Optimization Linear Programming 1 Julian Hall School of Mathematics University of Edinburgh jajhall@ed.ac.uk 14 June 2016 What is linear programming (LP)? The most important model used in
More informationReduction of Huge, Sparse Matrices over Finite Fields Via Created Catastrophes
Reduction of Huge, Sparse Matrices over Finite Fields Via Created Catastrophes Carl Pomerance and J. W. Smith CONTENTS 1. Introduction 2. Description of the Method 3. Outline of Experiments 4. Conclusion
More informationVery Large-scale Linear. Programming: A Case Study. Exploiting Both Parallelism and. Distributed Memory. Anne Kilgore. December, 1993.
Very Large-scale Linear Programming: A Case Study Exploiting Both Parallelism and Distributed Memory Anne Kilgore CRPC-TR93354-S December, 1993 Center for Research on Parallel Computation Rice University
More informationwhich isaconvex optimization problem in the variables P = P T 2 R nn and x 2 R n+1. The algorithm used in [6] is based on solving this problem using g
Handling Nonnegative Constraints in Spectral Estimation Brien Alkire and Lieven Vandenberghe Electrical Engineering Department University of California, Los Angeles (brien@alkires.com, vandenbe@ee.ucla.edu)
More information8ns. 8ns. 16ns. 10ns COUT S3 COUT S3 A3 B3 A2 B2 A1 B1 B0 2 B0 CIN CIN COUT S3 A3 B3 A2 B2 A1 B1 A0 B0 CIN S0 S1 S2 S3 COUT CIN 2 A0 B0 A2 _ A1 B1
Delay Abstraction in Combinational Logic Circuits Noriya Kobayashi Sharad Malik C&C Research Laboratories Department of Electrical Engineering NEC Corp. Princeton University Miyamae-ku, Kawasaki Japan
More information