Sparse Implementation of Revised Simplex Algorithms on Parallel Computers

Wei Shu and Min-You Wu

This work appeared in the proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing, March 22-24, 1993, Norfolk. This research was partially supported by NSF grants CCR and CCR.

Abstract

Parallelizing sparse simplex algorithms is one of the most challenging problems. Because of very sparse matrices and very heavy communication, the ratio of computation to communication is extremely low. It becomes necessary to carefully select parallel algorithms, partitioning patterns, and communication optimizations to achieve a speedup. Two implementations on Intel hypercubes are presented in this paper. The experimental results show that a nearly linear speedup can be obtained with the basic revised simplex algorithm. However, the basic revised simplex algorithm produces many fill-ins. We also implement a revised simplex algorithm with LU decomposition. It is a very sparse algorithm, and it is very difficult to achieve a speedup with it.

1 Introduction

Linear programming is a fundamental problem in the field of operations research. Many methods are available for solving linear programming problems, among which the simplex method is the most widely used because of its simplicity and speed. Although it is well known that the computational complexity of the simplex method is not polynomial in the number of equations, in practice it can quickly solve many linear programming problems. The interior-point algorithms are polynomial and potentially easy to parallelize [7, 1]. However, since interior-point algorithms have not yet been fully developed, the simplex algorithm is still applied to most real applications.

Given a linear programming problem, the simplex method starts from an initial feasible solution and moves toward the optimal solution. The execution is carried out iteratively; at each iteration, it improves the value of the objective function. This procedure terminates after a finite number of iterations. The simplex method was first designed in 1947 by Dantzig [3]. The revised simplex method is a modification of the original one that significantly reduces the total number of calculations performed at each iteration [4]. In this paper, we use the revised simplex algorithm to solve the linear programming problem.

Parallel implementations of linear programming algorithms have been studied on different machine architectures, including both distributed- and shared-memory parallel machines. Sheu et al. presented a mapping technique for linear programming problems on the BBN Butterfly parallel computer [10]. The performance of parallel simplex algorithms was studied on the Sequent Balance shared-memory machine and a 16-processor Transputer system from INMOS Corporation [12]. Parallel simplex algorithms for loosely coupled, message-passing parallel systems were presented in [5].

The implementation of the simplex algorithm on fixed-size hypercubes was proposed by Ho et al. [6]. The performance of the simplex and revised simplex methods on the Intel iPSC/2 hypercube was examined in [11]. All existing works are for dense algorithms; there is no sparse implementation. Parallel computers are used to solve large application problems, and large-scale linear programming problems are very sparse. Therefore, dense algorithms are not of practical value. We implemented two sparse algorithms. The first one, the revised simplex algorithm, can be a good parallel algorithm but produces many fill-ins. It is easy to parallelize and can yield good speedup. The second one, the revised simplex algorithm with LU decomposition [2], is one of the best sequential simplex algorithms. It is a sophisticated algorithm with few fill-ins; however, it is extremely difficult to parallelize and to obtain a speedup with, especially on distributed-memory computers. In fact, it is one of the most difficult algorithms we have ever implemented.

2 Revised Simplex Algorithm

A linear programming problem can be formulated as follows: solve $Ax = b$ while minimizing $z = c^T x$, where $A$ is an $m \times n$ matrix representing $m$ constraints, $b$ is the right-hand-side vector, $c$ is the coefficient vector that defines the objective function $z$, and $x$ is a column of $n$ variables. A fundamental theorem of linear programming states that an optimal solution, if it exists, occurs when $(n - m)$ components of $x$ are set to zero. In most practical problems, some or all constraints may be specified by linear inequalities, which can easily be converted to linear equality constraints by adding nonnegative slack variables. Similarly, artificial variables may be added to keep $x$ nonnegative [8]. We use the two-phase revised simplex method, in which the artificial variables are driven to zero in the first phase and an optimal solution is obtained in the second phase.

A basis consists of $m$ linearly independent columns of matrix $A$, represented as $B = [A_{j_1}, A_{j_2}, \ldots, A_{j_m}]$. A basic solution can be obtained by setting $(n - m)$ components of $x$ to zero and solving the resulting $m \times m$ linear equations. These $m$ components are called the basic variables; the remaining $(n - m)$ components are called the nonbasic variables. Each set of basic variables represents a solution, but not necessarily the optimal one. To approach the optimum, we vary the basis by exchanging one basic variable with one nonbasic variable at a time, such that the new basis represents a better solution than the previous one. The variable that enters the basis is called the entering variable, and the variable that leaves the basis is called the leaving variable. This process forms the kernel of the revised simplex method, as shown in Fig. 1. It is repeated iteratively in both phases. The slack and artificial variables are commonly chosen as the initial basis, which always yields a basic feasible solution.

3 Parallelization of Revised Simplex Algorithm

As shown in Fig. 1, the principal cost of determining the entering variable and the leaving variable is the same: a vector-matrix multiplication followed by a search for a minimum value. These two steps contribute the major part of the computation, more than 98% of the total computation on large test data. The third step updates the basic variables and the basis inverse. The parallelization of the revised simplex algorithm therefore amounts to applying parallel vector-matrix multiplication and minimum reduction repeatedly at each iteration.
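To make the kernel of Fig. 1 concrete, the following is a minimal, dense, sequential sketch of one iteration written with NumPy. The function name, variable names, and the dense data layout are illustrative assumptions only; the paper's actual implementation is sparse and distributed across the hypercube.

```python
import numpy as np

def revised_simplex_iteration(A, b_bar, c, B_inv, basic, nonbasic):
    """One pricing / ratio-test / update step of Fig. 1; returns False at optimality.
    A, b_bar, c, B_inv are float NumPy arrays; basic/nonbasic are integer index
    arrays. B_inv, b_bar, basic, and nonbasic are modified in place."""
    pi = -c[basic] @ B_inv                        # simplex multipliers: pi^T B = -c_B^T
    reduced = c[nonbasic] + pi @ A[:, nonbasic]   # c'_j = c_j + sum_i pi_i a_ij
    s_local = int(np.argmin(reduced))
    if reduced[s_local] >= 0:                     # optimality condition
        return False
    s = nonbasic[s_local]                         # entering variable
    Y_s = B_inv @ A[:, s]                         # Y_s = B^{-1} A_s
    if np.all(Y_s <= 0):
        raise ValueError("problem is unbounded")
    ratios = np.full_like(b_bar, np.inf)          # feasibility (ratio) test
    pos = Y_s > 0
    ratios[pos] = b_bar[pos] / Y_s[pos]
    r = int(np.argmin(ratios))                    # leaving variable is in row r
    theta = b_bar[r] / Y_s[r]
    # Product-form update of the basis inverse and of the right-hand side.
    pivot_row = B_inv[r, :] / Y_s[r]
    B_inv -= np.outer(Y_s, B_inv[r, :]) / Y_s[r]
    B_inv[r, :] = pivot_row
    b_bar -= Y_s * theta
    b_bar[r] = theta
    nonbasic[s_local], basic[r] = basic[r], s     # exchange entering/leaving variables
    return True
```

In the parallel version described below, the pricing and ratio-test loops become distributed vector-matrix products followed by minimum reductions, and the dense basis inverse is replaced by a sparse, linked-list representation.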

1. Determine the entering variable by using the optimality condition,
   $c'_s = \min\{\, c'_j : j = 1, 2, \ldots, n \text{ and } x_j \text{ is nonbasic} \,\}$, where $c'_j = c_j + \sum_{i=1}^{m} \pi_i a_{ij}$.
   If $c'_s \ge 0$, then the current value is optimal and the algorithm stops.

2. Determine the leaving variable by using the feasibility condition,
   $\dfrac{b_r}{y_{rs}} = \min\{\, b_i / y_{is} : i = 1, 2, \ldots, m \text{ and } y_{is} > 0 \,\}$, where $Y_s = (y_{is}),\ i = 1, 2, \ldots, m$, and $Y_s = B^{-1} A_s$.
   If $y_{is} \le 0$ for all $i$, then the problem is unbounded and the algorithm stops.

3. The basic variable in row $r$ is replaced by variable $x_s$, and the new basis is
   $B = [A_{j_1}, \ldots, A_{j_{r-1}}, A_s, A_{j_{r+1}}, \ldots, A_{j_m}]$.
   Update the basis inverse $B^{-1}$, the vector of simplex multipliers $\pi$, and the right-hand-side vector $b$ by
   $B^{-1} \leftarrow B^{-1} - Y_s B^{-1}(r,:)/y_{rs}$, $\quad \pi \leftarrow \pi - c'_s B^{-1}(r,:)$, $\quad b \leftarrow b - Y_s\, b_r / y_{rs}$.

Fig. 1. The Revised Simplex Algorithm

$A$ is a sparse matrix with two properties: (1) it is accessed exclusively by column, so a column-by-column storage scheme is used; and (2) it is read-only during the entire computation. The nonzero elements of matrix $A$ are packed into a one-dimensional array aa, where aa(k) is the kth nonzero element of $A$ in column-major order. Another array ia, of the same size as aa, contains the row indices corresponding to the nonzero elements of $A$. In addition, a one-dimensional array ja points to the index in aa where each column of $A$ begins.

Matrix $A$ is partitioned column-wise because of the nature of the vector-matrix multiplication; if row partitioning were chosen, a row-wise reduction would introduce heavy communication. Notice that the sparsity of matrix $A$ is not evenly distributed. For example, the right portion of matrix $A$, consisting of slack and artificial variable columns with only one nonzero element each, is usually sparser than the left portion. To distribute the computation density evenly, a scatter column partitioning scheme is used [9]. The vector $\pi$ is duplicated on all nodes to eliminate communication during the multiplication. In addition, $\pi$ is expanded into a full-length storage vector. Thus the multiplication of the vector $\pi$ and matrix $A$ can be performed efficiently because (1) no searching for $\pi_i$ is needed, and (2) the amount of work performed depends only on the number of nonzero elements of matrix $A$. The entire multiplication is then performed in parallel without any communication, leaving the result vector $c'$ distributed. A reduction is performed to find the minimal value of $c'$. This completes the first multiplication and reduction.
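The sketch below illustrates the aa/ia/ja storage and the communication-free local part of this first pricing step, using SciPy's compressed-sparse-column arrays to play the role of aa, ia, and ja. The SciPy types, the helper name local_pricing, and the toy data are assumptions for illustration and are not taken from the paper's implementation; in the real code, a global minimum reduction across the hypercube would follow the per-node minima.

```python
import numpy as np
from scipy.sparse import csc_matrix

def local_pricing(A_csc, c, pi, my_columns):
    """Reduced costs c'_j = c_j + sum_i pi_i a_ij for the columns owned by one node."""
    aa, ia, ja = A_csc.data, A_csc.indices, A_csc.indptr   # packed nonzeros of A
    reduced = np.empty(len(my_columns))
    for k, j in enumerate(my_columns):
        # pi is kept as a full-length dense vector, so no index search is needed
        cols = slice(ja[j], ja[j + 1])
        reduced[k] = c[j] + pi[ia[cols]] @ aa[cols]
    return reduced

# Example: with scatter partitioning on P processors, node 0 owns columns 0, P, 2P, ...
A = csc_matrix(np.array([[2.0, 0.0, 1.0, 0.0],
                         [0.0, 3.0, 0.0, 1.0]]))
c = np.array([1.0, -2.0, 0.0, 0.0])
pi = np.array([0.5, -1.0])
print(local_pricing(A, c, pi, my_columns=[0, 2]))   # -> [2.0, 0.5]
```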

Compared to the first multiplication, the second one is a multiplication of a matrix by a vector, instead of a vector by a matrix. Matrix $B^{-1}$ is therefore partitioned by row to avoid communication. Vector $A_s$ is duplicated by broadcasting column $s$ of $A$ once the entering variable is selected. This broadcast serves as the only communication interface between the two steps; the rest is handled as in the first step.

In the third step, the basic variable in row $r$ is replaced by variable $x_s$, and the new basis $B$ is given by $B = [A_{j_1}, \ldots, A_{j_{r-1}}, A_s, A_{j_{r+1}}, \ldots, A_{j_m}]$. The basis $B$ is not represented physically. Instead, the basis inverse $B^{-1}$ is manipulated and must be updated. This update may generate many fill-ins; therefore, a linked-list scheme is used to store $B^{-1}$. With the scatter partitioning technique, the load can be well balanced. The performance in Table 1 shows that a nearly linear speedup can be obtained with this algorithm.

Table 1
Execution Time (in seconds) of the Revised Simplex Algorithm

    Matrix      Size        Number of Processors
    SHARE2B     97 x
    SC
    SC
    BRANDY      221 x
    BANDM       306 x

4 Revised Simplex Algorithm with LU Decomposition

The standard revised simplex algorithm described above is based on constructing the basis inverse $B^{-1}$ and updating the inverse after every iteration. While this standard algorithm has the advantages of simplicity and ease of parallelization, it is not the best one in terms of sparsity or round-off error behavior. This algorithm produces many fill-ins in $B^{-1}$. A large number of fill-ins not only increases the execution time but also occupies too much memory space; consequently, large test data cannot fit in memory. An alternative is the method of Bartels and Golub, which is based on the LU decomposition of the basis matrix $B$ with row exchanges [2]. Compared to the standard algorithm, it differs in the way the two linear systems
$\pi^T B = -c_B^T$ and $B Y_s = A_s$
are solved. With the original Bartels-Golub algorithm, the LU decomposition must be applied to the newly constructed basis matrix $B$ at each iteration. This operation is computationally more expensive than the update of the basis inverse $B^{-1}$ in the standard case. As a compromise, a better implementation takes advantage of the fact that the newly constructed $B_k$ differs from the preceding $B_{k-1}$ in only one column after $k$ iterations. We can construct an eta matrix $E$, which differs from the identity matrix in only one column, referred to as its eta column, to satisfy
$B_k = B_{k-1} E_k = B_{k-2} E_{k-1} E_k = \cdots = B_0 E_1 E_2 \cdots E_k$.
This eta factorization therefore suggests another way of solving the two linear systems at iteration $k$:
$((((\pi^T B_0) E_1) E_2) \cdots) E_k = -c_B^T$ and $B_0 (E_1 (E_2 (\cdots (E_k Y_s)))) = A_s$,
where $B_0$ is decomposed into $L$ and $U$.
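The eta-matrix identity can be checked numerically. The short sketch below builds an eta matrix whose $r$th column is $Y_s = B_{k-1}^{-1} A_s$ and verifies $B_k = B_{k-1} E_k$; the matrices, the column index, and the use of dense NumPy arrays are made up for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
m, r = 4, 2
B_prev = rng.random((m, m)) + np.eye(m)      # previous (nonsingular) basis B_{k-1}
A_s = rng.random(m)                          # column entering the basis

Y_s = np.linalg.solve(B_prev, A_s)           # Y_s = B_{k-1}^{-1} A_s
E = np.eye(m)
E[:, r] = Y_s                                # eta matrix: identity except column r

B_new = B_prev.copy()
B_new[:, r] = A_s                            # basis after the column exchange
print(np.allclose(B_prev @ E, B_new))        # True: B_k = B_{k-1} E_k
```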

The E solver is a special case of the linear system solver and consists of a sequence of $k$ eta-column modifications of the solution vector. Assume an array $p$ records the position of each eta column within its original eta matrix. To solve the eta matrix system, the $l$th modification can be described as follows: the system $E_l (E_{l+1}(\cdots(E_k X))) = Z_{l-1}$ is reduced to $E_{l+1}(\cdots(E_k X)) = Z_l$, where
$z_{l, p(l)} = z_{l-1, p(l)} / e_{l, p(l)}$ and $z_{l,i} = z_{l-1,i} - e_{l,i}\, z_{l, p(l)}$ for $i \ne p(l)$.
After $k$ modifications, $Z_k$ is the resulting vector. However, the number of nonzero elements of the eta file grows with the number of iterations. As $k$ increases, the storage of all eta matrices becomes large and the E solver becomes slow as well. Therefore, after a certain number of iterations, say $r$, the basis $B_0$ must be refactorized [2]; that is, all the eta matrices are discarded and the current $B_k$ is treated as a new $B_0$ to be decomposed. Such an LU decomposition is performed every $r$ iterations to keep the time and storage of the E solver within an acceptable range. The improved algorithm is summarized in Fig. 2. In our implementation, $r$ is set to a fixed value.

1. Determine the entering variable by using the optimality condition,
   $c'_s = \min\{\, c'_j : j = 1, 2, \ldots, n \text{ and } x_j \text{ is nonbasic} \,\}$, where $c'_j = c_j + \sum_{i=1}^{m} \pi_i a_{ij}$.
   If $c'_s \ge 0$, then the current value is optimal and the algorithm stops.

2. Solve $B_0 (E_1 (E_2(\cdots(E_k Y_s)))) = A_s$.
   If $y_{is} \le 0$ for $i = 1, 2, \ldots, m$, then the problem is unbounded and the algorithm stops.

3. Determine the leaving variable by using the feasibility condition,
   $\dfrac{b_r}{y_{rs}} = \min\{\, b_i / y_{is} : y_{is} > 0 \text{ and } i = 1, 2, \ldots, m \,\}$.

4. The basic variable in row $r$ is replaced by variable $x_s$, and the new basis is
   $B = [A_{j_1}, \ldots, A_{j_{r-1}}, A_s, A_{j_{r+1}}, \ldots, A_{j_m}]$.

5. If $k \ge r$, then treat $B_k$ as the new $B_0$, decompose $B_0$ to obtain new matrices $L$ and $U$, and reset $k$ to 1; else store $E_{k+1} = Y_s$ and increment $k$.

6. Solve $((((\pi^T B_0) E_1) E_2) \cdots) E_k = -c_B^T$.

Fig. 2. The Revised Simplex Algorithm with LU Decomposition

5 Parallelization of Revised Simplex Algorithm with LU Decomposition

Parallelization of the simplex algorithm with LU decomposition is more difficult. As experiments indicate, the major computation cost is distributed among the matrix multiplication in step 1 of Fig. 2; the linear system solvers and E solvers, for $Y_s$ in step 2 and for $\pi$ in step 6, respectively; and the LU decomposition in step 5 when it is applied. The parallelization of the matrix multiplication in step 1 is the same as in the standard algorithm, as are the storage scheme and the data partitioning strategy. For the two linear system solvers, the basis matrix $B_0$ serves as the coefficient matrix for $r$ iterations before the next refactorization takes place.
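The following sketch implements the sequence of eta-column modifications that makes up the E solver, following the per-column formula above. The list-of-(position, column) representation of the eta file and the dense toy data are illustrative assumptions, not the paper's data structures.

```python
import numpy as np

def eta_solve(eta_file, z):
    """Solve E_1 (E_2 (... (E_k x))) = z for x, given eta_file = [(p_l, e_l), ...]."""
    z = z.astype(float)
    for p, e in eta_file:                 # l = 1, 2, ..., k in order
        z[p] = z[p] / e[p]                # z_{l,p(l)} = z_{l-1,p(l)} / e_{l,p(l)}
        mask = np.arange(len(z)) != p
        z[mask] -= e[mask] * z[p]         # z_{l,i} = z_{l-1,i} - e_{l,i} z_{l,p(l)}
    return z

# Tiny check: with a single eta matrix E (eta column e at position p),
# eta_solve should return E^{-1} z.
m, p = 3, 1
e = np.array([0.5, 2.0, -1.0])
E = np.eye(m); E[:, p] = e
z = np.array([1.0, 4.0, 2.0])
print(np.allclose(eta_solve([(p, e)], z), np.linalg.solve(E, z)))   # True
```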

We do not expect a good speedup from the triangular solver, because the matrices $L$ and $U$ are very sparse and the computation time spent in the triangular solvers is small compared to the other steps. For scalability, we distribute the factorized triangular matrices $L$ and $U$ to minimize the overhead of the $2r$ triangular solves. For either $L$ or $U$, partitioning by row is a simple way to reduce overhead. Furthermore, block partitioning is used to increase the grain size between two consecutive communications. Although a scatter partitioning scheme could improve load balance in theory, the large number of communications, proportional to the number of rows $m$, would overwhelm this potential benefit, especially for a sparse matrix. With a block-row partitioning scheme, the total number of communications for each triangular solve equals the number of physical processors used, independent of $m$. A sequential sketch of this block-row solve is given after this paragraph group.

It is not practical to parallelize the E solver, because (i) the $l$th eta-column modification depends on the result of the $(l-1)$th one; (ii) the modification of each element depends on the value of the diagonal element of its eta column; and (iii) each modification involves only a small amount of computation. Due to such heavy data dependences and small grain sizes, the E solver would not benefit from parallel execution; it is therefore executed sequentially, and all the eta columns are stored on a single physical processor, with two provisions: if the time spent in this sequential part grows too long, we can adjust the frequency of refactorization to keep it under control; and, for scalability, we can if necessary distribute the $k$ eta columns and have the column modifications executed in turn on each processor.

The basis refactorization is performed every $r$ iterations. A new algorithm is used to factorize the basis matrix efficiently [13]. The idea is to take advantage of the fact that the basis matrix $B$ is extremely sparse and has many column singletons and row singletons. As part of the matrix is phased out by rearranging singletons, the remaining matrix to be factorized may have new singletons generated. Therefore, the column or row exchange of singletons is reapplied continuously until no more singletons exist; at that point, the rest of the matrix is factorized in the regular manner. By the column and row exchanges of singletons, we can reduce the size of the submatrix that is actually factorized and, sometimes, eliminate the factorization completely. During the singleton-exchange phase, first, no fill-ins are introduced since there is no updating; and second, the search for singletons can easily be conducted in parallel since there is no updating and therefore no dependency.
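The sketch referred to above simulates the block-row forward substitution sequentially: each of the P "processors" owns a contiguous block of rows of a lower-triangular matrix, solves its diagonal block using the partial solution computed by earlier blocks, and would then pass the result on, so only P messages are needed per solve. The function name, the dense matrix, and the sequential simulation are illustrative assumptions; no actual message passing is shown.

```python
import numpy as np

def block_row_forward_solve(L, b, P):
    """Solve L y = b with rows of L split into P contiguous blocks."""
    m = len(b)
    bounds = np.linspace(0, m, P + 1, dtype=int)   # block-row boundaries
    y = np.zeros(m)
    for q in range(P):                             # "processor" q in turn
        lo, hi = bounds[q], bounds[q + 1]
        # subtract contributions of rows solved by earlier processors
        rhs = b[lo:hi] - L[lo:hi, :lo] @ y[:lo]
        # local forward substitution on the diagonal block
        for i in range(lo, hi):
            y[i] = (rhs[i - lo] - L[i, lo:i] @ y[lo:i]) / L[i, i]
        # here y[lo:hi] would be sent onward: one message per block
    return y

L = np.tril(np.random.default_rng(1).random((6, 6))) + 2 * np.eye(6)
b = np.arange(6, dtype=float)
print(np.allclose(block_row_forward_solve(L, b, P=3), np.linalg.solve(L, b)))  # True
```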
In general, a simple application algorithm has only a single computation task on a single data domain. For this class of problems, the major data domain should be distributed such that the computation task can be executed most effectively in parallel. LU decomposition, matrix multiplication, the triangular solver, and so on belong to this class. A complex application problem, such as the revised simplex method with LU decomposition, consists of many computation steps and many underlying data domains, and these computation steps usually depend on each other. If we simply distribute each underlying data domain with only the parallelization of its own computation step in mind, we may end up with a mismatch between two underlying data domains. A global optimum may not be reached simply by using the locally optimal data distribution for each step; sometimes a sub-optimal local data distribution must be used to reduce interface overhead. There are two major interfaces in this algorithm. The first is the interface between matrix $A$ and the basis $B$: the current new basis is constructed by collecting $m$ columns from matrix $A$. Recall that matrix $A$ has been distributed with scatter-column partitioning.

One way to construct matrix $B$ is to extract the columns of the basic variables from $A$ and redistribute them to obtain a locally optimal data distribution. However, data redistribution between different computation steps is beneficial only when it is crucial for the performance of the next step. Here, the refactorization is not expected to achieve a perfect speedup, and the benefit of redistribution is less than its cost. Therefore, we leave the extracted columns on the physical processors where they reside. With the distribution inherited from matrix $A$, the basis matrix $B$ is not necessarily evenly distributed among processors, leading to a sub-optimal data distribution. The second interface is between the basis $B$ and the matrices $L$ and $U$ after refactorization. Since $L$ and $U$ are used in the subsequent triangular solves $2r$ times, it is worthwhile to carry out the redistribution. The unevenly distributed scatter-column basis $B$ is redistributed to a block-row scheme for $L$ and $U$, which is the optimal scheme for the triangular solver. Without redistribution, the matrices $L$ and $U$ would remain in the column-scatter scheme, and the communication cost of the subsequent triangular solves would be even worse.
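As a toy illustration of the two ownership maps at these interfaces, the snippet below contrasts the cyclic (scatter) column ownership that the basis columns inherit from $A$, which may be uneven, with the contiguous block-row ownership used for $L$ and $U$ after redistribution. The index data and helper names are made up for illustration.

```python
from collections import Counter

def scatter_column_owner(j, P):
    return j % P                          # ownership inherited from A

def block_row_owner(i, m, P):
    return min(i * P // m, P - 1)         # contiguous block-row ownership

P, m = 4, 8
basic_columns = [0, 4, 8, 12, 16, 3, 7, 2]        # hypothetical basis column indices
print(dict(Counter(scatter_column_owner(j, P) for j in basic_columns)))
# -> {0: 5, 3: 2, 2: 1}: B is unevenly spread across processors
print([block_row_owner(i, m, P) for i in range(m)])
# -> [0, 0, 1, 1, 2, 2, 3, 3]: rows of L and U in even contiguous blocks
```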

This algorithm is very sparse and therefore difficult to parallelize because of its small granularity. The triangular solvers and E solvers are hard to speed up, and redistribution involves a large communication overhead. The basic strategy used in this implementation is to apply the best possible parallelization techniques to the most computationally dense part, the matrix multiplication, to obtain maximum speedup. This part accounts for more than 70% of the total execution time and achieves a speedup of around 7.5 on 8 processors. The LU decomposition part takes less than 10% of the total time and rarely shows a speedup. The triangular solvers and the E solvers take about 10% of the total time, and their parallel execution time increases with the number of processors, dominating the overall performance. The performance is shown in Table 2. Compared with Table 1, BRANDY and BANDM execute much faster on a single processor but show almost no speedup; when the number of processors is larger than 8, these two data sets run slower. For large data sets, such as SHIP08L, SCSD8, and SHIP12L, up to a two-fold speedup can be obtained. More importantly, these data sets cannot be run with the basic revised simplex algorithm because they do not fit in memory.

Table 2
Execution Time (in seconds) of the Revised Simplex Algorithm with LU Decomposition

    Matrix      Size        Number of Processors
    BRANDY      221 x
    BANDM       306 x
    SHIP08L     779 x
    SCSD8       398 x
    SHIP12L     1152 x

6 Conclusions

Two revised simplex algorithms, with and without LU decomposition, have been implemented on the Intel iPSC/2 hypercube computers. The test data sets are from netlib. These data sets represent realistic problems from industrial applications, ranging from small-scale to large-scale; the execution times range from a few seconds to thousands of seconds. Although the basic revised simplex algorithm is relatively easier to implement, both algorithms need to be carefully tuned to obtain the best possible performance. We emphasized sophisticated parallelization techniques, such as partitioning patterns, data distribution, communication reduction, and the interfaces between computation steps. Our experience shows that for a complicated problem, each computation step requires a different data distribution. The interfaces between steps affect the choice of data distribution and, consequently, the communication. In summary, parallelizing sparse simplex algorithms is not easy. The experimental results show that a nearly linear speedup can be obtained with the basic revised simplex algorithm. On the other hand, because of very sparse matrices and very heavy communication, the revised simplex algorithm with LU decomposition can hardly achieve a good speedup, even with carefully selected partitioning patterns and communication optimization.

Acknowledgments

The authors wish to thank Yong Li for his contribution to the linear programming algorithms and sequential codes.

References

[1] G. Astfalk, I. Lustig, R. Marsten, and D. Shanno, The Interior-Point Method for Linear Programming, IEEE Software, July 1992, pages 61-67.
[2] R. H. Bartels and G. H. Golub, The Simplex Method of Linear Programming Using LU Decomposition, Comm. ACM, 12, pages 266-268, 1969.
[3] G. B. Dantzig, Linear Programming and Extensions, Princeton University Press, New Jersey, 1963.
[4] G. B. Dantzig and W. Orchard-Hays, The Product Form for the Inverse in the Simplex Method, Mathematical Tables and Other Aids to Computation, 8, pages 64-67, 1954.
[5] R. A. Finkel, Large-grain Parallelism: Three Case Studies, in The Characteristics of Parallel Algorithms, L. H. Jamieson, ed., The MIT Press, 1987.
[6] H. F. Ho, G. H. Chen, S. H. Lin, and J. P. Sheu, Solving Linear Programming on Fixed-Size Hypercubes, ICPP'88.
[7] N. Karmarkar, A New Polynomial-Time Algorithm for Linear Programming, Combinatorica, Vol. 4, 1984, pages 373-395.
[8] Y. Li, M. Wu, W. Shu, and G. Fox, Linear Programming Algorithms and Parallel Implementations, SCCS Report 288, Syracuse University, May.
[9] D. M. Nicol and J. H. Saltz, An Analysis of Scatter Decomposition, IEEE Trans. Computers, C-39(11), pages 1337-1345, November 1990.
[10] T. Sheu and W. Lin, Mapping Linear Programming Algorithms onto the Butterfly Parallel Processor.
[11] C. B. Stunkel and D. C. Reed, Hypercube Implementation of the Simplex Algorithm, Proc. 4th Conf. on Hypercube Concurrent Computers and Applications, March 1989.
[12] Y. Wu and T. G. Lewis, Performance of Parallel Simplex Algorithms, Department of Computer Science, Oregon State University.
[13] M. Wu and Y. Li, Fast LU Decomposition for Sparse Simplex Method, SIAM Conference on Parallel Processing for Scientific Computing, March 1993.


More information

The driving motivation behind the design of the Janus framework is to provide application-oriented, easy-to-use and ecient abstractions for the above

The driving motivation behind the design of the Janus framework is to provide application-oriented, easy-to-use and ecient abstractions for the above Janus a C++ Template Library for Parallel Dynamic Mesh Applications Jens Gerlach, Mitsuhisa Sato, and Yutaka Ishikawa fjens,msato,ishikawag@trc.rwcp.or.jp Tsukuba Research Center of the Real World Computing

More information

TASK FLOW GRAPH MAPPING TO "ABUNDANT" CLIQUE PARALLEL EXECUTION GRAPH CLUSTERING PARALLEL EXECUTION GRAPH MAPPING TO MAPPING HEURISTIC "LIMITED"

TASK FLOW GRAPH MAPPING TO ABUNDANT CLIQUE PARALLEL EXECUTION GRAPH CLUSTERING PARALLEL EXECUTION GRAPH MAPPING TO MAPPING HEURISTIC LIMITED Parallel Processing Letters c World Scientic Publishing Company FUNCTIONAL ALGORITHM SIMULATION OF THE FAST MULTIPOLE METHOD: ARCHITECTURAL IMPLICATIONS MARIOS D. DIKAIAKOS Departments of Astronomy and

More information

What is linear programming (LP)? NATCOR Convex Optimization Linear Programming 1. Solving LP problems: The standard simplex method

What is linear programming (LP)? NATCOR Convex Optimization Linear Programming 1. Solving LP problems: The standard simplex method NATCOR Convex Optimization Linear Programming 1 Julian Hall School of Mathematics University of Edinburgh jajhall@ed.ac.uk 14 June 2016 What is linear programming (LP)? The most important model used in

More information

Reduction of Huge, Sparse Matrices over Finite Fields Via Created Catastrophes

Reduction of Huge, Sparse Matrices over Finite Fields Via Created Catastrophes Reduction of Huge, Sparse Matrices over Finite Fields Via Created Catastrophes Carl Pomerance and J. W. Smith CONTENTS 1. Introduction 2. Description of the Method 3. Outline of Experiments 4. Conclusion

More information

Very Large-scale Linear. Programming: A Case Study. Exploiting Both Parallelism and. Distributed Memory. Anne Kilgore. December, 1993.

Very Large-scale Linear. Programming: A Case Study. Exploiting Both Parallelism and. Distributed Memory. Anne Kilgore. December, 1993. Very Large-scale Linear Programming: A Case Study Exploiting Both Parallelism and Distributed Memory Anne Kilgore CRPC-TR93354-S December, 1993 Center for Research on Parallel Computation Rice University

More information

which isaconvex optimization problem in the variables P = P T 2 R nn and x 2 R n+1. The algorithm used in [6] is based on solving this problem using g

which isaconvex optimization problem in the variables P = P T 2 R nn and x 2 R n+1. The algorithm used in [6] is based on solving this problem using g Handling Nonnegative Constraints in Spectral Estimation Brien Alkire and Lieven Vandenberghe Electrical Engineering Department University of California, Los Angeles (brien@alkires.com, vandenbe@ee.ucla.edu)

More information

8ns. 8ns. 16ns. 10ns COUT S3 COUT S3 A3 B3 A2 B2 A1 B1 B0 2 B0 CIN CIN COUT S3 A3 B3 A2 B2 A1 B1 A0 B0 CIN S0 S1 S2 S3 COUT CIN 2 A0 B0 A2 _ A1 B1

8ns. 8ns. 16ns. 10ns COUT S3 COUT S3 A3 B3 A2 B2 A1 B1 B0 2 B0 CIN CIN COUT S3 A3 B3 A2 B2 A1 B1 A0 B0 CIN S0 S1 S2 S3 COUT CIN 2 A0 B0 A2 _ A1 B1 Delay Abstraction in Combinational Logic Circuits Noriya Kobayashi Sharad Malik C&C Research Laboratories Department of Electrical Engineering NEC Corp. Princeton University Miyamae-ku, Kawasaki Japan

More information