PAPER Design of Optimal Array Processors for Two-Step Division-Free Gaussian Elimination

Shietung PENG and Stanislav G. SEDUKHIN, Nonmembers

SUMMARY The design of array processors for solving linear systems using the two-step division-free Gaussian elimination method is considered. The two-step method can be used to improve systems based on the one-step method in terms of numerical stability as well as the requirements for high precision. In spite of the rather complicated computations needed at each iteration of the two-step method, we develop an innovative parallel algorithm whose data dependency graph meets the requirements for regularity and locality. Then we derive two-dimensional array processors by adopting a systematic approach to investigate the set of all admissible solutions, and obtain the optimal array processors under linear time-space scheduling. The array processors are optimal in terms of the number of processing elements used.

key words: linear system, parallel algorithm, parallel architecture, systolic array processors

1. Introduction

In the area of high-performance computing, the design of division-free array processors has been attracting research attention recently [5], [6], [8], [10]. The main reason is that the division unit in high-performance special-purpose processors is both time and space consuming. Besides, as far as numerical stability is concerned, the cumulative effect of round-off error for division operations also makes division-free algorithms much more appealing.

The one-step division-free Gaussian elimination method was used in [8], [10] to develop optimal array processors. This method may drive the system to high-precision requirements and instability, as described in [1], [3]. The problem arises from the fact that the absolute values of the elements of the updated matrix increase rapidly and eventually reduce the numerical stability of the algorithm for large input matrices.
To increase numerical stability, Bareiss proposed a multistep division-free Gaussian elimination method and showed that the multistep method gives better numerical stability than the one-step method [1]. Because the formulation of the multistep method is irregular and complicated, it is a nontrivial task to design an efficient parallel algorithm and a corresponding array processor based on this method. In this paper, we design a parallel algorithm for the two-step method through proper partitioning and re-indexing. In the design process, we show step by step how to circumvent the irregularity of the original algorithm. Then highly parallel array processors based on this parallel algorithm are systematically designed and analyzed. Two optimal 2-D array processors in terms of the number of PEs are shown in this paper. The key features of these array processors are (1) they are division-free and (2) their numerical stability is enhanced by the two-step method compared with their counterparts based on the one-step method.

The rest of the paper is organized into five sections. Section 2 gives some background for the two-step division-free Gaussian elimination method. A new parallel algorithm based on the two-step method is presented in Sect. 3. In Sect. 4, the 3-D data dependency graph of the new algorithm and its analysis are given. In Sect. 5, two optimal 2-D array processors based on the new algorithm and a systematic approach [9] are described, and their performance is investigated. Finally, in the last section, concluding remarks and further research directions are discussed.

Manuscript received July
Manuscript revised June
The authors are with University of Aizu, Aizu-Wakamatsu-shi, Japan.

2. Two-Step Division-Free Gaussian Elimination

Let a generalized linear system of equations be given by $AX = B$, where $A = [a_{ij}]$, $1 \le i, j \le n$; $B = [b_{ij}]$, $1 \le i \le n$, $n+1 \le j \le m$; and $X = [x_{ij}]$, $1 \le i \le n$, $1 \le j \le m-n$. To solve $AX = B$, the matrix $A$ should be reduced to diagonal form, or to triangular form with subsequent back substitution.
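In general, systolization of the algorithm that reduces $A$ to diagonal form is more difficult than that of the algorithm that reduces it to triangular form. In this paper, we consider the algorithm that reduces the matrix $A$ to diagonal form using the two-step Gaussian elimination method. The simplest (one-step) division-free algorithm for diagonalization is given by the following recurrence formula:

$$
\begin{aligned}
a_{ij}^{(0)} &= a_{ij}, && 1 \le i, j \le n; \\
a_{ij}^{(0)} &= b_{ij}, && 1 \le i \le n,\ n+1 \le j \le m; \\
a_{ij}^{(k)} &= \begin{cases} a_{kj}^{(k-1)}, & \text{if } i = k; \\[4pt] \begin{vmatrix} a_{kk}^{(k-1)} & a_{kj}^{(k-1)} \\ a_{ik}^{(k-1)} & a_{ij}^{(k-1)} \end{vmatrix}, & \text{otherwise}; \end{cases} && 1 \le k \le n,\ 1 \le i \le n,\ k \le j \le m; \\
a_{ii}^{(k)} &= a_{kk}^{(k-1)} a_{ii}^{(k-1)}, && 1 \le k \le n,\ 1 \le i \le k-1,
\end{aligned}
\tag{1}
$$

where $|A|$ is a determinant of matrix $A$. Notice that $a_{ij}^{(k)} = 0$ for $i, j < k$ and $i \ne j$. The advantage of this formula is the absence of division operations. Hence the division-free algorithm avoids division round-off error and thus is more numerically stable than the classical Gaussian elimination algorithm. However, the one-step equation (1) suffers from a rapid increase in the absolute values of the elements of the updated matrix, which eventually requires high-precision computation for large input matrices.

To circumvent this problem, a multistep approach has been proposed by Bareiss [1]. The equations for the two-step division-free Gaussian elimination are given as follows:

$$
\begin{aligned}
a_{ij}^{(0)} &= a_{ij}, && 1 \le i \le n,\ 1 \le j \le m; \\
a_{ij}^{(k)} &= \begin{cases}
\begin{vmatrix} a_{k-1,k-1}^{(k-2)} & a_{k-1,k}^{(k-2)} & a_{k-1,j}^{(k-2)} \\ a_{k,k-1}^{(k-2)} & a_{kk}^{(k-2)} & a_{kj}^{(k-2)} \\ a_{i,k-1}^{(k-2)} & a_{ik}^{(k-2)} & a_{ij}^{(k-2)} \end{vmatrix}, & \text{if } i \ne k \text{ or } k-1; \\[4pt]
a_{kj}^{(k-1)}, & \text{if } i = k; \\[4pt]
\begin{vmatrix} a_{kk}^{(k-1)} & a_{kj}^{(k-1)} \\ a_{k-1,k}^{(k-2)} & a_{k-1,j}^{(k-2)} \end{vmatrix}, & \text{if } i = k-1;
\end{cases} && k = 2, 4, \dots, 2\lfloor n/2 \rfloor,\ 1 \le i \le n,\ k+1 \le j \le m; \\
a_{ii}^{(k)} &= \begin{vmatrix} a_{k-1,k-1}^{(k-2)} & a_{k-1,k}^{(k-2)} \\ a_{k,k-1}^{(k-2)} & a_{kk}^{(k-2)} \end{vmatrix} a_{ii}^{(k-2)}, && k = 2, 4, \dots, 2\lfloor n/2 \rfloor,\ 1 \le i \le k-2; \\
\text{if } n \text{ is odd:} \quad a_{ij}^{(n)} &= a_{nn}^{(n-1)} a_{ij}^{(n-1)} - a_{nj}^{(n-1)} a_{in}^{(n-1)}, && 1 \le i \le n-1,\ n+1 \le j \le m.
\end{aligned}
\tag{2}
$$

Here $a_{kj}^{(k-1)}$ and $a_{kk}^{(k-1)}$ denote the one-step updates of row $k$ obtained from $a^{(k-2)}$ by (1).

It is instructive to obtain Eq. (2) directly from (1) by applying the one-step equation twice and simplifying the result. The simplified result can be expressed as follows:

$$
a_{ij}^{(k)} = a_{k-1,k-1}^{(k-2)} \begin{vmatrix} a_{k-1,k-1}^{(k-2)} & a_{k-1,k}^{(k-2)} & a_{k-1,j}^{(k-2)} \\ a_{k,k-1}^{(k-2)} & a_{kk}^{(k-2)} & a_{kj}^{(k-2)} \\ a_{i,k-1}^{(k-2)} & a_{ik}^{(k-2)} & a_{ij}^{(k-2)} \end{vmatrix}.
\tag{3}
$$

Disregarding the factor $a_{k-1,k-1}^{(k-2)}$ in Eq. (3) yields (2). Therefore, the coefficients of (2) are smaller by a factor $a_{k-1,k-1}^{(k-2)}$ and, in addition, can be obtained from $a_{ij}^{(k-2)}$ more efficiently than those of (1), because some terms cancel and need not be calculated.

The effect of the above transformation on the matrix $[a_{ij}^{(k-2)}]$ can be described as follows. The first equation of (2) for $a_{ij}^{(k)}$, when formally extended to $j = k-1$ and $j = k$, reduces the elements $a_{ij}^{(k-2)}$, $i > k$ and $i < k-1$, to zero for two columns and leaves the elements $a_{k,k-1}^{(k-2)}$ and $a_{k-1,k}^{(k-2)}$ unchanged. It remains to transform $a_{k,k-1}^{(k-2)}$ and $a_{k-1,k}^{(k-2)}$ to zero. However, once the elements $a_{ij}^{(k)}$, $i > k$, have been determined, the element $a_{k,k-1}^{(k-2)}$ can be transformed to zero by (1) for updating $a_{kj}^{(k)}$, as shown in (2). Similarly, the element $a_{k-1,k}^{(k-2)}$ can be transformed to zero by (1) for updating $a_{k-1,j}^{(k)}$. Notice that if $n$ is odd, then additional computations are needed to transform the elements $a_{in}^{(n-1)}$, $i \ne n$, to zero by (1) for $a_{ij}^{(n)}$, $i \ne n$, $n \le j \le m$.

3. A Parallel Algorithm

Regularity is one of the most important factors in designing parallel algorithms and architectures. In [3], it was shown that irregularity makes parallelism hard to expose. In [7], Kung stated the importance of regularity and modularity for designing systolic arrays. In this paper, we define explicitly the regularity of a parallel algorithm. Intuitively, a regular parallel algorithm is one that can be used to construct array processors systematically. The definition of the regularity of an indexed algorithm is based on the data dependency graph (DDG) of the algorithm, defined below (also see [8]).

The index space of an indexed algorithm is the set of all index points $p = (i, j, k)^T$ (in the 3-D case), where each index point is associated with a single computation. A data dependency vector (DDV) is the difference between the index point where a variable is used as an input variable and the index point where that variable is generated as an output variable. The DDG of an indexed algorithm is a directed graph with index points as vertices and DDVs as edges. A parallel algorithm is regular if its DDG satisfies the following two conditions: 1) there do not exist any opposite edges in any dimension; 2) there does not exist any cycle.

The main concerns in the design of a regular parallel algorithm using the two-step algorithm are the computations for $i \ne k$ or $k-1$ in Eq. (2), since they involve quite complicated computation patterns. We will show how to reformulate this part of the computations step by step using partitioning and re-indexing techniques.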
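
The two regularity conditions can be checked mechanically on a small DDG. The following sketch is illustrative only: the function names (`ddvs`, `has_opposite_edges`, `has_cycle`, `is_regular`), the edge-list representation, and the toy graphs are assumptions made for this example and do not come from the paper. Condition 1 is read here as "no dimension in which DDV components occur with both signs", which is one plausible interpretation of "opposite edges".

```python
def ddvs(edges):
    """Data dependency vectors: consumer index point minus producer index point."""
    return {tuple(c - p for p, c in zip(src, dst)) for src, dst in edges}


def has_opposite_edges(edges):
    """True if, in some dimension, DDV components occur with both signs."""
    vectors = ddvs(edges)
    dims = len(next(iter(vectors))) if vectors else 0
    for d in range(dims):
        signs = {(v[d] > 0) - (v[d] < 0) for v in vectors} - {0}
        if len(signs) == 2:
            return True
    return False


def has_cycle(edges):
    """Depth-first search for a directed cycle (fine for small graphs)."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}

    def visit(node):
        colour[node] = GREY
        for nxt in graph.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                return True  # back edge: a cycle exists
            if state == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour.get(n, WHITE) == WHITE and visit(n) for n in list(graph))


def is_regular(edges):
    """Regular DDG in the sense above: no opposite edges and no cycle."""
    return not has_opposite_edges(edges) and not has_cycle(edges)


# Toy DDGs: edges are (producer point, consumer point) pairs of (i, j, k).
pipelined = [((1, 1, 1), (2, 1, 1)), ((1, 1, 1), (1, 2, 1)), ((1, 1, 1), (1, 1, 2))]
bidirectional = [((2, 1, 0), (2, 2, 0)), ((2, 2, 0), (2, 1, 0))]
```

A pair of points that feed each other, as in `bidirectional`, violates both conditions at once, whereas the purely forward-pointing `pipelined` graph satisfies them.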

First, to simplify the computation and to reduce the number of multiplications, three indexed variables $c^{(k-2)}$ are introduced to hold the intermediate results of the computations in the two-step method. We assume that an indexed variable $v_{ij}^{(k)}$ is held at index point $p = (i, j, k)^T$.

+ c(k 2)    $1 \le i \le n$, $i \ne k-1$, $i \ne k$, $k+1 \le j \le m$.    (4)

Consider the computations of $c_{i1}^{(k-2)}$ and $c_{i2}^{(k-2)}$ in (4) for $k = 2$. Since the data item $a_{i1}^{(0)}$ is needed for computing $c_{i2}^{(0)}$ and the data item $a_{i2}^{(0)}$ is needed for computing $c_{i1}^{(0)}$, there are bidirectional edges between index points $p_i = (i, 1, 0)^T$ and $q_i = (i, 2, 0)^T$, $1 \le i \le n$, $i \ne 1$, $i \ne 2$. That is, there are cycles in the corresponding DDG. To solve this problem, we divide the computations into two layers and put the two variables at points $(i, k, k-1)^T$ and $(i, k, k)^T$, respectively, to eliminate the cycles. Therefore, the first step of our algorithm is to re-index these variables into $c^{(k)}$ and $c^{(k-1)}$, respectively, and to partition the computations at each iteration into two layers as follows, which guarantees that the corresponding DDG is acyclic. Notice that after this partitioning, the computation at each index point $p = (i, j, k)^T$ of the algorithm involves up to two multiplications and one addition/subtraction.

c (k) + c (k) c(k 1) where ;    $1 \le i \le n$, $i \ne k-1$, $i \ne k$, $k+1 \le j \le m$.    (5)

Next, to eliminate the opposite edges in the $i$ and $j$ directions in the DDG of Eq. (5), we adopt a technique similar to that used in [8]. We shift the $(k-1)$th and $k$th rows to the $(n+k-1)$th and $(n+k)$th rows in the $(k-1)$th and $k$th layers, respectively. The resulting algorithm is shown in Eq. (6).

$k+1 \le i \le n+k-2$; $k+1 \le i \le n+k-2$, $j > k$; $k+1 \le i \le n+k-2$; c (k) + c(k 1) c (k) where.    (6)

Then we properly distribute the other one-step computations in Eq. (2) into the $(k-1)$th and/or $k$th layers. If n is odd then 2 n 2 2n 1. In this case, the $n$th layer is constructed using the one-step algorithm.
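
The one-step update of (1), which is also what the $n$th layer uses when $n$ is odd, can be sketched in exact integer arithmetic: every row $i \ne k$ is replaced by $a_{kk}$ times row $i$ minus $a_{ik}$ times row $k$, and a single division per unknown is performed only at the very end. This is a minimal sequential illustration and not Algorithm 2SDF itself: the function names `one_step_division_free` and `solve` are hypothetical, no pivoting is done (every pivot is assumed nonzero), and the row shifting and free-column layout described in this section are deliberately left out.

```python
from fractions import Fraction


def one_step_division_free(a, b):
    """Diagonalise the augmented matrix [A | B] without any division.

    For each pivot k, every row i != k becomes a_kk * row_i - a_ik * row_k.
    Assumes integer entries and nonzero pivots (this sketch does not pivot).
    """
    n = len(a)
    m = [list(row_a) + list(row_b) for row_a, row_b in zip(a, b)]
    for k in range(n):
        pivot = m[k][k]
        if pivot == 0:
            raise ZeroDivisionError("zero pivot; this sketch does not pivot")
        for i in range(n):
            if i == k:
                continue  # the pivot row itself is left unchanged
            factor = m[i][k]
            m[i] = [pivot * x - factor * y for x, y in zip(m[i], m[k])]
    return m


def solve(a, b):
    """Final division step: x_ij = a_ij / a_ii on the diagonalised system."""
    n = len(a)
    m = one_step_division_free(a, b)
    return [[Fraction(m[i][j], m[i][i]) for j in range(n, len(m[i]))]
            for i in range(n)]


# 2x + y = 3, x + 3y = 5
reduced = one_step_division_free([[2, 1], [1, 3]], [[3], [5]])
x = solve([[2, 1], [1, 3]], [[3], [5]])
```

For integer input all intermediate values stay integers, but their magnitudes grow with every elimination step, which is the growth the two-step method is meant to curb.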
Finally, to place the diagonal elements $a_{ii}^{(n)}$ next to the elements $a_{i,n+1}^{(n)}$ for effective data transmission and computations at the last step of the algorithm, we rearrange the initial data to reserve the $(n+1)$th column as free space and shift the $(k-1)$th and $k$th diagonal elements into this column in the $(k-1)$th and $k$th layers, respectively. After diagonalization of matrix $A$, one more step (a division step) is used to get the solution of the generalized linear system. A complete description of the regular two-step division-free (2SDF) algorithm for solving generalized linear systems $AX = B$ is shown below.

Algorithm 2SDF
begin
  /* Initialization */
  forall $1 \le i \le n$, $1 \le j \le n$ do $a_{ij}^{(0)} = a_{ij}$;
  /* Reserve the (n+1)th column as free space */
  forall $1 \le i \le n$, $n+2 \le j \le m+1$ do $a_{ij}^{(0)} = a_{i,j-1}$;
  /* Internal computations */
  for k n 2 do
  begin k
    /* Computation at the (k-1)th layer */
    a(k 2) a(k 2) ;    (a) type
    forall $k+1 \le i \le n+k-2$ do a(k 2) a(k 2) kk ;    (b) type
    n+k 1n+1 a(k 2) ;
    forall $k-1 \le i \le n+k-2$, $j = k-1$ or $k$ do ;
    forall $k+1 \le i \le n+k-2$, $k+1 \le j \le m+1$, $j \ne n+1$ do + c(k 1) ;    (c) type
    forall $k+1 \le j \le m+1$, $j \ne n+1$ do
    begin
      a(k 1) a(k 1) ;    (d) type
      n+ a(k 2) kk ;
    end
    /* Computation at the kth layer */
    forall $k+1 \le i \le n+k-2$ do c (k) a(k 1) a(k 1)    (e) type
    a(k 1) ;    (f) type
    n+kn+1 c(k 1) ;
    forall $k+1 \le j \le m+1$, $j \ne n+1$ do n+kj a(k) ;
    forall $k+1 \le i \le n+k-2$, $k+1 \le j \le m+1$, $j \ne n+1$ do c (k)
    forall $n+1 \le i \le n+k-2$ do a(k 1) ;    (g) type
    /* Computation for ii */
    in+1 c(k 1) in+1 ;    (h) type
    forall $k \le j \le m+1$, $j \ne n+1$ do n+ a(k 1) kj ;    (i) type
  end k
  /* Computation at the nth layer when n is an odd integer */
  if n is odd then
    forall $n < i \le 2n-1$, $n+1 < j \le m+1$ do
    begin
      $a_{ij}^{(n)} = a_{nn}^{(n-1)} a_{ij}^{(n-1)} - a_{nj}^{(n-1)} a_{in}^{(n-1)}$;
      /* Computation for $a_{ii}^{(n)} = a_{nn}^{(n-1)} a_{ii}^{(n-1)}$ */
      $a_{i,n+1}^{(n)} = a_{nn}^{(n-1)} a_{i,n+1}^{(n-1)}$;
    end
  /* Output computations */
  forall $n+1 \le i \le 2n$, $n+2 \le j \le m+1$ do $x_{i-n,j-n-1}^{(n+1)} = a_{ij}^{(n)} / a_{i,n+1}^{(n)}$;
end

4. Data Dependency Graph of Algorithm 2SDF

In this section, we derive a localized DDG of Algorithm 2SDF and then analyze it. First, we need a data pipelining strategy for shared variables so that computations in Algorithm 2SDF involve only local data transfer. The strategy is straightforward, and the resulting localized DDG is shown in Figs. 1 and 2 for the $(k-1)$th and $k$th layers, respectively. If $n$ is an odd integer, then one extra layer (the $n$th layer) is needed. This extra layer of the localized DDG for odd $n$ is shown in Fig. 3. Finally, the $(n+1)$th layer of the DDG for the output computations is shown in Fig. 4.

Fig. 1 The (k-1)th layer of DDG.
Fig. 2 The kth layer of DDG.
Fig. 3 The nth layer of DDG if n is odd.
Fig. 4 The (n+1)th layer of the DDG for output computation.
Fig. 5 The nine types of the nodes at the (k-1)th and kth layers of the DDG.

From Algorithm 2SDF, it is easy to see that nine different types of nodes are needed for computations in the $(k-1)$th and $k$th layers. The function and the location of each type of node in the DDG are listed as follows (see Algorithm 2SDF and Fig. 5).

Type (a) node: computation at $(k, k, k-1)^T$;
Type (b) nodes: computation at $(i, k, k-1)^T$, $k+1 \le i \le n+k-2$;
Type (c) nodes: computation at $(i, j, k-1)^T$, $k+1 \le i \le n+k-2$, $k+1 \le j \le m+1$, $j \ne n+1$;
Type (d) nodes: computation at $(k, j, k-1)^T$, $k+1 \le j \le m+1$, $j \ne n+1$;
Type (e) nodes: compute $a_{n+k-1,j}^{(k-1)}$ at $(n+k-1, j, k-1)^T$, $k+1 \le j \le m+1$, $j \ne n+1$;
Type (f) nodes: compute $c_{ik}^{(k)}$ at $(i, k, k)^T$, $k+1 \le i \le n+k-2$;
Type (g) nodes: computation at $(i, j, k)^T$, $k+1 \le i \le n+k-2$, $k+1 \le j \le m+1$, $j \ne n+1$;
Type (h) nodes: compute $a_{i,n+1}^{(k)}$ at $(i, n+1, k)^T$, $k+1 \le i \le n+k-2$;
Type (i) nodes: compute $a_{n+k,j}^{(k)}$ at $(n+k, j, k)^T$, $k+1 \le j \le m+1$, $j \ne n+1$.

Notice that empty nodes are included in the DDG for proper local transmission of shared data.

Let the index space of the algorithm be $P$. Then $P$ can be expressed as $P = P_{in} \cup P_{int} \cup P_{out}$, where $P_{in}$, $P_{int}$, and $P_{out}$ are the index sets of input, internal, and output computations, respectively. From Algorithm 2SDF and the localized DDG, we have

$$
\begin{aligned}
P_{in} &= \{(i, j, 0)^T \mid 1 \le i \le n,\ 1 \le j \le m+1,\ j \ne n+1\} \subset Z^2 \times \{0\}; \\
P_{int} &= \begin{cases} \bigcup_{1 \le k' \le n/2,\ k = 2k'} (P_{k-1} \cup P_k), & \text{if } n \text{ is even}; \\ \bigcup_{1 \le k' \le \lfloor n/2 \rfloor,\ k = 2k'} (P_{k-1} \cup P_k) \cup P_{odd}, & \text{otherwise}; \end{cases} \\
P_{k-1} &= \{(i, j, k-1)^T \mid k-1 \le i \le n+k-1,\ k-1 \le j \le m+1\}; \\
P_k &= \{(i, j, k)^T \mid k \le i \le n+k,\ k-1 \le j \le m+1\}; \\
P_{odd} &= \{(i, j, n)^T \mid n \le i \le 2n,\ n \le j \le m+1\}; \\
P_{out} &= \{(i, j, n+1)^T \mid n+1 \le i \le 2n,\ n+1 \le j \le m+1\} \subset Z^2 \times \{n+1\}.
\end{aligned}
$$

Next, we show that the number of multiplications in Algorithm 2SDF is $N = 3n^2(2m-n)/4 + O(n^2 + mn)$. It is easy to see that the number of multiplications in Algorithm 2SDF is dominated by the computations of the (c) and (g) types. That is, the number of multiplications in the computations of all other types is $O(n^2 + mn)$. The computations of the (c) and (g) types have the same number of iterations for each fixed $k$, which is $(n-2)(m-k)$, with 3 multiplications together in each iteration. Therefore, the total number of multiplications for the computations of the (c) and (g) types is

$$
\sum_{l=1}^{n/2} 3(n-2)(m-2l) = \frac{3(n-2)nm}{2} - 6(n-2)\sum_{l=1}^{n/2} l = \frac{3n^2 m}{2} - \frac{3n^3}{4} + O(mn + n^2).
$$

This completes the proof.

The longest path in the DDG is $\rho = p_{min} \to p_{max}$, where $p_{min} = (1, 1, 1)^T$ and $p_{max} = (2n, m+1, n+1)^T$. Therefore, we have $|\rho| = 3n + m - 1$, where $|\rho|$, the length of the path $\rho$, is the Manhattan distance between $p_{min}$ and $p_{max}$.

A timing (step) function $step(p): P_{int} \to Z_+$, which assigns a computational time step to each index point $p \in P_{int}$, is defined as follows:

$$step(p) = i + j + k - 3.$$

The step function can also be specified in the linear form $step(p) = \lambda p + \gamma$, where $\lambda = (1, 1, 1)$ and $\gamma = -3$. This function defines a set of hyperplanes orthogonal to the schedule vector $\lambda$ on the index space of the algorithm. Equal values of the timing function are shown by dashed lines in the figures. The minimal computation time of the algorithm is $T(m, n) = T_{min}(DDG) = step(p_{max}) = 3n + m - 1$, assuming $step(p_{min}) = 0$. From the discussion above, we have $T(m, n) = |\rho|$.
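
The counting argument and the schedule of this section can be checked numerically. The sketch below is only a sanity check under the formulas stated above, for even $n$; the helper names (`mults_c_and_g`, `closed_form`, `leading_term`, `step`, `manhattan`) and the sample sizes $n = 6$, $m = 9$ are assumptions chosen for illustration.

```python
from fractions import Fraction


def mults_c_and_g(n, m):
    """Multiplications in the (c)- and (g)-type nodes for even n:
    sum over l = 1..n/2 of 3 (n - 2)(m - 2l)."""
    return sum(3 * (n - 2) * (m - 2 * l) for l in range(1, n // 2 + 1))


def closed_form(n, m):
    """3(n-2)nm/2 - 6(n-2) * sum(l), using sum(l) = (n/2)(n/2 + 1)/2."""
    half = n // 2
    return 3 * (n - 2) * n * m // 2 - 6 * (n - 2) * (half * (half + 1) // 2)


def leading_term(n, m):
    """Dominant term 3 n^2 (2m - n) / 4 of the multiplication count."""
    return Fraction(3 * n * n * (2 * m - n), 4)


def step(p):
    """Timing function step(p) = i + j + k - 3."""
    i, j, k = p
    return i + j + k - 3


def manhattan(p, q):
    """Manhattan distance between two index points."""
    return sum(abs(a - b) for a, b in zip(p, q))


n, m = 6, 9
p_min, p_max = (1, 1, 1), (2 * n, m + 1, n + 1)
total = mults_c_and_g(n, m)
remainder = total - leading_term(n, m)  # the lower-order part of the count
```

For these sizes the lower-order remainder equals $-3nm + 3n$, which is within the stated $O(mn + n^2)$ bound, and `step(p_max)` coincides with the Manhattan length $3n + m - 1$ of the longest path.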
The allocation function $place(p): Z^3 \to Z^2$ is defined in the linear form $place(p) = \Lambda_\eta p$, where $\Lambda_\eta$ is a $(2 \times 3)$ matrix of the linear transformation corresponding to a projection vector $\eta \in \ker \Lambda_\eta$. Notice that to obtain the correct input/output data flows in 2-D array processors, we have to find new positions of the input/output matrix elements in the 3-D index space, i.e., redefine the $P_{in}$ and $P_{out}$ domains [9].

5. Design of Optimal Array Processors

Many 2-D array processors can be derived by projecting the localized 3-D DDG along different admissible projection vectors. A projection vector $\eta$ is admissible if and only if $\lambda \eta \ne 0$. This condition guarantees that each PE executes at most one computation (one index point in the DDG) at any given time step. In order to find an optimal design of a 2-D array processor from the given localized DDG, we need to select from the space of all admissible solutions based on some integrated criteria, such as the number of PEs, data pipelining period, computation time, array topology, number of I/O ports, etc. In this paper, for the given algorithm (and its DDG), we say that an array processor is optimal if (1) it uses the minimum number of PEs among all array processors obtained by admissible projections and (2) its total computation time equals the length of the longest path in the DDG ($3n + m - 1$ in this case).

It can be shown that for the localized DDG we adopted, there are 13 admissible projections from 17 possible ones [9]. After having obtained and tested all admissible array processors, it can be shown that the two solutions generated by projecting the localized DDG along the $i$ axis and the $j$ axis have the minimal number of PEs. Each of the two array processors has the minimal number of PEs in a certain range of $m$ and $n$. We will discuss this in the rest of the section.

The 2-D array processor $S_{(010)}$ that is generated by mapping the localized DDG along the direction $\eta_1 = (0, 1, 0)^T$ is shown in Fig. 6 for the case of $n = 5$ and $m = 7$.
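
To make the admissibility condition concrete, the following sketch projects a set of index points along $\eta_1 = (0, 1, 0)^T$ and checks that no PE receives two computations in the same time step. It is an illustration under simplifying assumptions: the index set is a plain rectangular box rather than the actual index space $P$ of Algorithm 2SDF, the allocation for an axis-aligned $\eta$ simply drops that coordinate, and the names `is_admissible`, `place`, and `conflict_free` are hypothetical.

```python
from itertools import product

LAMBDA = (1, 1, 1)  # schedule vector of step(p) = i + j + k - 3


def step(p):
    """Linear timing function lambda . p + gamma with gamma = -3."""
    return sum(l * x for l, x in zip(LAMBDA, p)) - 3


def is_admissible(eta):
    """A projection vector is admissible iff lambda . eta != 0."""
    return sum(l * e for l, e in zip(LAMBDA, eta)) != 0


def place(p, eta):
    """Project p along a unit axis vector eta by dropping that coordinate.

    Only axis-aligned projection vectors are handled in this sketch.
    """
    if sorted(abs(e) for e in eta) != [0, 0, 1]:
        raise ValueError("this sketch only handles unit axis vectors")
    axis = [abs(e) for e in eta].index(1)
    return tuple(x for d, x in enumerate(p) if d != axis)


def conflict_free(points, eta):
    """True if no two index points share both a PE and a time step."""
    seen = set()
    for p in points:
        key = (place(p, eta), step(p))
        if key in seen:
            return False
        seen.add(key)
    return True


# Toy index set: 1 <= i <= 4, 1 <= j <= 6, 1 <= k <= 3 (not the real P).
box = list(product(range(1, 5), range(1, 7), range(1, 4)))
eta_1, eta_2 = (0, 1, 0), (1, 0, 0)
pes_eta_1 = {place(p, eta_1) for p in box}
```

For both $\eta_1$ and $\eta_2 = (1, 0, 0)^T$ the product $\lambda \eta$ equals 1, so points that collapse onto the same PE always carry distinct time steps, whereas a vector such as $(1, -1, 0)^T$ fails the admissibility test.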
This array processor is an array of PEs and programmable register-latches (PRLs). The number of PEs in this array processor is

N (010) P n 2 + n 2 if n is even; n (2n 1) otherwise 2

which is independent of $m$. It is easy to verify from the DDG and the selected projection that there are eight types of PEs and four types of PRLs. The eight types of PEs are depicted in Figs. 6(a)-(h). A functional description of each type of PE can be obtained easily from Figs. 5(a)-(i) as well as from the projection of Figs. 1-4 along the direction $\eta_1$.

Fig. 6 The optimal array processor $S_{(010)}$ and its PE types.

The rhombic array processor $S_{(010)}$ simulates the 3-D DDG without time extension, i.e., it solves a single task in time $T_{(010)}(1, n, m) = T_{min}(DDG) = 3n + m - 1$. The data pipelining period, defined as the time interval separating neighboring items of input or output data, is $\alpha = |\lambda \eta_1| = 1$, and the block pipelining period, defined as the time interval between the initiations of two successive task instances, is $\beta = m + 1$, i.e., the next task can be pushed into this array processor after $m + 1$ time steps. The number of I/O ports is $2n$. Also, it is evident that for $l$ tasks

$$T_{(010)}(l, n, m) = T_{(010)}(1, n, m) + (l - 1)\beta = 3n + l(m + 1) - 2.$$

Another optimal design can be obtained by mapping the DDG along the direction $\eta_2 = (1, 0, 0)^T$. The corresponding array processor, PE types, and input/output data flows are shown in Fig. 7 for the case of $n = 5$ and $m = 7$. The number of PEs of the array is

mn n2 N (100) P 2 + m n if n is even 2 n (2m n) otherwise. 2

There are eight types of PEs (see Figs. 7(a)-(h)) and only one type of register-latch. It is not difficult to show that

N (100) P N (010) P { } 3n 1 if m min 3n2 +2n 3n (n +1) 2

This means that array processor $S_{(100)}$ is optimal in terms of the number of PEs if $m \le (3n - 1)/2$.

Fig. 7 The optimal array processor $S_{(100)}$ and its PE types.

The functions carried out in each of the eight types of PEs are relatively simple considering the complexity of Algorithm 2SDF. The computations carried out in each time step are no more than two multiplications and one addition/subtraction. The total computation time of this array is $T_{(100)}(1, n, m) = T_{min}(DDG) = 3n + m - 1$. The data pipelining period $\alpha$ equals $|\lambda \eta_2| = 1$. The block pipelining period $\beta$ equals $m$, i.e., for $l$ tasks, $T_{(100)}(l, n, m) = 3n + lm - 1$.

6. Concluding Remarks

The design of new array processors in this paper shows that a numerical algorithm that involves rather high computational irregularity can be arranged to perform pipelined computations in an elegant way. The array processors designed in this paper are a step towards designing efficient special-purpose array processors based on an irregular algorithm, namely the two-step division-free Gaussian elimination method. The techniques used in our design should be applicable to other numerical algorithms with rather complicated structure. For example, with minor modification, the techniques used here can also be used to develop array processors for two-step fraction-free (integer-preserving) solutions of linear systems with integer coefficients.

Another direction for further research is to consider asynchronous array models. Asynchronous arrays, e.g., wavefront array processors [7], are good alternatives for efficiency and flexibility in massively parallel processing. Finally, mapping the proposed parallel algorithm effectively onto existing massively parallel computers using proper partitioning techniques is also worth further research [2].

Acknowledgment

We would like to thank the reviewers for their constructive comments, which improved the quality of this paper.

References

[1] E.H. Bareiss, "Sylvester's identity and multistep integer-preserving Gaussian elimination," Mathematics of Computation, vol.22, pp
[2] X. Chen and G.M. Megson, "A general methodology of partitioning and mapping for given regular arrays," IEEE Trans. Parallel & Distributed Systems, vol.6, no.10, pp
[3] G.C. Fox, R.D. Williams, and P.C. Messina, Parallel Computing Works, Morgan Kaufmann Publishers, San Francisco
[4] L. Fox, An Introduction to Numerical Linear Algebra, Clarendon Press, Oxford
[5] E.N. Frantzeskakis and K.J.R. Liu, "A class of square root and division-free algorithms and architectures for QRD-based adaptive signal processing," IEEE Trans. Signal Processing, vol.42, no.9, pp
[6] J. Gotze and J. Schwegelshohn, "A square root and division-free Givens rotation for solving least squares problems on systolic arrays," SIAM J. Sci. Statist. Comput., vol.12, pp, July
[7] S.Y. Kung, VLSI Array Processors, Prentice Hall
[8] S. Peng and S.G. Sedukhin, "Array processors design for division-free linear system solving," The Computer Journal, vol.39, no.8, pp
[9] S.G. Sedukhin and I.S. Sedukhin, "Systematic approach and software tool for systolic design," Lecture Notes in Computer Science, vol.854, pp
[10] S.G. Sedukhin, "An algorithm and array processors for solving the systems of linear equations," Proc. Intern. Conf. Parallel and Distributed Processing Techniques and Applications (PDPTA '95), pp, Athens, Georgia, Nov

Stanislav G. Sedukhin is a professor of computer science at University of Aizu. His research interests are in distributed and parallel high-performance computing and communications, parallel algorithms, and architectural synthesis of application-specific VLSI processors. He received his Candidate of Sciences (Ph.D.) and Doctor of Physical & Mathematical Sciences (Dr.Sci.) degrees in Computer Science from the Russian (former USSR) Academy of Sciences in 1982 and 1993, respectively. Prof. Sedukhin is a member of the IEEE Computer Society, ACM, SIAM, and AMS.

Shietung Peng received his M.S. and Ph.D. degrees in Computer Science from the University of Texas in 1984 and 1986, respectively. He was with the University of Maryland from 1986 to He is currently with University of Aizu. His research interests include parallel and distributed processing, parallel algorithms, parallel architectures, and high-performance computing. He is a member of ACM and the IEEE Computer Society.


More information

Parallel and perspective projections such as used in representing 3d images.

Parallel and perspective projections such as used in representing 3d images. Chapter 5 Rotations and projections In this chapter we discuss Rotations Parallel and perspective projections such as used in representing 3d images. Using coordinates and matrices, parallel projections

More information

On the Number of Tilings of a Square by Rectangles

On the Number of Tilings of a Square by Rectangles University of Tennessee, Knoxville Trace: Tennessee Research and Creative Exchange University of Tennessee Honors Thesis Projects University of Tennessee Honors Program 5-2012 On the Number of Tilings

More information

Calculation of extended gcd by normalization

Calculation of extended gcd by normalization SCIREA Journal of Mathematics http://www.scirea.org/journal/mathematics August 2, 2018 Volume 3, Issue 3, June 2018 Calculation of extended gcd by normalization WOLF Marc, WOLF François, LE COZ Corentin

More information

LARP / 2018 ACK : 1. Linear Algebra and Its Applications - Gilbert Strang 2. Autar Kaw, Transforming Numerical Methods Education for STEM Graduates

LARP / 2018 ACK : 1. Linear Algebra and Its Applications - Gilbert Strang 2. Autar Kaw, Transforming Numerical Methods Education for STEM Graduates Triangular Factors and Row Exchanges LARP / 28 ACK :. Linear Algebra and Its Applications - Gilbert Strang 2. Autar Kaw, Transforming Numerical Methods Education for STEM Graduates Then there were three

More information

Carry-Free Radix-2 Subtractive Division Algorithm and Implementation of the Divider

Carry-Free Radix-2 Subtractive Division Algorithm and Implementation of the Divider Tamkang Journal of Science and Engineering, Vol. 3, No., pp. 29-255 (2000) 29 Carry-Free Radix-2 Subtractive Division Algorithm and Implementation of the Divider Jen-Shiun Chiang, Hung-Da Chung and Min-Show

More information

What is linear programming (LP)? NATCOR Convex Optimization Linear Programming 1. Solving LP problems: The standard simplex method

What is linear programming (LP)? NATCOR Convex Optimization Linear Programming 1. Solving LP problems: The standard simplex method NATCOR Convex Optimization Linear Programming 1 Julian Hall School of Mathematics University of Edinburgh jajhall@ed.ac.uk 14 June 2016 What is linear programming (LP)? The most important model used in

More information

SOLVING SYSTEMS OF LINEAR INTERVAL EQUATIONS USING THE INTERVAL EXTENDED ZERO METHOD AND MULTIMEDIA EXTENSIONS

SOLVING SYSTEMS OF LINEAR INTERVAL EQUATIONS USING THE INTERVAL EXTENDED ZERO METHOD AND MULTIMEDIA EXTENSIONS Please cite this article as: Mariusz Pilarek, Solving systems of linear interval equations using the "interval extended zero" method and multimedia extensions, Scientific Research of the Institute of Mathematics

More information

ARITHMETIC operations based on residue number systems

ARITHMETIC operations based on residue number systems IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 53, NO. 2, FEBRUARY 2006 133 Improved Memoryless RNS Forward Converter Based on the Periodicity of Residues A. B. Premkumar, Senior Member,

More information

Application of Two-dimensional Periodic Cellular Automata in Image Processing

Application of Two-dimensional Periodic Cellular Automata in Image Processing International Journal of Computer, Mathematical Sciences and Applications Serials Publications Vol. 5, No. 1-2, January-June 2011, pp. 49 55 ISSN: 0973-6786 Application of Two-dimensional Periodic Cellular

More information

Advance Convergence Characteristic Based on Recycling Buffer Structure in Adaptive Transversal Filter

Advance Convergence Characteristic Based on Recycling Buffer Structure in Adaptive Transversal Filter Advance Convergence Characteristic ased on Recycling uffer Structure in Adaptive Transversal Filter Gwang Jun Kim, Chang Soo Jang, Chan o Yoon, Seung Jin Jang and Jin Woo Lee Department of Computer Engineering,

More information

MOST attention in the literature of network codes has

MOST attention in the literature of network codes has 3862 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 56, NO. 8, AUGUST 2010 Efficient Network Code Design for Cyclic Networks Elona Erez, Member, IEEE, and Meir Feder, Fellow, IEEE Abstract This paper introduces

More information

ENERGY-EFFICIENT VLSI REALIZATION OF BINARY64 DIVISION WITH REDUNDANT NUMBER SYSTEMS 1 AVANIGADDA. NAGA SANDHYA RANI

ENERGY-EFFICIENT VLSI REALIZATION OF BINARY64 DIVISION WITH REDUNDANT NUMBER SYSTEMS 1 AVANIGADDA. NAGA SANDHYA RANI ENERGY-EFFICIENT VLSI REALIZATION OF BINARY64 DIVISION WITH REDUNDANT NUMBER SYSTEMS 1 AVANIGADDA. NAGA SANDHYA RANI 2 BALA KRISHNA.KONDA M.Tech, Assistant Professor 1,2 Eluru College Of Engineering And

More information

Lecture notes on the simplex method September We will present an algorithm to solve linear programs of the form. maximize.

Lecture notes on the simplex method September We will present an algorithm to solve linear programs of the form. maximize. Cornell University, Fall 2017 CS 6820: Algorithms Lecture notes on the simplex method September 2017 1 The Simplex Method We will present an algorithm to solve linear programs of the form maximize subject

More information

Implementation of Two Level DWT VLSI Architecture

Implementation of Two Level DWT VLSI Architecture V. Revathi Tanuja et al Int. Journal of Engineering Research and Applications RESEARCH ARTICLE OPEN ACCESS Implementation of Two Level DWT VLSI Architecture V. Revathi Tanuja*, R V V Krishna ** *(Department

More information

CS 204 Lecture Notes on Elementary Network Analysis

CS 204 Lecture Notes on Elementary Network Analysis CS 204 Lecture Notes on Elementary Network Analysis Mart Molle Department of Computer Science and Engineering University of California, Riverside CA 92521 mart@cs.ucr.edu October 18, 2006 1 First-Order

More information

2 Computation with Floating-Point Numbers

2 Computation with Floating-Point Numbers 2 Computation with Floating-Point Numbers 2.1 Floating-Point Representation The notion of real numbers in mathematics is convenient for hand computations and formula manipulations. However, real numbers

More information

Project Report. 1 Abstract. 2 Algorithms. 2.1 Gaussian elimination without partial pivoting. 2.2 Gaussian elimination with partial pivoting

Project Report. 1 Abstract. 2 Algorithms. 2.1 Gaussian elimination without partial pivoting. 2.2 Gaussian elimination with partial pivoting Project Report Bernardo A. Gonzalez Torres beaugonz@ucsc.edu Abstract The final term project consist of two parts: a Fortran implementation of a linear algebra solver and a Python implementation of a run

More information

Identifying Layout Classes for Mathematical Symbols Using Layout Context

Identifying Layout Classes for Mathematical Symbols Using Layout Context Rochester Institute of Technology RIT Scholar Works Articles 2009 Identifying Layout Classes for Mathematical Symbols Using Layout Context Ling Ouyang Rochester Institute of Technology Richard Zanibbi

More information

Error Detecting and Correcting Code Using Orthogonal Latin Square Using Verilog HDL

Error Detecting and Correcting Code Using Orthogonal Latin Square Using Verilog HDL Error Detecting and Correcting Code Using Orthogonal Latin Square Using Verilog HDL Ch.Srujana M.Tech [EDT] srujanaxc@gmail.com SR Engineering College, Warangal. M.Sampath Reddy Assoc. Professor, Department

More information

Finite Element Analysis Prof. Dr. B. N. Rao Department of Civil Engineering Indian Institute of Technology, Madras. Lecture - 36

Finite Element Analysis Prof. Dr. B. N. Rao Department of Civil Engineering Indian Institute of Technology, Madras. Lecture - 36 Finite Element Analysis Prof. Dr. B. N. Rao Department of Civil Engineering Indian Institute of Technology, Madras Lecture - 36 In last class, we have derived element equations for two d elasticity problems

More information

Synthesis of Constrained nr Planar Robots to Reach Five Task Positions

Synthesis of Constrained nr Planar Robots to Reach Five Task Positions Robotics: Science and Systems 007 Atlanta, GA, USA, June 7-30, 007 Synthesis of Constrained nr Planar Robots to Reach Five Task Positions Gim Song Soh Robotics and Automation Laboratory University of California

More information

REGULAR GRAPHS OF GIVEN GIRTH. Contents

REGULAR GRAPHS OF GIVEN GIRTH. Contents REGULAR GRAPHS OF GIVEN GIRTH BROOKE ULLERY Contents 1. Introduction This paper gives an introduction to the area of graph theory dealing with properties of regular graphs of given girth. A large portion

More information

DECOMPOSITION is one of the important subjects in

DECOMPOSITION is one of the important subjects in Proceedings of the Federated Conference on Computer Science and Information Systems pp. 561 565 ISBN 978-83-60810-51-4 Analysis and Comparison of QR Decomposition Algorithm in Some Types of Matrix A. S.

More information

Generalized Network Flow Programming

Generalized Network Flow Programming Appendix C Page Generalized Network Flow Programming This chapter adapts the bounded variable primal simplex method to the generalized minimum cost flow problem. Generalized networks are far more useful

More information

On Algebraic Expressions of Generalized Fibonacci Graphs

On Algebraic Expressions of Generalized Fibonacci Graphs On Algebraic Expressions of Generalized Fibonacci Graphs MARK KORENBLIT and VADIM E LEVIT Department of Computer Science Holon Academic Institute of Technology 5 Golomb Str, PO Box 305, Holon 580 ISRAEL

More information

15. The Software System ParaLab for Learning and Investigations of Parallel Methods

15. The Software System ParaLab for Learning and Investigations of Parallel Methods 15. The Software System ParaLab for Learning and Investigations of Parallel Methods 15. The Software System ParaLab for Learning and Investigations of Parallel Methods... 1 15.1. Introduction...1 15.2.

More information

Distributed Detection in Sensor Networks: Connectivity Graph and Small World Networks

Distributed Detection in Sensor Networks: Connectivity Graph and Small World Networks Distributed Detection in Sensor Networks: Connectivity Graph and Small World Networks SaeedA.AldosariandJoséM.F.Moura Electrical and Computer Engineering Department Carnegie Mellon University 5000 Forbes

More information

Lecture Notes on Karger s Min-Cut Algorithm. Eric Vigoda Georgia Institute of Technology Last updated for Advanced Algorithms, Fall 2013.

Lecture Notes on Karger s Min-Cut Algorithm. Eric Vigoda Georgia Institute of Technology Last updated for Advanced Algorithms, Fall 2013. Lecture Notes on Karger s Min-Cut Algorithm. Eric Vigoda Georgia Institute of Technology Last updated for 4540 - Advanced Algorithms, Fall 2013. Today s topic: Karger s min-cut algorithm [3]. Problem Definition

More information

DOWNLOAD PDF BIG IDEAS MATH VERTICAL SHRINK OF A PARABOLA

DOWNLOAD PDF BIG IDEAS MATH VERTICAL SHRINK OF A PARABOLA Chapter 1 : BioMath: Transformation of Graphs Use the results in part (a) to identify the vertex of the parabola. c. Find a vertical line on your graph paper so that when you fold the paper, the left portion

More information

Deduction and Logic Implementation of the Fractal Scan Algorithm

Deduction and Logic Implementation of the Fractal Scan Algorithm Deduction and Logic Implementation of the Fractal Scan Algorithm Zhangjin Chen, Feng Ran, Zheming Jin Microelectronic R&D center, Shanghai University Shanghai, China and Meihua Xu School of Mechatronical

More information

Transformations: 2D Transforms

Transformations: 2D Transforms 1. Translation Transformations: 2D Transforms Relocation of point WRT frame Given P = (x, y), translation T (dx, dy) Then P (x, y ) = T (dx, dy) P, where x = x + dx, y = y + dy Using matrix representation

More information

AMS526: Numerical Analysis I (Numerical Linear Algebra)

AMS526: Numerical Analysis I (Numerical Linear Algebra) AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 1: Course Overview; Matrix Multiplication Xiangmin Jiao Stony Brook University Xiangmin Jiao Numerical Analysis I 1 / 21 Outline 1 Course

More information

PARALLEL COMPUTATION OF THE SINGULAR VALUE DECOMPOSITION ON TREE ARCHITECTURES

PARALLEL COMPUTATION OF THE SINGULAR VALUE DECOMPOSITION ON TREE ARCHITECTURES PARALLEL COMPUTATION OF THE SINGULAR VALUE DECOMPOSITION ON TREE ARCHITECTURES Zhou B. B. and Brent R. P. Computer Sciences Laboratory Australian National University Canberra, ACT 000 Abstract We describe

More information

Adaptive Surface Modeling Using a Quadtree of Quadratic Finite Elements

Adaptive Surface Modeling Using a Quadtree of Quadratic Finite Elements Adaptive Surface Modeling Using a Quadtree of Quadratic Finite Elements G. P. Nikishkov University of Aizu, Aizu-Wakamatsu 965-8580, Japan niki@u-aizu.ac.jp http://www.u-aizu.ac.jp/ niki Abstract. This

More information

International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research)

International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research) International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research) International Journal of Emerging Technologies in Computational

More information

On Massively Parallel Algorithms to Track One Path of a Polynomial Homotopy

On Massively Parallel Algorithms to Track One Path of a Polynomial Homotopy On Massively Parallel Algorithms to Track One Path of a Polynomial Homotopy Jan Verschelde joint with Genady Yoffe and Xiangcheng Yu University of Illinois at Chicago Department of Mathematics, Statistics,

More information

Geometric and Thematic Integration of Spatial Data into Maps

Geometric and Thematic Integration of Spatial Data into Maps Geometric and Thematic Integration of Spatial Data into Maps Mark McKenney Department of Computer Science, Texas State University mckenney@txstate.edu Abstract The map construction problem (MCP) is defined

More information

Expectation and Maximization Algorithm for Estimating Parameters of a Simple Partial Erasure Model

Expectation and Maximization Algorithm for Estimating Parameters of a Simple Partial Erasure Model 608 IEEE TRANSACTIONS ON MAGNETICS, VOL. 39, NO. 1, JANUARY 2003 Expectation and Maximization Algorithm for Estimating Parameters of a Simple Partial Erasure Model Tsai-Sheng Kao and Mu-Huo Cheng Abstract

More information

Computer-aided design and visualization of regular algorithm dependence graphs and processor array architectures

Computer-aided design and visualization of regular algorithm dependence graphs and processor array architectures Computer-aided design and visualization of regular algorithm dependence graphs and processor array architectures Oleg Maslennikow, Natalia Maslennikowa, Przemysław Sołtan Department of Electronics Technical

More information

Implementation of A Optimized Systolic Array Architecture for FSBMA using FPGA for Real-time Applications

Implementation of A Optimized Systolic Array Architecture for FSBMA using FPGA for Real-time Applications 46 IJCSNS International Journal of Computer Science and Network Security, VOL.8 No.3, March 2008 Implementation of A Optimized Systolic Array Architecture for FSBMA using FPGA for Real-time Applications

More information

Systems of Linear Equations and their Graphical Solution

Systems of Linear Equations and their Graphical Solution Proceedings of the World Congress on Engineering and Computer Science Vol I WCECS, - October,, San Francisco, USA Systems of Linear Equations and their Graphical Solution ISBN: 98-988-95-- ISSN: 8-958

More information

A Ripple Carry Adder based Low Power Architecture of LMS Adaptive Filter

A Ripple Carry Adder based Low Power Architecture of LMS Adaptive Filter A Ripple Carry Adder based Low Power Architecture of LMS Adaptive Filter A.S. Sneka Priyaa PG Scholar Government College of Technology Coimbatore ABSTRACT The Least Mean Square Adaptive Filter is frequently

More information

An Efficient List-Ranking Algorithm on a Reconfigurable Mesh with Shift Switching

An Efficient List-Ranking Algorithm on a Reconfigurable Mesh with Shift Switching IJCSNS International Journal of Computer Science and Network Security, VOL.7 No.6, June 2007 209 An Efficient List-Ranking Algorithm on a Reconfigurable Mesh with Shift Switching Young-Hak Kim Kumoh National

More information

ISSN (Online), Volume 1, Special Issue 2(ICITET 15), March 2015 International Journal of Innovative Trends and Emerging Technologies

ISSN (Online), Volume 1, Special Issue 2(ICITET 15), March 2015 International Journal of Innovative Trends and Emerging Technologies VLSI IMPLEMENTATION OF HIGH PERFORMANCE DISTRIBUTED ARITHMETIC (DA) BASED ADAPTIVE FILTER WITH FAST CONVERGENCE FACTOR G. PARTHIBAN 1, P.SATHIYA 2 PG Student, VLSI Design, Department of ECE, Surya Group

More information

A Novel Methodology for Designing Radix-2 n Serial-Serial Multipliers

A Novel Methodology for Designing Radix-2 n Serial-Serial Multipliers Journal of Computer Science 6 (4): 461-469, 21 ISSN 1549-3636 21 Science Publications A Novel Methodology for Designing Radix-2 n Serial-Serial Multipliers Abdurazzag Sulaiman Almiladi Department of Computer

More information

A DH-parameter based condition for 3R orthogonal manipulators to have 4 distinct inverse kinematic solutions

A DH-parameter based condition for 3R orthogonal manipulators to have 4 distinct inverse kinematic solutions Wenger P., Chablat D. et Baili M., A DH-parameter based condition for R orthogonal manipulators to have 4 distinct inverse kinematic solutions, Journal of Mechanical Design, Volume 17, pp. 150-155, Janvier

More information

The Recursive Dual-net and its Applications

The Recursive Dual-net and its Applications The Recursive Dual-net and its Applications Yamin Li 1, Shietung Peng 1, and Wanming Chu 2 1 Department of Computer Science Hosei University Tokyo 184-8584 Japan {yamin, speng}@k.hosei.ac.jp 2 Department

More information

A New Architecture for Multihop Optical Networks

A New Architecture for Multihop Optical Networks A New Architecture for Multihop Optical Networks A. Jaekel 1, S. Bandyopadhyay 1 and A. Sengupta 2 1 School of Computer Science, University of Windsor Windsor, Ontario N9B 3P4 2 Dept. of Computer Science,

More information

Speed-up of Parallel Processing of Divisible Loads on k-dimensional Meshes and Tori

Speed-up of Parallel Processing of Divisible Loads on k-dimensional Meshes and Tori The Computer Journal, 46(6, c British Computer Society 2003; all rights reserved Speed-up of Parallel Processing of Divisible Loads on k-dimensional Meshes Tori KEQIN LI Department of Computer Science,

More information

A COMPARISON OF MESHES WITH STATIC BUSES AND HALF-DUPLEX WRAP-AROUNDS. and. and

A COMPARISON OF MESHES WITH STATIC BUSES AND HALF-DUPLEX WRAP-AROUNDS. and. and Parallel Processing Letters c World Scientific Publishing Company A COMPARISON OF MESHES WITH STATIC BUSES AND HALF-DUPLEX WRAP-AROUNDS DANNY KRIZANC Department of Computer Science, University of Rochester

More information

Performance Analysis of Gray Code based Structured Regular Column-Weight Two LDPC Codes

Performance Analysis of Gray Code based Structured Regular Column-Weight Two LDPC Codes IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) e-issn: 2278-2834,p- ISSN: 2278-8735.Volume 11, Issue 4, Ver. III (Jul.-Aug.2016), PP 06-10 www.iosrjournals.org Performance Analysis

More information

Q. Wang National Key Laboratory of Antenna and Microwave Technology Xidian University No. 2 South Taiba Road, Xi an, Shaanxi , P. R.

Q. Wang National Key Laboratory of Antenna and Microwave Technology Xidian University No. 2 South Taiba Road, Xi an, Shaanxi , P. R. Progress In Electromagnetics Research Letters, Vol. 9, 29 38, 2009 AN IMPROVED ALGORITHM FOR MATRIX BANDWIDTH AND PROFILE REDUCTION IN FINITE ELEMENT ANALYSIS Q. Wang National Key Laboratory of Antenna

More information

A Quantitative Approach for Textural Image Segmentation with Median Filter

A Quantitative Approach for Textural Image Segmentation with Median Filter International Journal of Advancements in Research & Technology, Volume 2, Issue 4, April-2013 1 179 A Quantitative Approach for Textural Image Segmentation with Median Filter Dr. D. Pugazhenthi 1, Priya

More information

Pipeline Givens sequences for computing the QR decomposition on a EREW PRAM q

Pipeline Givens sequences for computing the QR decomposition on a EREW PRAM q Parallel Computing 32 (2006) 222 230 www.elsevier.com/locate/parco Pipeline Givens sequences for computing the QR decomposition on a EREW PRAM q Marc Hofmann a, *, Erricos John Kontoghiorghes b,c a Institut

More information

A NON-TRIGONOMETRIC, PSEUDO AREA PRESERVING, POLYLINE SMOOTHING ALGORITHM

A NON-TRIGONOMETRIC, PSEUDO AREA PRESERVING, POLYLINE SMOOTHING ALGORITHM A NON-TRIGONOMETRIC, PSEUDO AREA PRESERVING, POLYLINE SMOOTHING ALGORITHM Wayne Brown and Leemon Baird Department of Computer Science The United States Air Force Academy 2354 Fairchild Dr., Suite 6G- USAF

More information

THE DESIGN OF STRUCTURED REGULAR LDPC CODES WITH LARGE GIRTH. Haotian Zhang and José M. F. Moura

THE DESIGN OF STRUCTURED REGULAR LDPC CODES WITH LARGE GIRTH. Haotian Zhang and José M. F. Moura THE DESIGN OF STRUCTURED REGULAR LDPC CODES WITH LARGE GIRTH Haotian Zhang and José M. F. Moura Department of Electrical and Computer Engineering Carnegie Mellon University, Pittsburgh, PA 523 {haotian,

More information

Fixed Point LMS Adaptive Filter with Low Adaptation Delay

Fixed Point LMS Adaptive Filter with Low Adaptation Delay Fixed Point LMS Adaptive Filter with Low Adaptation Delay INGUDAM CHITRASEN MEITEI Electronics and Communication Engineering Vel Tech Multitech Dr RR Dr SR Engg. College Chennai, India MR. P. BALAVENKATESHWARLU

More information

Scan-Based BIST Diagnosis Using an Embedded Processor

Scan-Based BIST Diagnosis Using an Embedded Processor Scan-Based BIST Diagnosis Using an Embedded Processor Kedarnath J. Balakrishnan and Nur A. Touba Computer Engineering Research Center Department of Electrical and Computer Engineering University of Texas

More information

Quaternion Rotations AUI Course Denbigh Starkey

Quaternion Rotations AUI Course Denbigh Starkey Major points of these notes: Quaternion Rotations AUI Course Denbigh Starkey. What I will and won t be doing. Definition of a quaternion and notation 3 3. Using quaternions to rotate any point around an

More information

Byzantine Consensus in Directed Graphs

Byzantine Consensus in Directed Graphs Byzantine Consensus in Directed Graphs Lewis Tseng 1,3, and Nitin Vaidya 2,3 1 Department of Computer Science, 2 Department of Electrical and Computer Engineering, and 3 Coordinated Science Laboratory

More information

DUE to the high computational complexity and real-time

DUE to the high computational complexity and real-time IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 15, NO. 3, MARCH 2005 445 A Memory-Efficient Realization of Cyclic Convolution and Its Application to Discrete Cosine Transform Hun-Chen

More information

Space-filling curves for 2-simplicial meshes created with bisections and reflections

Space-filling curves for 2-simplicial meshes created with bisections and reflections Space-filling curves for 2-simplicial meshes created with bisections and reflections Dr. Joseph M. Maubach Department of Mathematics Eindhoven University of Technology Eindhoven, The Netherlands j.m.l.maubach@tue.nl

More information

Optimization Problems Under One-sided (max, min)-linear Equality Constraints

Optimization Problems Under One-sided (max, min)-linear Equality Constraints WDS'12 Proceedings of Contributed Papers, Part I, 13 19, 2012. ISBN 978-80-7378-224-5 MATFYZPRESS Optimization Problems Under One-sided (max, min)-linear Equality Constraints M. Gad Charles University,

More information

LOW-DENSITY PARITY-CHECK (LDPC) codes [1] can

LOW-DENSITY PARITY-CHECK (LDPC) codes [1] can 208 IEEE TRANSACTIONS ON MAGNETICS, VOL 42, NO 2, FEBRUARY 2006 Structured LDPC Codes for High-Density Recording: Large Girth and Low Error Floor J Lu and J M F Moura Department of Electrical and Computer

More information

Sparse Component Analysis (SCA) in Random-valued and Salt and Pepper Noise Removal

Sparse Component Analysis (SCA) in Random-valued and Salt and Pepper Noise Removal Sparse Component Analysis (SCA) in Random-valued and Salt and Pepper Noise Removal Hadi. Zayyani, Seyyedmajid. Valliollahzadeh Sharif University of Technology zayyani000@yahoo.com, valliollahzadeh@yahoo.com

More information

Module 7 VIDEO CODING AND MOTION ESTIMATION

Module 7 VIDEO CODING AND MOTION ESTIMATION Module 7 VIDEO CODING AND MOTION ESTIMATION Version ECE IIT, Kharagpur Lesson Block based motion estimation algorithms Version ECE IIT, Kharagpur Lesson Objectives At the end of this less, the students

More information

Flexible-Hybrid Sequential Floating Search in Statistical Feature Selection

Flexible-Hybrid Sequential Floating Search in Statistical Feature Selection Flexible-Hybrid Sequential Floating Search in Statistical Feature Selection Petr Somol 1,2, Jana Novovičová 1,2, and Pavel Pudil 2,1 1 Dept. of Pattern Recognition, Institute of Information Theory and

More information

Rollout Algorithms for Discrete Optimization: A Survey

Rollout Algorithms for Discrete Optimization: A Survey Rollout Algorithms for Discrete Optimization: A Survey by Dimitri P. Bertsekas Massachusetts Institute of Technology Cambridge, MA 02139 dimitrib@mit.edu August 2010 Abstract This chapter discusses rollout

More information

Graph Adjacency Matrix Automata Joshua Abbott, Phyllis Z. Chinn, Tyler Evans, Allen J. Stewart Humboldt State University, Arcata, California

Graph Adjacency Matrix Automata Joshua Abbott, Phyllis Z. Chinn, Tyler Evans, Allen J. Stewart Humboldt State University, Arcata, California Graph Adjacency Matrix Automata Joshua Abbott, Phyllis Z. Chinn, Tyler Evans, Allen J. Stewart Humboldt State University, Arcata, California Abstract We define a graph adjacency matrix automaton (GAMA)

More information

DM545 Linear and Integer Programming. Lecture 2. The Simplex Method. Marco Chiarandini

DM545 Linear and Integer Programming. Lecture 2. The Simplex Method. Marco Chiarandini DM545 Linear and Integer Programming Lecture 2 The Marco Chiarandini Department of Mathematics & Computer Science University of Southern Denmark Outline 1. 2. 3. 4. Standard Form Basic Feasible Solutions

More information