Reduced-Complexity Column-Layered Decoding and. Implementation for LDPC Codes

Size: px
Start display at page:

Download "Reduced-Complexity Column-Layered Decoding and. Implementation for LDPC Codes"

Transcription

1 Redued-Complexity Column-Layered Deoding and Implementation for LDPC Codes Zhiqiang Cui 1, Zhongfeng Wang 2, Senior Member, IEEE, and Xinmiao Zhang 3 1 Qualomm In., San Diego, CA 92121, USA 2 Broadom Corp., Irvine, CA 92617, USA 3 Case Western Reserve University, Cleveland, OH 44106, USA Abstrat: Layered deoding is well appreiated in Low-Density Parity-Chek (LDPC) deoder implementation sine it an ahieve effetively high deoding throughput with low omputation omplexity. This work, for the first time, addresses low omplexity olumn-layered deoding shemes and VLSI arhitetures for multi-gb/s appliations. At first, the Min-Sum algorithm is inorporated into the olumn-layered deoding. Then algorithmi transformations and judiious approximations are explored to minimize the overall omputation omplexity. Compared to the original olumn-layered deoding, the new approah an redue the omputation omplexity in hek node proessing for high-rate LDPC odes by up to 90% while maintaining the fast onvergene speed of layered deoding. Furthermore, a relaxed pipelining sheme is presented to enable very high lok speed for VLSI implementation. Equipped with these new tehniques, an effiient deoder arhiteture for quasi-yli LDPC odes is developed and implemented with 0.13um CMOS tehnology. It is shown that a deoding throughput of nearly 4 Gb/s at maximum of 10 iterations an be ahieved for a (4096, 3584) LDPC ode. Hene, this work has failitated pratial appliations of olumn-layered deoding and partiularly made it very attrative in high-speed, high-rate LDPC deoder implementation. 1

2 Index Terms Deoder, Error orretion odes, Low-density parity-hek (LDPC), Quasi-yli (QC) odes, VLSI Arhiteture, Layered deoding. 1 INTRODUCTION Conventionally, LDPC odes are deoded using the Sum-Produt algorithm (SPA) [1] or the modified Min-Sum algorithm (MSA) [2]. In general, the SPA has the best deoding performane. The MSA is an approximation of the SPA aimed to redue omputation omplexity. It is widely employed in LDPC deoder design [3][4][5][6]. Both algorithms are based on two-phase message-passing (TPMP) sheme. In the literature, variations of the SPA and MSA deoding approahes have been investigated to redue the interonnet omplexity in VLSI implementation [15][16]. Reently, layered LDPC deoding shemes [7][8][9][10] have attrated muh attention in both aademy and industry beause they an effetively speed up the onvergene of LDPC deoding and thus redue the required maximum number of deoding iterations. Presently two kinds of layered deoding approahes, i.e., row-layered deoding [7][8][9] and olumn-layered deoding [9][10] have been proposed. In row-layered deoding, the parity hek matrix of the LDPC ode is partitioned into multiple row layers. The message updating is performed row layer by row layer. The olumn-layered deoding employs the similar idea exept that the parity hek matrix is partitioned into multiple olumn layers. The message omputation is performed olumn layer by olumn layer. The implementation of LDPC deoder with row-layered deoding has been widely studied [11][12][13][14]. The SPA-based olumn-layered deoding approah [10] was proposed in In the same year, Radosavljevi [9], et al, proposed a simplifiation of the SPA-based olumn-layered deoding approah. It has been reported that the olumn-layered deoding has a similar onvergene speed as rowlayered deoding. However, the olumn-layered deoding algorithm has attrated muh less attention due to its inherent high omputation omplexity. In this work, we investigate and explore the benefits of the 2

3 olumn-layered deoding approah. We first inorporate the Min-Sum algorithm into the olumn-layered message passing sheme beause it has muh lower omplexity than the SPA. In addition, by deeply investigating the message updating proess in the Min-Sum based olumn-layered deoding, we develop a simplified olumn-layered deoding sheme, whih maximally eliminates the redundant omputations in the original sheme and signifiantly redues the omputation omplexity further with judiious algorithmi approximation. It is shown that up to 90% of omputations in hek node proessing an be saved for high rate LDPC odes. For high-speed VLSI implementation, the proposed design has signifiant advantages over the onventional row-layered deoding. First, the olumn-layered deoding inherently has shorter ritial path. For a row-layered deoder, the messages assoiated to multiple sub-bloks in a row layer an be proessed in one yle to inrease throughput. However, it will inrease the omplexity of hek node unit (CNU) and require serial onatenation of multiple omparison and seletion stages in VLSI implementation. It has been reported that signifiant hardware overhead is required to optimize the orresponding iruitry for high lok speed [12]. In the olumn layered deoding, the major implementation omplexity is assoiated with variable node unit (VNU), partiularly when the messages orresponding to multiple sub-bloks in a olumn layer are proessed in parallel. Beause only addition operations are performed in a VNU, it is very onvenient to employ arithmeti optimization to minimize the ritial path. Seond, in the proposed design, the overall pipeline lateny in deoding a ode blok is equal to the number of pipeline stages while in rowlayered deoding, the same amount of pipeline lateny is introdued for the message updating of every layer. Moreover, the intrinsi message loading lateny is minimized in the olumn-layered deoding beause the deoding an start as soon as the intrinsi messages orresponding to one blok olumn are available. In summary, the proposed design is well suited for very high deoding throughput and low power LDPC deoder implementation. 3

4 2 SIMPLIFIED COLUMN-LAYERED DECODING WITH MIN-SUM ALGORITHM The olumn-layered deoding sheme (also known as the shuffled iterative deoding) for LDPC odes was proposed in [10]. The deoding sheme is based on the SPA algorithm. Similar to row-layered deoding [7][8][9], the maximum number of deoding iterations an be signifiantly redued for the same deoding performane. Sine the Min-Sum algorithm has muh lower implementation omplexity than the SPA algorithm, it is widely utilized in hardware implementation. In this paper, the Min-Sum algorithm is inorporated into the olumn-layered deoding for the first time. Then, algorithmi transformations and intelligent approximations are explored to signifiantly redue the omputation omplexity and memory requirement. 2.1 Min-Sum Algorithm Based Column-layered Deoding Let C be a binary (N, K) LDPC ode speified by a parity-hek matrix H with M rows and N olumns. Eah row of the parity hek matrix is assoiated with a hek node, and eah olumn is assoiated with a variable node. Let N { v : H = 1} = v denote the set of variable nodes that partiipate in hek node, and M ( v) { : H = 1} denote the set of hek nodes assoiated to variable node v. Let I v = v denote the intrinsi message for variable node v and R v represent the hek-to-variable message onveyed from hek node to variable node v, and Lv represent the variable-to-hek message onveyed from variable node v to hek node. Assume that the N bits of a odeword are divided into G groups of the same size, N 0, N1, N G 1. Aordingly, the parity-hek matrix is divided into G blok olumns. The proposed olumn-layered deoding based on the Min-Sum algorithm is desribed with the pseudo ode below. Min-Sum-based olumn-layered deoding algorithm Initialization: L v = I v for v=0, 1,, N-1, =0, 1,, M-1; 4

5 Iterative deoding: For iter = 1, 2,, maximum iteration number { } For g=0, 1,, G-1 { } Horizontal Step: For eah hek node that is onneted to variable node Vertial Step: For eah variable node Hard deision and termination: Make hard deision by using the sign of v N g, updates Lv and L v R v v N g, omputes = sgn( Ln ) min L. (1) n N ( )\ v n n N ( )\ v as follows: α m M ( v) Rmv, (2) L v \ = Iv + Lv = Iv + α m M (v ) Rmv. (3) L v ; Terminate the deoding if a valid odeword is found. The optimum value of the saling fator α is around 0.8[Wrong!]0. For the onveniene of VLSI implementation, it is set as 0.75 in this work. Assume all variable nodes are divided into 4 groups (i.e., G = 4). The omputation flow of the olumnlayered deoding for one iteration is illustrated in Figure 1. (a). The shaded sub-matries indiate overage of omputation in deoding eah layer, where the omputation in (1) must be arried out for all blok olumns exept for the urrent updating blok olumn. Hene, the omputation omplexity of the original olumn-layered deoding sheme per iteration is many times more than that of the onventional TPMP sheme whether the MSA or SPA is used. 5

6 g = 0 g = 1 g = 2 g = 3 horizontal step vertial step ( a ) g = 0 g = 1 g = 2 g = 3 horizontal step vertial step ( b ) Figure 1. The omputation flow in an iteration of olumn-layered deoding, (a) original algorithm, (b) after algorithm reforumlation. 2.2 Low Complexity Deoding Sheme Algorithm Reformulation With a lose study of the omputation flow of the olumn-layered deoding, it an be observed that a signifiant amount of redundant omputation is performed in the original olumn-layered deoding algorithm. To improve the omputation effiieny, the omputation in (1) for onseutive olumn layers an be inrementally performed. In LDPC deoding, eah variable node sends a variable-to-hek message to every neighboring hek node. Assume that a hek node has d variable node neighbors, and hene reeives d soft messages from its neighboring variable nodes. To larify the main proedure of the reformulated olumn-layered deoding, we assume that eah hek node has only one variable node neighbor in eah blok olumn of the parity hek matrix. It should be noted that this onstraint is not indispensable for olumn-layered deoding in general. Let = m [ m m2... m d ] 1 be a sorted vetor of the magnitudes of the soft messages reeived by hek node in asending order. The supersript g indiates 6

7 the soft message was generated when the th g blok olumn is proessed. Similarly, let S be n N ( ) sign( Ln ) when the deoding for the layer g is ompleted. To redue the omputation omplexity of ( g 1) Min-Sum based olumn-layered deoding, m an be omputed from m ( g 1) 1. For eah variable node v in N g, remove the old L from m v in three steps. to obtain a temporary sorted ( g 1) vetor m. In addition, remove the old sign L ) from S ( v and obtain a temporary sign-produt ~ ( g ) S. Sine the smallest value, m 1, in m is min )\ ( L ) and S is sign ( Ln ), the n N ( v n n N ( )\ v value of R an be omputed as m1 v ~ ( 1) S g. Send the R v message to the variable node v. 2. Perform variable-to-hek message omputations for all variable nodes belonging to N g. The new Lv messages are sent bak to orresponding hek nodes. 3. For eah hek node, insert the updated Lv into m in a sorted order to obtain The reformulated olumn-layered deoding proedure is summarized as follows: m. The Reformulated Column-layered Deoding Initialization: Let L v = Iv for all variable nodes. For eah hek node, sort the magnitudes of the L v messages from its neighboring variable nodes. Compute the sign produt for eah hek node S = n N ) Iterative deoding: For iter = 1, 2,, maximum iteration number { For g=0, 1,, G-1 { Horizontal Step-A: for eah hek node that onnets to variable node v N, sign ( L. ( n) ompute m by removing the old L from m g, (4) v ( g 1) and R = m, where S = S old sign. (5) v ~ S g 1 L v Vertial Step: For eah variable node v, ompute L and L using (2) and (3). N g v v 7

8 } Horizontal Step-B: for eah hek node that onnets to variable node v N, ompute m using m and L, (6) ~ ) v v ( g) ( g) ( g and S = S sign( L ). (7) Hard deision and termination: Make hard deision by using the sign of L v ; Terminate deoding if a valid odeword is found or the maximum deoding iteration is reahed. } ~ ~ (0 (0) ) In the above algorithm, m and S in a deoding iteration are omputed from m G and G S, respetively, whih are obtained in the previous iteration. The new omputation flow for one full iteration is depited in Figure 1. (b) Simplifiation of Min-Sum Based Column-layered Deoding The algorithm reformulation presented above removes a large amount of redundant omputation in the original olumn-layered deoding sheme, and thus signifiantly redues the overall omputation omplexity. However, beause every vetor m ontains d values, the reformulated algorithm still requires a onsiderable amount of memory to store hek-to-variable messages. For row, only the two smallest values in the sorted vetor m are diretly involved in the message updating Step-A from layer g-1 to layer g. For example, if the smallest value in m g is from layer g in the previous iteration, it is removed from the vetor m g in horizontal Step-A. Then the seond smallest value is used as the magnitude of variable-to-hek message. In horizontal Step-B, m is omputed with the new value, L v, from variable node, and the pre-sorted vetor m from horizontal Step-A. The new value, L, ould take any index in m. If it takes a very small index, it has more hane to be used in Step-A of further omputation. Otherwise, it has muh less hane to be one of the two smallest values in the remaining omputation. In another word, though m ontains d values, most of the values in the end of the sorted v 8

9 vetor are less likely to be used as reliability information for message updating. Thus, it is reasonable to redue the length of the vetor m to further redue the implementation omplexity. Our simulations show that if the lengths of the vetors m and m are set to be 3, the deoding onvergene speed and performane have almost no degradation ompared to the standard Min-Sum based olumn-layered deoding. In suh ase, we name the sheme as three-min olumn-layered deoding. Similarly, the lengths of the vetors m and m an be set as 2 or even 1. Correspondingly, more penalties in onvergene speed and performane are expeted. In this approximation, m ontains three values in most ases. It may ontain two valid values if one value is removed from the vetor m g in horizontal Step-A. In the omputation of horizontal Step-B, at most three omparison operations are required to sort the new value Lv from variable node and the sorted values in m. If m ontains three valid values, the third omparison is needed to determine whether the Lv is the third smallest value or it should be disarded. Table I shows the average number of m updating events in one iteration for the deoding of a (4096, 3584) (4, 32) QC-LDPC ode at the SNR of 4.1dB. In one iteration, the total number of m updating events for a hek node is 32. A message updating event an be ategorized into one of three types. Type I, a value is removed from m g and then a new value is inserted into m. Type II, no value is removed from m g but m is updated. Type III, no value is removed from m g and L v is disarded. It an be seen from Table I that if no value is removed from m g, L v is muh more likely to be disarded than being the third smallest value in m. To further redue the deoding omplexity, the third omparison in horizontal Step-B is eliminated. In the modifiation, Lv is disarded if no value is removed from m g and Lv is larger than the seond value in m. This results in the simplified three-min olumn-layered deoding. In this ase, a 9

10 ( g 1) hek node requires 3 equal-omparisons to remove the old Lv from vetor m in horizontal Step-A and requires 2 regular omparisons to update m vetor with the new L in horizontal Step-B. The major differene between the three-min deoding and the simplified three-min deoding is that the three-min deoding requires 3 omparisons for sorting in horizontal Step-B. The simplified three-min requires 2 omparisons for sorting in horizontal Step-B. Our simulation shows that the additional approximation introdued by the simplified three-min deoding only auses very small performane loss. v Table I shows that a sorted vetor m only gets updated about 4 times in an iteration of the three-min deoding. On the ontrary, for TPMP or row-layered deoding, the sorting omputation to find the smallest and the seond smallest magnitudes for a hek node re-starts in every iteration. For the same LDPC ode, the average number of updating ativities for a sorted vetor is more than 16 per iteration. It leads to signifiant power savings for the three-min olumn-layered deoding in hek node message updating. TABLE I. THE AVERAGE NUMBER OF m UPDATING EVENTS IN ONE ITERATION ~ ( g) m ~ ( g) m = ~ ( g) m = ~ ( g) m = In the 3 rd iteration In the 6 th iteration ( g 1) m ( g 1) m, Lv being the smallest value in ( g 1) m, Lv being the seond smallest value in ( g 1) m, Lv being the third smallest value in m m m m = ~ ( g) m = ( g 1) m, Lv being disarded Pipelining of Column-layered Deoding Pipelining is a ommon pratie in VLSI implementation to inrease lok speed and thus to speed up data proessing throughput. In general, pipelining an only be applied to feed-forward data paths in order to maintain the original funtion of VLSI iruitry. In LDPC olumn-layered deoding algorithms, data dependeny exists between onseutive layers. Thus, the VLSI iruitry for hek node, variable node, and message memories forms a logi loop and pipelining an not be diretly applied to inrease the effetive 10

11 lok speed of LDPC deoders. In this setion, a relaxed pipelining sheme of olumn-layered LDPC deoding is proposed. In the original olumn-layered deoding, the hange between m g ( g 1+ P) and m is very small if the value of P is not large. Thus, m g ( g 1+ P) ( g P) an be used as the estimation of m. An approximation of R + v an be alulated from m g ( g 1+ P) before m are obtained. Then, it is immediately used for omputing ( g P) v L +. The approximation allows P lok yles to omplete the message omputation for eah layer. ( P) When L g+ ( g 1+ P) is obtained, the m is already available. Thus, the horizontal step for layer g+p an be v 1 ) undertaken with ( g + m P ( P) and L g+ ( g P). The updating method of m v + is the same as before. The pipelined olumn-layered deoding algorithm is formulated as the following. The Pipelined Column-layered Deoding Initialization: Let L v = Iv for all variable nodes. For eah hek node, sort the magnitudes of the L v messages from its neighboring variable nodes. Compute the sign produt for eah hek node = S n N ( sign ( L ) n). Iterative deoding: For iter = 1, 2,, maximum iteration number { For g=0, 1,, G-1 { ( P) Compute R g+ ( g ) : For eah hek node that onnets to variable node v N, ompute m v ~ P + by ( P) removing the old L g+ from m g ( P). Then, g ~ ( g+ P) ( g 1) S = v ( g P) L v S old sign( ) R + ( g ) = v ~ + P S m 1, where +. (8) ( P) Vertial Step: For eah variable node v N g + ( P), omputes L g+ ( P) and L g+ using (2) and (3). v v Horizontal Step-A: for eah hek node that onnets to variable node v N, ompute m by removing the old L from m g, (9) v Horizontal Step-B: for eah hek node that onnets to variable node N v, 11

12 } ompute m using m and L, (10) ~ ) v v ( g) ( g) ( g and S = S sign( L ). (11) Hard deision and termination: Make hard deision by using the sign of maximum deoding iteration is reahed. } L v ; Terminate deoding if a valid odeword is found or the Impat to the Complexity of VLSI Implementation For the standard Min-Sum based olumn-layered deoding, a hek node requires d 2 regular omparators to ompute the hek-to-variable message for a new olumn-layer. To exeute (1) and (2), the variable-to-hek messages orresponding to the whole H matrix must be stored. For the simplified threemin olumn-layered deoding algorithm, a hek node only requires 2 regular omparators. The magnitudes of variable-to-hek messages an be used on-the-fly and are not stored. Only the sign bits of variable-to-hek messages should be stored. For eah hek node, three magnitudes, three indies, and one sign bit need to be saved. For a (N, N-M) (4, 32) strutured LDPC ode, approximately, ( 30 2) / 30 = 93% of omputation an be redued. Assuming eah message is quantized as 4 bits, the number of memory bits needed by a hek node is 3 (3 + 5) + 1 = 25. The perentage of memory bits that an be saved for extrinsi messages is [( ) ] /(32 4 M ) = 55.4% M. Beause the proposed olumn-layered deoding approahes do not save the magnitude of variable-tohek messages, they are inherently memory-effiient. Equipped with the proposed deoding tehniques, an effiient olumn-layered deoder arhiteture for quasi-yli LDPC odes is developed with arhitetural and arithmeti optimizations. The details are provided in Setion IV. 12

13 3 PERFORMANCE SIMULATION To evaluate the deoding performane and onvergene speed of the proposed algorithms, three LDPC odes are simulated. Codes I and II are rate 1/2, (2304, 1152) LDPC ode and rate 5/6, (2304, 1920) LDPC ode, respetively, adopted in WiMax (802.16e) standard. Code III is a (4096, 3584) (4, 32)-regular QC-LDPC ode, whih is onstruted using the progressive edge growth method (PEG) [17][18]. The ode is also used to illustrate the VLSI arhiteture design in Setion 4. In eah simulation, at least 50 frame errors are observed. Figure 2. shows the frame error rates in deoding the WiMax rate-1/2 (2304, 1152) LDPC ode with four different layered deoding algorithms, i.e., 1) the row-layered deoding, 2) the original olumnlayered deoding, 3) the three-min olumn-layered deoding, and 4) the simplified three-min olumnlayered deoding. The maximum number of deoding iterations is set as 10 and 20, respetively, for all deoding approahes. It an be observed that all the four deoding algorithms have almost the same deoding performane with maximum of 20 iterations. With the maximum of 10 deoding iterations, only subtle performane differene an be observed. Figure 3. shows the frame error rates in deoding the WiMax rate-5/6 (2304, 1920) LDPC ode with maximum of 10 and 20 iterations, respetively. It shows again that the performane differenes among various layered deoding algorithms are negligible. 13

14 iterations Frame error rates iterations 10-4 Row-layered. Column-layered. 3-min olumn-layered. simplified 3-min olumn-layered Eb/No (db) Figure 2. Frame error rate of various layered deoding approahes in deoding WiMax rate-1/2 (2304, 1152) LDPC ode. 14

15 10-1 Frame error rate iterations Row-layered. Column-layered. 3-min olumn-layered. simplified 3-min olumn-layered Eb/No (db) 10 iterations Figure 3. Frame error rate of various layered deoding approahes in deoding WiMax rate-5/6 (2304, 1920) LDPC ode. Figure 4. shows the frame error rate (FER) in deoding the (4096, 3584) (4, 32) QC-LDPC ode. The number of olumns in eah olumn layer is 128. The maximum number of iteration is set as 10. It an be seen that the three-min olumn-layered deoding algorithm ahieves almost the same deoding performane as the standard Min-Sum based olumn-layered deoding algorithm. The simplified three-min approah has about 0.02dB performane loss. The pipelined three-min olumn-layered deoding methods introdued slight performane loss ompared to the non-pipelined three-min deoding algorithm. With two stages of pipelining, the pipelined deoding sheme has less than 0.02dB performane loss (ompared to whih??). 15

16 Figure 4. Frame error rate of various deoding approahes in deoding a (4096, 3584) (4, 32) QC-LDPC ode. 4 PROPOSED COLUMN-LAYERED DECODER ARCHITECTURES In this setion, an optimized QC-LDPC deoder arhiteture for a (4096, 3584) (4, 32)-regular QC- LDPC ode using the proposed pipelined olumn-layered deoding sheme is presented. The arhiteture effiiently enables very high deoding parallelism for layered deoding while having very short ritial path. In one lok yle, the messages orresponding to 4 irulant matries are proessed in parallel. Twostage pipelining is employed in order to improve the lok frequeny. In order to further redue the ritial path delay, we rearrange the additions in variable node unit to maximally take the advantage of arry-save addition. 16

17 4.1 Column-Layered Deoder Arhiteture for QC-LDPC Codes The parity hek matrix of the QC-LDPC ode to be onsidered in this work onsists of an array of 4 32 ylially shifted permutation matries of dimension The deoding steps disussed in Setion II.B are performed for one blok olumn at a time. That means all extrinsi messages orresponding to a blok olumn of the H matrix are proessed in parallel in one lok yle. The arhiteture of pipelined deoder with two pipeline stages is shown in Figure 5. Eah M omponent in an M-register array on the left side of Figure 5. represents a 25-bit register for storing a sorted vetor assoiated with a row of the H matrix. The Get Rv omponent performs (8) to get hek-to-variable messages for layer g+p, where P=2. The CNUa omponent (the portion of hek node unit for horizontal Step-A) performs (9) to remove the old variable-to-hek message assoiated to layer g from the sorted vetor. The CNUb omponent (the portion of hek node unit for horizontal Step-B) is utilized to perform (10) and (11). The total numbers of M-register, Get Rv, CNUa, and CNUb are all 512. The number of variable node unit (VNU) in the VNU array is 128. A VNU is required to perform the vertial step (2) and (3) for a variable node. The barrel shifters in the left side of the VNU array align the hek-to-variable message from row order to olumn order. Similarly, the barrel shifters in the right side of VNU array align the variable-to-hek message from olumn order to row order. It an be observed form Figure 5. that two types of loops are formed in olumn-layered deoding. The first type of loop onsists of a CNUa and a CNUb omponents. The seond type of loop is omposed of a Get Rv, a CNUb, a VNU, and two barrel shifters. With the pipelined olumn-layered deoding approah, the ritial path in the seond type loop is drastially redued when applying two-stage pipelining. The Get Rv omponent is needed beause its outputs are the hek-to-variable messages for layer g+p. On the ontrary, the outputs of CNUa are intermediate values for layer g. For non-pipelined deoding, the Get Rv omponent an be eliminated beause the output of CNUa ontains the hek-to-variable message for layer g. We only need one stage of barrel shifter at the input of CNUa array to minimize 17

18 ritial path. The onnetions among the array of CNUa, CNUb and VNU have no hange during the deoding. Barrel shifter Barrel shifter Barrel shifter Pipeline Pipeline Barrel shifter Figure 5. The top-level blok diagram of pipelined olumn-layered deoding. 4.2 Arhiteture of Chek Node Unit Figure 6. shows the struture of the CNU for the three-min deoding sheme. The sub-matries in a blok row of the H matrix are proessed one at a time. The CNU is omposed of two onatenated stages. 18

19 The first stage, CNUa, omputes the m, and the seond stage, CNUb, generates m. The data in Fig. 10 are assoiated with the vetors in the horizontal deoding step as follows: ( g 1) m = [min1_old, min2_old, min3_old], ~ ( g) m = [min1_temp, min2_temp, min3_temp], ( g) m = [min1_new, min2_new, min3_new], ( g 1) I = [idx1_old, idx2_old, idx3_old], ~ ( g ) = I [idx1_temp, idx2_temp, idx3_tmp], ( g) I = [idx1_new, idx2_new, idx3_new]. In eah M-register for a sorted vetor, three smallest magnitudes and their indies are stored. In the deoding of a olumn layer, if the olumn index is not in the vetor, the magnitudes and indies in the vetor are diretly passed through the selet-logi-a to the seond stage. Otherwise, min1_temp and min2_temp get values from the two remaining smallest magnitudes. The value of min3_temp beomes void. The temporary index values are determined in the same way. It is lear that min1_temp is the magnitude of the Rv message for the olumn layer in non-pipelined deoding. After new L v value is sent bak from VNU, it is used to ompute the new value for the sorted vetor. The struture of a Get Rv omponent is shown in Figure 6.. This omponent is not required for non-pipelined deoding. For the proposed simplified three-min deoding approah, the adder for the omputation of abs Lv ) min 3 _ temp ( is not needed. Instead, a simple three-input OR gate an be used to disable M-register update in the ondition shown in row 5 of Table I. The three inputs for the OR gate are the outputs of the three omparators in CNUa. It is shown in Fig. 3 that the removal of the adder results in less than 0.02dB performane loss. 19

20 Figure 6. (a) CNU arhiteture for the three-min olumn-layered deoding. (b) The struture of the Get Rv omponent. 4.3 Arhiteture of Optimized Variable Node Unit Shown in Figure 7. is the struture of an optimized VNU that an simultaneously proess 4 hek-tovariable messages. The addition operations are rearranged suh that the advantage of arry-save adder an be maximally taken. In the beginning of a VNU, the hek-to-variable messages in signed-magnitude format are onverted to 2 s omplement representation. In the data onversion for eah signed-magnitude number, the sign bit is not immediately added to the bitwise-not of lower bits. Instead, all sign bits are added through adder array. Then, eah two-bit sign-sum is sent to an adder in the seond addition stage for final summation. Beause eah adder in the seond and third stages has three inputs, it an be implemented using a arry-save adder and a regular binary adder. The right shift operations >>1 and >>2 are used for performing the salar multipliation of The above mentioned arithmeti optimizations aim to signifiantly redue the logi delay in a VNU. 20

21 Figure 7. The arhiteture of the optimized VNU. To illustrate the data flow of the optimized VNU, let us take the omputation of L 3 v as an example. Assume that R 4 v is a negative number and other inputs are positive numbers. The value before the shift of saling operation of L 3 is R R + bit _ inverse ( R ) + '0' + '0' '1' ). The ' 0' + '0' + '1' omputation is v ( v 2v v + performed by a 1-bit full adder in the adder array blok assoiated with L 3 v message. After the final shift and addition stage, the output of L 3 v is 0.75 ( R 1 v + R2v + R4v ) + Iv. 5 HARDWARE REQUIREMENT AND DECODING THROUGHPUT The pipelined olumn-layered deoder with two pipeline stages is modeled using Verilog RTL and synthesized using Fujitsu 0.13um standard library. The required hardware resoure and the synthesis result 21

22 are summarized in Table II. Eah intrinsi message is quantized as 4 bits. It takes 32 lok yles to ompute the initial sorted vetors and = 320 lok yles for 10 deoding iterations. The number of lok yles for pipeline lateny is 2. Let flk denote the lok frequeny of the deoder, the information (soure data) deoding throughput is f ( ) /( ). Thus, the deoder ahieves a deoding throughput of Gb/s at a lok speed of 388MHz. lk TABLE II. THE HARDWARE RESOURCE FOR (4096, 3584) (4, 32) QC-LDPC DECODERS VNU 128 CNU 512 Get_Rv 512 Intrinsi message (bits) = Sorted vetor (bits) = Signs of variable-to-hek messages = Area per VNU ( um ) Area per CNU ( um ) Area per Get_Rv ( um ) Clok Frequeny (MHz) Synthesis area ( mm ) Information deoding throughput (Gb/s) Table III ompares the proposed olumn-layered deoder with the state-of-the-art row-layered deoders. To mitigate the disrepany introdued from different implementation tehnologies, the area and lok speed of all these designs are saled to 65nm. For the implementation tehnology of 0.18um, 0.13um, and 90nm, the area saling down fator is set as 8, 4, and 2, respetively. The orresponding lok frequeny is saled up by , , and 1.25, respetively. The maximum deoding iteration is set as 22

23 10 for all deoders with layered deoding algorithms. For area of synthesis result, a saling fator of 1/0.7 is applied to approximate the layout area. TABLE III. DESIGN COMPARISONS WITH RECENT HIGH SPEED LDPC DECODERS This work [11] [14] [12] [13] Code (4096, 3584) (9600, 7200) n e, 2048-bits n Rate 7/8 3/4 1/2 ~5/6 1/2 ~ 5/6 1/2 ~ 7/8 Algorithm Column-layered Min-Sum Row-layered Min-Sum Row-layered Min-Sum Row-layered BP Row-layered BCJR LLR message quantization 4-bits 6-bits 5-bit - 4-bits Max. iteration Max deoding parallelism Tehnology 0.13um 65nm 0.18um 90nm 0.18um Clok frequeny(mhz) Area (synthesis) (synthesis) Max info. throughput (Gb/s) Clok frequeny (saled to 65nm) Layout area (saled to 65nm) Normalized info. throughput (saled to 65nm) (10 iterations) Considering the normalized deoding throughput, throughput/area ratio, and deoding performane among various designs, it an be onluded that the proposed simplified olumn-layered deoding algorithm and arhiteture have signifiant advantages in high throughput LDPC deoder implementation. 23

24 6 CONCLUSION In this paper, various tehniques have been explored to redue the omputation omplexity of the olumn-layered deoding. As a result, the proposed method an drastially redue the overall omputation omplexity of the original sheme while largely maintaining deoding performane and onvergene speed. In addition, a relaxed pipelining sheme has been shown to break the data dependeny between adjaent olumn layers, and thus enhane the lok speed. Combining all the proposed tehniques, a low-omplexity, high-speed LDPC deoder arhiteture for generi QC-LDPC odes has been developed and the implementation result for a speifi example has demonstrated that the proposed olumn-layered deoder arhiteture is very ompetitive to state-of-the-art row-layered LDPC deoder designs. REFERENCES [1] R. G. Gallager, Low-density parity-hek odes, IRE Trans. Inform. Theory, vol. IT-8, pp , Jan [2] J. Chen, A. Dholakia, E. Eleftheriou, M.P.C. Fossorier, X. Y. Hu, Redued-Complexity Deoding of LDPC Codes, IEEE Transations on Communiations, vol. 53, issue 8, pp , Aug [3] Z. Wang and Z. Cui, A memory effiient partially parallel deoder arhiteture for QC-LDPC odes, in Pro. of the 39th Asilomar Conferene on Signals, Systems & Computers, pp , [4] C. Lin, K. Lin, H. Chan, and C. Lee, A 3.33 Gb/s (1200, 720) low-density parity hek ode deoder, in Pro. 31st Eur. Solid-State Ciruits Conf., pp , Sep [5] D. Oh and K. K. Parhi Nonuniformly quantized Min-Sum deoder arhiteture for low-density parity-hek odes, in Pro. of the 18th ACM Great Lakes symposium on VLSI, pp , 2008 [6] X. Shih, C. Zhan, C. Lin, and A. Wu, An 8.29 mm2 52 mw Multi-Mode LDPC Deoder Design for Mobile WiMAX System in 0.13 µm CMOS Proess, IEEE Journal of Solid-State Ciruits, vol. 43, issue 3, pp , Marh [7] E. Sharon, S. Litsyn, and J. Goldberger, An effiient message-passing shedule for LDPC deoding, in Pro. of the 23rd IEEE Convention of Eletrial and Eletronis Engineers in Israel, pp , Sept.,

25 [8] D. E. Hoevar, A redued omplexity deoder arhiteture via layered deoding of LDPC odes, IEEE Workshop on Signal Proessing Systems, pp , [9] P. Radosavljevi, A. Baynast, and J. R. Cavallaro, Optimized Message Passing Shedules for LDPC Deoding, in Pro. of 39th Asilomar Conferene on Signals, Systems and Computers, pp , [10] J. Zhang, M.P.C. Fossorier, Shuffled iterative deoding, IEEE Transations on Communiations, vol. 53, Issue 2, pp , Feb [11] T. Brak, M. Alles, T. Lehnigk-Emden, F. Kienle, N. When, N. E. L Insalata, F. Rossi, M. Rovini, and L. Fanui, Low Complexity LDPC Code Deoders for Next Generation Standards, in Pro. of Design, Automation and Test in Europe (DATE '07), Apr [12] Y. Sun and J. R. Cavallaro, J.R, A low-power 1-Gbps reonfigurable LDPC deoder design for multiple 4G wireless standards, 2008 IEEE International SOC Conferene, pp , 2008 [13] M. M. Mansour and N. R. Shanbhag, A 640-Mb/s 2048-bit programmable LDPC deoder hip, IEEE Journal of Solid-State Ciruits, vol. 41, issue 3. pp , [14] C. Studer, N. Preyss, C. Roth, and A. Burg, Configurable high-throughput deoder arhiteture for quasiyli LDPC odes, 42nd Asilomar Conferene on Signals, Systems and Computers, pp , [15] A. Darabiha, A. C. Carusone, and F. R. Kshishang, Blok-interlaed LDPC deoders with redued interonnet omplexity, IEEE Transations on Ciruits and Systems II, vol. 55, no. 1, pp , Jan [16] T. Mohsenin, D. Truong, and B. Baas, Multi-Split-Row Threshold deoding implementations for LDPC odes, 2009 IEEE International Symposium on Ciruits and Systems, pp [17] X. -Y. Hu, E. Eleftheriou, and D. M. Arnold, Regular and irregular progressive edge-growth tanner graphs, IEEE Trans. on Info. Theory, vol. 51, issue 1, pp , Jan [18] Z. Li and B. V. K. V. Kumar, A lass of good quasi-yli low-density parity hek odes based on progressive edge growth graph, Thirty-Eighth Asilomar Conferene on Signals, Systems and Computers, vol. 2, pp ,

DECODING OF ARRAY LDPC CODES USING ON-THE FLY COMPUTATION Kiran Gunnam, Weihuang Wang, Euncheol Kim, Gwan Choi, Mark Yeary *

DECODING OF ARRAY LDPC CODES USING ON-THE FLY COMPUTATION Kiran Gunnam, Weihuang Wang, Euncheol Kim, Gwan Choi, Mark Yeary * DECODING OF ARRAY LDPC CODES USING ON-THE FLY COMPUTATION Kiran Gunnam, Weihuang Wang, Eunheol Kim, Gwan Choi, Mark Yeary * Dept. of Eletrial Engineering, Texas A&M University, College Station, TX-77840

More information

Pipelined Multipliers for Reconfigurable Hardware

Pipelined Multipliers for Reconfigurable Hardware Pipelined Multipliers for Reonfigurable Hardware Mithell J. Myjak and José G. Delgado-Frias Shool of Eletrial Engineering and Computer Siene, Washington State University Pullman, WA 99164-2752 USA {mmyjak,

More information

BER Evaluation of LDPC Decoder with BPSK Scheme in AWGN Fading Channel

BER Evaluation of LDPC Decoder with BPSK Scheme in AWGN Fading Channel I J C T A, 9(40), 2016, pp. 397-404 International Science Press ISSN: 0974-5572 BER Evaluation of LDPC Decoder with BPSK Scheme in AWGN Fading Channel Neha Mahankal*, Sandeep Kakde* and Atish Khobragade**

More information

Design of High Speed Mac Unit

Design of High Speed Mac Unit Design of High Speed Ma Unit 1 Harish Babu N, 2 Rajeev Pankaj N 1 PG Student, 2 Assistant professor Shools of Eletronis Engineering, VIT University, Vellore -632014, TamilNadu, India. 1 harishharsha72@gmail.om,

More information

A DYNAMIC ACCESS CONTROL WITH BINARY KEY-PAIR

A DYNAMIC ACCESS CONTROL WITH BINARY KEY-PAIR Malaysian Journal of Computer Siene, Vol 10 No 1, June 1997, pp 36-41 A DYNAMIC ACCESS CONTROL WITH BINARY KEY-PAIR Md Rafiqul Islam, Harihodin Selamat and Mohd Noor Md Sap Faulty of Computer Siene and

More information

On - Line Path Delay Fault Testing of Omega MINs M. Bellos 1, E. Kalligeros 1, D. Nikolos 1,2 & H. T. Vergos 1,2

On - Line Path Delay Fault Testing of Omega MINs M. Bellos 1, E. Kalligeros 1, D. Nikolos 1,2 & H. T. Vergos 1,2 On - Line Path Delay Fault Testing of Omega MINs M. Bellos, E. Kalligeros, D. Nikolos,2 & H. T. Vergos,2 Dept. of Computer Engineering and Informatis 2 Computer Tehnology Institute University of Patras,

More information

Parallel Block-Layered Nonbinary QC-LDPC Decoding on GPU

Parallel Block-Layered Nonbinary QC-LDPC Decoding on GPU Parallel Blok-Layered Nonbinary QC-LDPC Deoding on GPU Huyen Thi Pham, Sabooh Ajaz and Hanho Lee Department of Information and Communiation Engineering, Inha University, Inheon, 42-751, Korea Abstrat This

More information

Graph-Based vs Depth-Based Data Representation for Multiview Images

Graph-Based vs Depth-Based Data Representation for Multiview Images Graph-Based vs Depth-Based Data Representation for Multiview Images Thomas Maugey, Antonio Ortega, Pasal Frossard Signal Proessing Laboratory (LTS), Eole Polytehnique Fédérale de Lausanne (EPFL) Email:

More information

This fact makes it difficult to evaluate the cost function to be minimized

This fact makes it difficult to evaluate the cost function to be minimized RSOURC LLOCTION N SSINMNT In the resoure alloation step the amount of resoures required to exeute the different types of proesses is determined. We will refer to the time interval during whih a proess

More information

A Dual-Hamiltonian-Path-Based Multicasting Strategy for Wormhole-Routed Star Graph Interconnection Networks

A Dual-Hamiltonian-Path-Based Multicasting Strategy for Wormhole-Routed Star Graph Interconnection Networks A Dual-Hamiltonian-Path-Based Multiasting Strategy for Wormhole-Routed Star Graph Interonnetion Networks Nen-Chung Wang Department of Information and Communiation Engineering Chaoyang University of Tehnology,

More information

Partial Character Decoding for Improved Regular Expression Matching in FPGAs

Partial Character Decoding for Improved Regular Expression Matching in FPGAs Partial Charater Deoding for Improved Regular Expression Mathing in FPGAs Peter Sutton Shool of Information Tehnology and Eletrial Engineering The University of Queensland Brisbane, Queensland, 4072, Australia

More information

A {k, n}-secret Sharing Scheme for Color Images

A {k, n}-secret Sharing Scheme for Color Images A {k, n}-seret Sharing Sheme for Color Images Rastislav Luka, Konstantinos N. Plataniotis, and Anastasios N. Venetsanopoulos The Edward S. Rogers Sr. Dept. of Eletrial and Computer Engineering, University

More information

Abstract. Key Words: Image Filters, Fuzzy Filters, Order Statistics Filters, Rank Ordered Mean Filters, Channel Noise. 1.

Abstract. Key Words: Image Filters, Fuzzy Filters, Order Statistics Filters, Rank Ordered Mean Filters, Channel Noise. 1. Fuzzy Weighted Rank Ordered Mean (FWROM) Filters for Mixed Noise Suppression from Images S. Meher, G. Panda, B. Majhi 3, M.R. Meher 4,,4 Department of Eletronis and I.E., National Institute of Tehnology,

More information

Folding. Hardware Mapped vs. Time multiplexed. Folding by N (N=folding factor) Node A. Unfolding by J A 1 A J-1. Time multiplexed/microcoded

Folding. Hardware Mapped vs. Time multiplexed. Folding by N (N=folding factor) Node A. Unfolding by J A 1 A J-1. Time multiplexed/microcoded Folding is verse of Unfolding Node A A Folding by N (N=folding fator) Folding A Unfolding by J A A J- Hardware Mapped vs. Time multiplexed l Hardware Mapped vs. Time multiplexed/mirooded FI : y x(n) h

More information

The Minimum Redundancy Maximum Relevance Approach to Building Sparse Support Vector Machines

The Minimum Redundancy Maximum Relevance Approach to Building Sparse Support Vector Machines The Minimum Redundany Maximum Relevane Approah to Building Sparse Support Vetor Mahines Xiaoxing Yang, Ke Tang, and Xin Yao, Nature Inspired Computation and Appliations Laboratory (NICAL), Shool of Computer

More information

A Novel Validity Index for Determination of the Optimal Number of Clusters

A Novel Validity Index for Determination of the Optimal Number of Clusters IEICE TRANS. INF. & SYST., VOL.E84 D, NO.2 FEBRUARY 2001 281 LETTER A Novel Validity Index for Determination of the Optimal Number of Clusters Do-Jong KIM, Yong-Woon PARK, and Dong-Jo PARK, Nonmembers

More information

Cross-layer Resource Allocation on Broadband Power Line Based on Novel QoS-priority Scheduling Function in MAC Layer

Cross-layer Resource Allocation on Broadband Power Line Based on Novel QoS-priority Scheduling Function in MAC Layer Communiations and Networ, 2013, 5, 69-73 http://dx.doi.org/10.4236/n.2013.53b2014 Published Online September 2013 (http://www.sirp.org/journal/n) Cross-layer Resoure Alloation on Broadband Power Line Based

More information

The AMDREL Project in Retrospective

The AMDREL Project in Retrospective The AMDREL Projet in Retrospetive K. Siozios 1, G. Koutroumpezis 1, K. Tatas 1, N. Vassiliadis 2, V. Kalenteridis 2, H. Pournara 2, I. Pappas 2, D. Soudris 1, S. Nikolaidis 2, S. Siskos 2, and A. Thanailakis

More information

Analysis of input and output configurations for use in four-valued CCD programmable logic arrays

Analysis of input and output configurations for use in four-valued CCD programmable logic arrays nalysis of input and output onfigurations for use in four-valued D programmable logi arrays J.T. utler H.G. Kerkhoff ndexing terms: Logi, iruit theory and design, harge-oupled devies bstrat: s in binary,

More information

High-level synthesis under I/O Timing and Memory constraints

High-level synthesis under I/O Timing and Memory constraints Highlevel synthesis under I/O Timing and Memory onstraints Philippe Coussy, Gwenolé Corre, Pierre Bomel, Eri Senn, Eri Martin To ite this version: Philippe Coussy, Gwenolé Corre, Pierre Bomel, Eri Senn,

More information

A Partial Sorting Algorithm in Multi-Hop Wireless Sensor Networks

A Partial Sorting Algorithm in Multi-Hop Wireless Sensor Networks A Partial Sorting Algorithm in Multi-Hop Wireless Sensor Networks Abouberine Ould Cheikhna Department of Computer Siene University of Piardie Jules Verne 80039 Amiens Frane Ould.heikhna.abouberine @u-piardie.fr

More information

System-Level Parallelism and Throughput Optimization in Designing Reconfigurable Computing Applications

System-Level Parallelism and Throughput Optimization in Designing Reconfigurable Computing Applications System-Level Parallelism and hroughput Optimization in Designing Reonfigurable Computing Appliations Esam El-Araby 1, Mohamed aher 1, Kris Gaj 2, arek El-Ghazawi 1, David Caliga 3, and Nikitas Alexandridis

More information

We don t need no generation - a practical approach to sliding window RLNC

We don t need no generation - a practical approach to sliding window RLNC We don t need no generation - a pratial approah to sliding window RLNC Simon Wunderlih, Frank Gabriel, Sreekrishna Pandi, Frank H.P. Fitzek Deutshe Telekom Chair of Communiation Networks, TU Dresden, Dresden,

More information

Approximate logic synthesis for error tolerant applications

Approximate logic synthesis for error tolerant applications Approximate logi synthesis for error tolerant appliations Doohul Shin and Sandeep K. Gupta Eletrial Engineering Department, University of Southern California, Los Angeles, CA 989 {doohuls, sandeep}@us.edu

More information

Learning Convention Propagation in BeerAdvocate Reviews from a etwork Perspective. Abstract

Learning Convention Propagation in BeerAdvocate Reviews from a etwork Perspective. Abstract CS 9 Projet Final Report: Learning Convention Propagation in BeerAdvoate Reviews from a etwork Perspetive Abstrat We look at the way onventions propagate between reviews on the BeerAdvoate dataset, and

More information

HEXA: Compact Data Structures for Faster Packet Processing

HEXA: Compact Data Structures for Faster Packet Processing Washington University in St. Louis Washington University Open Sholarship All Computer Siene and Engineering Researh Computer Siene and Engineering Report Number: 27-26 27 HEXA: Compat Data Strutures for

More information

Dr.Hazeem Al-Khafaji Dept. of Computer Science, Thi-Qar University, College of Science, Iraq

Dr.Hazeem Al-Khafaji Dept. of Computer Science, Thi-Qar University, College of Science, Iraq Volume 4 Issue 6 June 014 ISSN: 77 18X International Journal of Advaned Researh in Computer Siene and Software Engineering Researh Paper Available online at: www.ijarsse.om Medial Image Compression using

More information

Alleviating DFT cost using testability driven HLS

Alleviating DFT cost using testability driven HLS Alleviating DFT ost using testability driven HLS M.L.Flottes, R.Pires, B.Rouzeyre Laboratoire d Informatique, de Robotique et de Miroéletronique de Montpellier, U.M. CNRS 5506 6 rue Ada, 34392 Montpellier

More information

A Load-Balanced Clustering Protocol for Hierarchical Wireless Sensor Networks

A Load-Balanced Clustering Protocol for Hierarchical Wireless Sensor Networks International Journal of Advanes in Computer Networks and Its Seurity IJCNS A Load-Balaned Clustering Protool for Hierarhial Wireless Sensor Networks Mehdi Tarhani, Yousef S. Kavian, Saman Siavoshi, Ali

More information

Gray Codes for Reflectable Languages

Gray Codes for Reflectable Languages Gray Codes for Refletable Languages Yue Li Joe Sawada Marh 8, 2008 Abstrat We lassify a type of language alled a refletable language. We then develop a generi algorithm that an be used to list all strings

More information

COST PERFORMANCE ASPECTS OF CCD FAST AUXILIARY MEMORY

COST PERFORMANCE ASPECTS OF CCD FAST AUXILIARY MEMORY COST PERFORMANCE ASPECTS OF CCD FAST AUXILIARY MEMORY Dileep P, Bhondarkor Texas Instruments Inorporated Dallas, Texas ABSTRACT Charge oupled devies (CCD's) hove been mentioned as potential fast auxiliary

More information

Outline: Software Design

Outline: Software Design Outline: Software Design. Goals History of software design ideas Design priniples Design methods Life belt or leg iron? (Budgen) Copyright Nany Leveson, Sept. 1999 A Little History... At first, struggling

More information

Cluster-based Cooperative Communication with Network Coding in Wireless Networks

Cluster-based Cooperative Communication with Network Coding in Wireless Networks Cluster-based Cooperative Communiation with Network Coding in Wireless Networks Zygmunt J. Haas Shool of Eletrial and Computer Engineering Cornell University Ithaa, NY 4850, U.S.A. Email: haas@ee.ornell.edu

More information

Plot-to-track correlation in A-SMGCS using the target images from a Surface Movement Radar

Plot-to-track correlation in A-SMGCS using the target images from a Surface Movement Radar Plot-to-trak orrelation in A-SMGCS using the target images from a Surfae Movement Radar G. Golino Radar & ehnology Division AMS, Italy ggolino@amsjv.it Abstrat he main topi of this paper is the formulation

More information

Multi-Channel Wireless Networks: Capacity and Protocols

Multi-Channel Wireless Networks: Capacity and Protocols Multi-Channel Wireless Networks: Capaity and Protools Tehnial Report April 2005 Pradeep Kyasanur Dept. of Computer Siene, and Coordinated Siene Laboratory, University of Illinois at Urbana-Champaign Email:

More information

Methods for Multi-Dimensional Robustness Optimization in Complex Embedded Systems

Methods for Multi-Dimensional Robustness Optimization in Complex Embedded Systems Methods for Multi-Dimensional Robustness Optimization in Complex Embedded Systems Arne Hamann, Razvan Rau, Rolf Ernst Institute of Computer and Communiation Network Engineering Tehnial University of Braunshweig,

More information

Multiple Assignments

Multiple Assignments Two Outputs Conneted Together Multiple Assignments Two Outputs Conneted Together if (En1) Q

More information

Improved Circuit-to-CNF Transformation for SAT-based ATPG

Improved Circuit-to-CNF Transformation for SAT-based ATPG Improved Ciruit-to-CNF Transformation for SAT-based ATPG Daniel Tille 1 René Krenz-Bååth 2 Juergen Shloeffel 2 Rolf Drehsler 1 1 Institute of Computer Siene, University of Bremen, 28359 Bremen, Germany

More information

New Fuzzy Object Segmentation Algorithm for Video Sequences *

New Fuzzy Object Segmentation Algorithm for Video Sequences * JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 24, 521-537 (2008) New Fuzzy Obet Segmentation Algorithm for Video Sequenes * KUO-LIANG CHUNG, SHIH-WEI YU, HSUEH-JU YEH, YONG-HUAI HUANG AND TA-JEN YAO Department

More information

Acoustic Links. Maximizing Channel Utilization for Underwater

Acoustic Links. Maximizing Channel Utilization for Underwater Maximizing Channel Utilization for Underwater Aousti Links Albert F Hairris III Davide G. B. Meneghetti Adihele Zorzi Department of Information Engineering University of Padova, Italy Email: {harris,davide.meneghetti,zorzi}@dei.unipd.it

More information

arxiv: v1 [cs.db] 13 Sep 2017

arxiv: v1 [cs.db] 13 Sep 2017 An effiient lustering algorithm from the measure of loal Gaussian distribution Yuan-Yen Tai (Dated: May 27, 2018) In this paper, I will introdue a fast and novel lustering algorithm based on Gaussian distribution

More information

Implementing Load-Balanced Switches With Fat-Tree Networks

Implementing Load-Balanced Switches With Fat-Tree Networks Implementing Load-Balaned Swithes With Fat-Tree Networks Hung-Shih Chueh, Ching-Min Lien, Cheng-Shang Chang, Jay Cheng, and Duan-Shin Lee Department of Eletrial Engineering & Institute of Communiations

More information

Flow Demands Oriented Node Placement in Multi-Hop Wireless Networks

Flow Demands Oriented Node Placement in Multi-Hop Wireless Networks Flow Demands Oriented Node Plaement in Multi-Hop Wireless Networks Zimu Yuan Institute of Computing Tehnology, CAS, China {zimu.yuan}@gmail.om arxiv:153.8396v1 [s.ni] 29 Mar 215 Abstrat In multi-hop wireless

More information

3-D IMAGE MODELS AND COMPRESSION - SYNTHETIC HYBRID OR NATURAL FIT?

3-D IMAGE MODELS AND COMPRESSION - SYNTHETIC HYBRID OR NATURAL FIT? 3-D IMAGE MODELS AND COMPRESSION - SYNTHETIC HYBRID OR NATURAL FIT? Bernd Girod, Peter Eisert, Marus Magnor, Ekehard Steinbah, Thomas Wiegand Te {girod eommuniations Laboratory, University of Erlangen-Nuremberg

More information

Extracting Partition Statistics from Semistructured Data

Extracting Partition Statistics from Semistructured Data Extrating Partition Statistis from Semistrutured Data John N. Wilson Rihard Gourlay Robert Japp Mathias Neumüller Department of Computer and Information Sienes University of Strathlyde, Glasgow, UK {jnw,rsg,rpj,mathias}@is.strath.a.uk

More information

An Optimized Approach on Applying Genetic Algorithm to Adaptive Cluster Validity Index

An Optimized Approach on Applying Genetic Algorithm to Adaptive Cluster Validity Index IJCSES International Journal of Computer Sienes and Engineering Systems, ol., No.4, Otober 2007 CSES International 2007 ISSN 0973-4406 253 An Optimized Approah on Applying Geneti Algorithm to Adaptive

More information

Unsupervised Stereoscopic Video Object Segmentation Based on Active Contours and Retrainable Neural Networks

Unsupervised Stereoscopic Video Object Segmentation Based on Active Contours and Retrainable Neural Networks Unsupervised Stereosopi Video Objet Segmentation Based on Ative Contours and Retrainable Neural Networks KLIMIS NTALIANIS, ANASTASIOS DOULAMIS, and NIKOLAOS DOULAMIS National Tehnial University of Athens

More information

Low Complexity Quasi-Cyclic LDPC Decoder Architecture for IEEE n

Low Complexity Quasi-Cyclic LDPC Decoder Architecture for IEEE n Low Complexity Quasi-Cyclic LDPC Decoder Architecture for IEEE 802.11n Sherif Abou Zied 1, Ahmed Tarek Sayed 1, and Rafik Guindi 2 1 Varkon Semiconductors, Cairo, Egypt 2 Nile University, Giza, Egypt Abstract

More information

Boosted Random Forest

Boosted Random Forest Boosted Random Forest Yohei Mishina, Masamitsu suhiya and Hironobu Fujiyoshi Department of Computer Siene, Chubu University, 1200 Matsumoto-ho, Kasugai, Aihi, Japan {mishi, mtdoll}@vision.s.hubu.a.jp,

More information

Algorithms, Mechanisms and Procedures for the Computer-aided Project Generation System

Algorithms, Mechanisms and Procedures for the Computer-aided Project Generation System Algorithms, Mehanisms and Proedures for the Computer-aided Projet Generation System Anton O. Butko 1*, Aleksandr P. Briukhovetskii 2, Dmitry E. Grigoriev 2# and Konstantin S. Kalashnikov 3 1 Department

More information

Performance of Histogram-Based Skin Colour Segmentation for Arms Detection in Human Motion Analysis Application

Performance of Histogram-Based Skin Colour Segmentation for Arms Detection in Human Motion Analysis Application World Aademy of Siene, Engineering and Tehnology 8 009 Performane of Histogram-Based Skin Colour Segmentation for Arms Detetion in Human Motion Analysis Appliation Rosalyn R. Porle, Ali Chekima, Farrah

More information

-z c = c T - c T B B-1 A 1 - c T B B-1 b. x B B -1 A 0 B -1 b. (a) (b) Figure 1. Simplex Tableau in Matrix Form

-z c = c T - c T B B-1 A 1 - c T B B-1 b. x B B -1 A 0 B -1 b. (a) (b) Figure 1. Simplex Tableau in Matrix Form 3. he Revised Simple Method he LP min, s.t. A = b ( ),, an be represented by Figure (a) below. At any Simple step, with known and -, the Simple tableau an be represented by Figure (b) below. he minimum

More information

Direct-Mapped Caches

Direct-Mapped Caches A Case for Diret-Mapped Cahes Mark D. Hill University of Wisonsin ahe is a small, fast buffer in whih a system keeps those parts, of the ontents of a larger, slower memory that are likely to be used soon.

More information

Algorithms for External Memory Lecture 6 Graph Algorithms - Weighted List Ranking

Algorithms for External Memory Lecture 6 Graph Algorithms - Weighted List Ranking Algorithms for External Memory Leture 6 Graph Algorithms - Weighted List Ranking Leturer: Nodari Sithinava Sribe: Andi Hellmund, Simon Ohsenreither 1 Introdution & Motivation After talking about I/O-effiient

More information

A Dictionary based Efficient Text Compression Technique using Replacement Strategy

A Dictionary based Efficient Text Compression Technique using Replacement Strategy A based Effiient Text Compression Tehnique using Replaement Strategy Debashis Chakraborty Assistant Professor, Department of CSE, St. Thomas College of Engineering and Tehnology, Kolkata, 700023, India

More information

Divide-and-conquer algorithms 1

Divide-and-conquer algorithms 1 * 1 Multipliation Divide-and-onquer algorithms 1 The mathematiian Gauss one notied that although the produt of two omplex numbers seems to! involve four real-number multipliations it an in fat be done

More information

SVC-DASH-M: Scalable Video Coding Dynamic Adaptive Streaming Over HTTP Using Multiple Connections

SVC-DASH-M: Scalable Video Coding Dynamic Adaptive Streaming Over HTTP Using Multiple Connections SVC-DASH-M: Salable Video Coding Dynami Adaptive Streaming Over HTTP Using Multiple Connetions Samar Ibrahim, Ahmed H. Zahran and Mahmoud H. Ismail Department of Eletronis and Eletrial Communiations, Faulty

More information

And, the (low-pass) Butterworth filter of order m is given in the frequency domain by

And, the (low-pass) Butterworth filter of order m is given in the frequency domain by Problem Set no.3.a) The ideal low-pass filter is given in the frequeny domain by B ideal ( f ), f f; =, f > f. () And, the (low-pass) Butterworth filter of order m is given in the frequeny domain by B

More information

CleanUp: Improving Quadrilateral Finite Element Meshes

CleanUp: Improving Quadrilateral Finite Element Meshes CleanUp: Improving Quadrilateral Finite Element Meshes Paul Kinney MD-10 ECC P.O. Box 203 Ford Motor Company Dearborn, MI. 8121 (313) 28-1228 pkinney@ford.om Abstrat: Unless an all quadrilateral (quad)

More information

Detection and Recognition of Non-Occluded Objects using Signature Map

Detection and Recognition of Non-Occluded Objects using Signature Map 6th WSEAS International Conferene on CIRCUITS, SYSTEMS, ELECTRONICS,CONTROL & SIGNAL PROCESSING, Cairo, Egypt, De 9-31, 007 65 Detetion and Reognition of Non-Oluded Objets using Signature Map Sangbum Park,

More information

Capturing Large Intra-class Variations of Biometric Data by Template Co-updating

Capturing Large Intra-class Variations of Biometric Data by Template Co-updating Capturing Large Intra-lass Variations of Biometri Data by Template Co-updating Ajita Rattani University of Cagliari Piazza d'armi, Cagliari, Italy ajita.rattani@diee.unia.it Gian Lua Marialis University

More information

Cluster-Based Cumulative Ensembles

Cluster-Based Cumulative Ensembles Cluster-Based Cumulative Ensembles Hanan G. Ayad and Mohamed S. Kamel Pattern Analysis and Mahine Intelligene Lab, Eletrial and Computer Engineering, University of Waterloo, Waterloo, Ontario N2L 3G1,

More information

Definitions Homework. Quine McCluskey Optimal solutions are possible for some large functions Espresso heuristic. Definitions Homework

Definitions Homework. Quine McCluskey Optimal solutions are possible for some large functions Espresso heuristic. Definitions Homework EECS 33 There be Dragons here http://ziyang.ees.northwestern.edu/ees33/ Teaher: Offie: Email: Phone: L477 Teh dikrp@northwestern.edu 847 467 2298 Today s material might at first appear diffiult Perhaps

More information

Multi-Piece Mold Design Based on Linear Mixed-Integer Program Toward Guaranteed Optimality

Multi-Piece Mold Design Based on Linear Mixed-Integer Program Toward Guaranteed Optimality INTERNATIONAL CONFERENCE ON MANUFACTURING AUTOMATION (ICMA200) Multi-Piee Mold Design Based on Linear Mixed-Integer Program Toward Guaranteed Optimality Stephen Stoyan, Yong Chen* Epstein Department of

More information

Using Game Theory and Bayesian Networks to Optimize Cooperation in Ad Hoc Wireless Networks

Using Game Theory and Bayesian Networks to Optimize Cooperation in Ad Hoc Wireless Networks Using Game Theory and Bayesian Networks to Optimize Cooperation in Ad Ho Wireless Networks Giorgio Quer, Federio Librino, Lua Canzian, Leonardo Badia, Mihele Zorzi, University of California San Diego La

More information

Improved flooding of broadcast messages using extended multipoint relaying

Improved flooding of broadcast messages using extended multipoint relaying Improved flooding of broadast messages using extended multipoint relaying Pere Montolio Aranda a, Joaquin Garia-Alfaro a,b, David Megías a a Universitat Oberta de Catalunya, Estudis d Informàtia, Mulimèdia

More information

Multi-hop Fast Conflict Resolution Algorithm for Ad Hoc Networks

Multi-hop Fast Conflict Resolution Algorithm for Ad Hoc Networks Multi-hop Fast Conflit Resolution Algorithm for Ad Ho Networks Shengwei Wang 1, Jun Liu 2,*, Wei Cai 2, Minghao Yin 2, Lingyun Zhou 2, and Hui Hao 3 1 Power Emergeny Center, Sihuan Eletri Power Corporation,

More information

Accommodations of QoS DiffServ Over IP and MPLS Networks

Accommodations of QoS DiffServ Over IP and MPLS Networks Aommodations of QoS DiffServ Over IP and MPLS Networks Abdullah AlWehaibi, Anjali Agarwal, Mihael Kadoh and Ahmed ElHakeem Department of Eletrial and Computer Department de Genie Eletrique Engineering

More information

Automatic Physical Design Tuning: Workload as a Sequence Sanjay Agrawal Microsoft Research One Microsoft Way Redmond, WA, USA +1-(425)

Automatic Physical Design Tuning: Workload as a Sequence Sanjay Agrawal Microsoft Research One Microsoft Way Redmond, WA, USA +1-(425) Automati Physial Design Tuning: Workload as a Sequene Sanjay Agrawal Mirosoft Researh One Mirosoft Way Redmond, WA, USA +1-(425) 75-357 sagrawal@mirosoft.om Eri Chu * Computer Sienes Department University

More information

NONLINEAR BACK PROJECTION FOR TOMOGRAPHIC IMAGE RECONSTRUCTION. Ken Sauer and Charles A. Bouman

NONLINEAR BACK PROJECTION FOR TOMOGRAPHIC IMAGE RECONSTRUCTION. Ken Sauer and Charles A. Bouman NONLINEAR BACK PROJECTION FOR TOMOGRAPHIC IMAGE RECONSTRUCTION Ken Sauer and Charles A. Bouman Department of Eletrial Engineering, University of Notre Dame Notre Dame, IN 46556, (219) 631-6999 Shool of

More information

Multiple-Criteria Decision Analysis: A Novel Rank Aggregation Method

Multiple-Criteria Decision Analysis: A Novel Rank Aggregation Method 3537 Multiple-Criteria Deision Analysis: A Novel Rank Aggregation Method Derya Yiltas-Kaplan Department of Computer Engineering, Istanbul University, 34320, Avilar, Istanbul, Turkey Email: dyiltas@ istanbul.edu.tr

More information

Drawing lines. Naïve line drawing algorithm. drawpixel(x, round(y)); double dy = y1 - y0; double dx = x1 - x0; double m = dy / dx; double y = y0;

Drawing lines. Naïve line drawing algorithm. drawpixel(x, round(y)); double dy = y1 - y0; double dx = x1 - x0; double m = dy / dx; double y = y0; Naïve line drawing algorithm // Connet to grid points(x0,y0) and // (x1,y1) by a line. void drawline(int x0, int y0, int x1, int y1) { int x; double dy = y1 - y0; double dx = x1 - x0; double m = dy / dx;

More information

Particle Swarm Optimization for the Design of High Diffraction Efficient Holographic Grating

Particle Swarm Optimization for the Design of High Diffraction Efficient Holographic Grating Original Artile Partile Swarm Optimization for the Design of High Diffration Effiient Holographi Grating A.K. Tripathy 1, S.K. Das, M. Sundaray 3 and S.K. Tripathy* 4 1, Department of Computer Siene, Berhampur

More information

A PROTOTYPE OF INTELLIGENT VIDEO SURVEILLANCE CAMERAS

A PROTOTYPE OF INTELLIGENT VIDEO SURVEILLANCE CAMERAS INTERNATIONAL JOURNAL OF INFORMATION AND SYSTEMS SCIENCES Volume 1, 3, Number 1, 3, Pages 1-22 365-382 2007 Institute for Sientifi Computing and Information A PROTOTYPE OF INTELLIGENT VIDEO SURVEILLANCE

More information

Gradient based progressive probabilistic Hough transform

Gradient based progressive probabilistic Hough transform Gradient based progressive probabilisti Hough transform C.Galambos, J.Kittler and J.Matas Abstrat: The authors look at the benefits of exploiting gradient information to enhane the progressive probabilisti

More information

INTERPOLATED AND WARPED 2-D DIGITAL WAVEGUIDE MESH ALGORITHMS

INTERPOLATED AND WARPED 2-D DIGITAL WAVEGUIDE MESH ALGORITHMS Proeedings of the COST G-6 Conferene on Digital Audio Effets (DAFX-), Verona, Italy, Deember 7-9, INTERPOLATED AND WARPED -D DIGITAL WAVEGUIDE MESH ALGORITHMS Vesa Välimäki Lab. of Aoustis and Audio Signal

More information

KERNEL SPARSE REPRESENTATION WITH LOCAL PATTERNS FOR FACE RECOGNITION

KERNEL SPARSE REPRESENTATION WITH LOCAL PATTERNS FOR FACE RECOGNITION KERNEL SPARSE REPRESENTATION WITH LOCAL PATTERNS FOR FACE RECOGNITION Cuiui Kang 1, Shengai Liao, Shiming Xiang 1, Chunhong Pan 1 1 National Laboratory of Pattern Reognition, Institute of Automation, Chinese

More information

Contents Contents...I List of Tables...VIII List of Figures...IX 1. Introduction Information Retrieval... 8

Contents Contents...I List of Tables...VIII List of Figures...IX 1. Introduction Information Retrieval... 8 Contents Contents...I List of Tables...VIII List of Figures...IX 1. Introdution... 1 1.1. Internet Information...2 1.2. Internet Information Retrieval...3 1.2.1. Doument Indexing...4 1.2.2. Doument Retrieval...4

More information

Chapter 2: Introduction to Maple V

Chapter 2: Introduction to Maple V Chapter 2: Introdution to Maple V 2-1 Working with Maple Worksheets Try It! (p. 15) Start a Maple session with an empty worksheet. The name of the worksheet should be Untitled (1). Use one of the standard

More information

Robust Dynamic Provable Data Possession

Robust Dynamic Provable Data Possession Robust Dynami Provable Data Possession Bo Chen Reza Curtmola Department of Computer Siene New Jersey Institute of Tehnology Newark, USA Email: b47@njit.edu, rix@njit.edu Abstrat Remote Data Cheking (RDC)

More information

Background/Review on Numbers and Computers (lecture)

Background/Review on Numbers and Computers (lecture) Bakground/Review on Numbers and Computers (leture) ICS312 Mahine-Level and Systems Programming Henri Casanova (henri@hawaii.edu) Numbers and Computers Throughout this ourse we will use binary and hexadeimal

More information

13.1 Numerical Evaluation of Integrals Over One Dimension

13.1 Numerical Evaluation of Integrals Over One Dimension 13.1 Numerial Evaluation of Integrals Over One Dimension A. Purpose This olletion of subprograms estimates the value of the integral b a f(x) dx where the integrand f(x) and the limits a and b are supplied

More information

A Coarse-to-Fine Classification Scheme for Facial Expression Recognition

A Coarse-to-Fine Classification Scheme for Facial Expression Recognition A Coarse-to-Fine Classifiation Sheme for Faial Expression Reognition Xiaoyi Feng 1,, Abdenour Hadid 1 and Matti Pietikäinen 1 1 Mahine Vision Group Infoteh Oulu and Dept. of Eletrial and Information Engineering

More information

Interconnect Delay Minimization through Interlayer Via Placement in 3-D ICs

Interconnect Delay Minimization through Interlayer Via Placement in 3-D ICs Interonnet Delay Minimization through Interlayer Via Plaement in -D ICs Vasilis F. Pavlidis, Eby G. Friedman Department of Eletrial and Computer Engineering University of Rohester Rohester, New York 467,

More information

Department of Electrical and Computer Engineering University of Wisconsin Madison. Fall

Department of Electrical and Computer Engineering University of Wisconsin Madison. Fall Department of Eletrial and Computer Engineering University of Wisonsin Madison ECE 553: Testing and Testable Design of Digital Systems Fall 2014-2015 Assignment #2 Date Tuesday, September 25, 2014 Due

More information

Compressed Sensing mm-wave SAR for Non-Destructive Testing Applications using Side Information

Compressed Sensing mm-wave SAR for Non-Destructive Testing Applications using Side Information Compressed Sensing mm-wave SAR for Non-Destrutive Testing Appliations using Side Information Mathias Bequaert 1,2, Edison Cristofani 1,2, Gokarna Pandey 2, Marijke Vandewal 1, Johan Stiens 2,3 and Nikos

More information

Self-Adaptive Parent to Mean-Centric Recombination for Real-Parameter Optimization

Self-Adaptive Parent to Mean-Centric Recombination for Real-Parameter Optimization Self-Adaptive Parent to Mean-Centri Reombination for Real-Parameter Optimization Kalyanmoy Deb and Himanshu Jain Department of Mehanial Engineering Indian Institute of Tehnology Kanpur Kanpur, PIN 86 {deb,hjain}@iitk.a.in

More information

Exploring the Commonality in Feature Modeling Notations

Exploring the Commonality in Feature Modeling Notations Exploring the Commonality in Feature Modeling Notations Miloslav ŠÍPKA Slovak University of Tehnology Faulty of Informatis and Information Tehnologies Ilkovičova 3, 842 16 Bratislava, Slovakia miloslav.sipka@gmail.om

More information

What are Cycle-Stealing Systems Good For? A Detailed Performance Model Case Study

What are Cycle-Stealing Systems Good For? A Detailed Performance Model Case Study What are Cyle-Stealing Systems Good For? A Detailed Performane Model Case Study Wayne Kelly and Jiro Sumitomo Queensland University of Tehnology, Australia {w.kelly, j.sumitomo}@qut.edu.au Abstrat The

More information

THROUGHPUT EVALUATION OF AN ASYMMETRICAL FDDI TOKEN RING NETWORK WITH MULTIPLE CLASSES OF TRAFFIC

THROUGHPUT EVALUATION OF AN ASYMMETRICAL FDDI TOKEN RING NETWORK WITH MULTIPLE CLASSES OF TRAFFIC THROUGHPUT EVALUATION OF AN ASYMMETRICAL FDDI TOKEN RING NETWORK WITH MULTIPLE CLASSES OF TRAFFIC Priya N. Werahera and Anura P. Jayasumana Department of Eletrial Engineering Colorado State University

More information

Fuzzy Meta Node Fuzzy Metagraph and its Cluster Analysis

Fuzzy Meta Node Fuzzy Metagraph and its Cluster Analysis Journal of Computer Siene 4 (): 9-97, 008 ISSN 549-3636 008 Siene Publiations Fuzzy Meta Node Fuzzy Metagraph and its Cluster Analysis Deepti Gaur, Aditya Shastri and Ranjit Biswas Department of Computer

More information

Cluster Centric Fuzzy Modeling

Cluster Centric Fuzzy Modeling 10.1109/TFUZZ.014.300134, IEEE Transations on Fuzzy Systems TFS-013-0379.R1 1 Cluster Centri Fuzzy Modeling Witold Pedryz, Fellow, IEEE, and Hesam Izakian, Student Member, IEEE Abstrat In this study, we

More information

M32: A Constructive Multilevel Logic Synthesis System*

M32: A Constructive Multilevel Logic Synthesis System* M32: A Construtive Multilevel Logi Synthesis System* Vitor N. Kravets Karem A. Sakallah Department of Eletrial Engineering and Computer Siene University of Mihigan, Ann Arbor, MI 48109 {vkravets, karem}@ees.umih.edu

More information

Using Augmented Measurements to Improve the Convergence of ICP

Using Augmented Measurements to Improve the Convergence of ICP Using Augmented Measurements to Improve the onvergene of IP Jaopo Serafin, Giorgio Grisetti Dept. of omputer, ontrol and Management Engineering, Sapienza University of Rome, Via Ariosto 25, I-0085, Rome,

More information

SINR-based Network Selection for Optimization in Heterogeneous Wireless Networks (HWNs)

SINR-based Network Selection for Optimization in Heterogeneous Wireless Networks (HWNs) 48 J. ICT Res. Appl., Vol. 9, No., 5, 48-6 SINR-based Network Seletion for Optimization in Heterogeneous Wireless Networks (HWNs) Abubakar M. Miyim, Mahamod Ismail & Rosdiadee Nordin Department of Eletrial,

More information

FUZZY WATERSHED FOR IMAGE SEGMENTATION

FUZZY WATERSHED FOR IMAGE SEGMENTATION FUZZY WATERSHED FOR IMAGE SEGMENTATION Ramón Moreno, Manuel Graña Computational Intelligene Group, Universidad del País Vaso, Spain http://www.ehu.es/winto; {ramon.moreno,manuel.grana}@ehu.es Abstrat The

More information

Volume 3, Issue 9, September 2013 International Journal of Advanced Research in Computer Science and Software Engineering

Volume 3, Issue 9, September 2013 International Journal of Advanced Research in Computer Science and Software Engineering Volume 3, Issue 9, September 2013 ISSN: 2277 128X International Journal of Advaned Researh in Computer Siene and Software Engineering Researh Paper Available online at: www.ijarsse.om A New-Fangled Algorithm

More information

Parallelization and Performance of 3D Ultrasound Imaging Beamforming Algorithms on Modern Clusters

Parallelization and Performance of 3D Ultrasound Imaging Beamforming Algorithms on Modern Clusters Parallelization and Performane of 3D Ultrasound Imaging Beamforming Algorithms on Modern Clusters F. Zhang, A. Bilas, A. Dhanantwari, K.N. Plataniotis, R. Abiprojo, and S. Stergiopoulos Dept. of Eletrial

More information

C 2 C 3 C 1 M S. f e. e f (3,0) (0,1) (2,0) (-1,1) (1,0) (-1,0) (1,-1) (0,-1) (-2,0) (-3,0) (0,-2)

C 2 C 3 C 1 M S. f e. e f (3,0) (0,1) (2,0) (-1,1) (1,0) (-1,0) (1,-1) (0,-1) (-2,0) (-3,0) (0,-2) SPECIAL ISSUE OF IEEE TRANSACTIONS ON ROBOTICS AND AUTOMATION: MULTI-ROBOT SSTEMS, 00 Distributed reonfiguration of hexagonal metamorphi robots Jennifer E. Walter, Jennifer L. Welh, and Nany M. Amato Abstrat

More information

Optimizing Sparse Matrix Operations on GPUs using Merge Path

Optimizing Sparse Matrix Operations on GPUs using Merge Path 21 IEEE 29th International Parallel and Distributed Proessing Symposium Optimizing Sparse Matrix Operations on GPUs using Merge Path Steven Dalton, Luke Olson Department of Computer Siene University of

More information