Partially-Parallel LDPC Decoder Achieving High-Efficiency Message-Passing Schedule


IEICE TRANS. FUNDAMENTALS, VOL.E89-A, NO.4 APRIL 2006

PAPER Special Section on Selected Papers from the 18th Workshop on Circuits and Systems in Karuizawa

Kazunori SHIMIZU a), Tatsuyuki ISHIKAWA, Nonmembers, Nozomu TOGAWA, Takeshi IKENAGA, Members, and Satoshi GOTO, Fellow

SUMMARY In this paper, we propose a partially-parallel LDPC decoder which achieves a high-efficiency message-passing schedule. The proposed LDPC decoder is characterized as follows: (i) The column operations follow the row operations in a pipelined architecture to ensure that the row and column operations are performed concurrently. (ii) The proposed parallel pipelined bit functional unit enables the column operation module to compute every message in each bit node which is updated by the row operations. These column operations can be performed without extending the single iterative decoding delay when the row and column operations are performed concurrently. Therefore, the proposed decoder performs the column operations more frequently in a single iterative decoding, and achieves a high-efficiency message-passing schedule within the limited decoding delay time. Hardware implementation on an FPGA and simulation results show that the proposed partially-parallel LDPC decoder improves the decoding throughput and bit error performance with a small hardware overhead.

key words: low-density parity-check codes, partially-parallel LDPC decoder, message-passing algorithm, FPGA

1. Introduction

Low-Density Parity-Check (LDPC) codes achieve information rates very close to the Shannon limit by using the message-passing algorithm [1]-[4]. In the last few years, several LDPC decoder designs have been reported in Refs. [7]-[13]. LDPC decoders are composed of a check functional unit (CFU) and a bit functional unit (BFU), where the CFU performs row operations for check nodes and the BFU performs column operations for bit nodes.
References [7], [8] have proposed a fully-parallel LDPC decoder. Considering the trade-offs between hardware cost and decoding throughput, the partially-parallel LDPC decoder is the most practical implementation, as indicated in Refs. [9]-[13]. The decoding throughput of the LDPC decoder is determined by the decoding delay time, which is the product of the single iterative decoding delay and the number of iterations. An increase in the decoding delay time degrades not only the decoding throughput but also the bit error performance, because the decoder has to correct as many bit errors as possible within the limited decoding delay time.

Manuscript received June 23. Manuscript revised September 30. Final manuscript received November 30. The authors are with the Graduate School of Information, Production and Systems, Waseda University, Kitakyushu-shi, Japan. The author is with the Dept. of Computer Science, Waseda University, Tokyo, Japan. a) kazu@suou.waseda.jp DOI: /ietfec/e89 a

The requirements for improving the decoding throughput and bit error performance are as follows:
(1) The single iterative decoding delay should be reduced by performing the row and column operations concurrently.
(2) The number of iterations until the decoding convergence is reached should be reduced by improving the message-passing efficiency.
Requirements (1) and (2) depend on the message-passing schedule for the row and column operations. On the other hand, the requirement from a hardware implementation point of view is as follows:
(3) The message-passing schedule should not complicate the hardware design. In particular, the message-passing schedule should not partition the memory into a large number of memory banks.

The decoder shown in Refs. [9]-[11] performs the row and column operations independently. This schedule enables the decoder to perform the row and column operations concurrently using a dual memory architecture.
Therefore, the single iterative decoding delay is reduced. However, the message-passing efficiency between the check and bit nodes is degraded since the row and column operations are performed independently (i.e., the decoder does not meet requirement (2)). On the other hand, in the partially-parallel LDPC decoders shown in Refs. [12], [13], the row operations follow the column operations. By approximating the column operation, the decoder in Ref. [12] reduces the single iterative decoding delay and the number of memory banks and words. However, the approximation degrades the message-passing efficiency (i.e., the decoder does not meet requirement (2)). In the decoder shown in Ref. [13], each column operation computes only a single message in association with each row operation. In order to compute a single message in the column operation and perform the row operation concurrently, the decoder partitions the memory into a large number of memory banks (i.e., the decoder does not meet requirement (3)).

In this paper, we propose an efficient architecture for the partially-parallel LDPC decoder which meets requirements (1), (2) and (3) simultaneously. The proposed decoder is based on the simple addressing and control logic shown in Refs. [9]-[11]. Firstly, the proposed schedule performs the row operations, which determine the positions for the column operations. The column operations are then performed at these positions. We propose a pipelined architecture to ensure that the row and column operations are performed concurrently. Secondly, we focus on the fact that the computational complexity of the column operation is less than that of the row operation. In the proposed schedule, the row and column operations are performed concurrently, as a result of which the column operations can be performed more frequently in a single iterative decoding. From this point of view, the proposed parallel pipelined bit functional unit enables the column operation module to compute every message in each bit node which is updated by the row operations. These column operations can be performed without extending the single iterative decoding delay. By using the proposed schedule, the row and column operations can be performed concurrently, and the message-passing efficiency is improved significantly. The proposed partially-parallel LDPC decoder was implemented on an FPGA, and the bit error performance of the decoder was simulated. Hardware implementation and simulation results show that the proposed decoder improves the decoding throughput and bit error performance with a small hardware overhead.

Copyright © 2006 The Institute of Electronics, Information and Communication Engineers

2. Partially-Parallel LDPC Decoder

Low-Density Parity-Check (LDPC) codes are a class of linear block codes with very sparse parity check matrices. The size of the parity check matrix $H$ is $M \times N$, where $M$ represents the total number of check bits and $N$ represents the total number of codeword bits. The codeword $y$ satisfies the parity check equation $H \cdot y = 0$. The parity check matrix is represented by a bipartite graph called a Tanner graph, shown in Fig. 1. There are two types of nodes in the graph, called bit and check nodes.
Each check node $c_m$, $m = 1, \ldots, M$, is connected to bit node $b_n$, $n = 1, \ldots, N$, wherever the element $H_{mn}$ of the matrix $H$ is one. LDPC codes can be decoded iteratively using a message-passing algorithm as described in Ref. [6]. Each iteration of the message-passing algorithm is composed of two phases. Phase 1, called the row operation, updates the messages ($\alpha_{mn}$) of all check nodes and sends them to the bit nodes. Phase 2, called the column operation, updates the messages ($\beta_{mn}$) of all bit nodes and sends them to the check nodes. The message-passing algorithm is defined in Fig. 2, where $A(m) \equiv \{n \mid H_{mn} = 1\}$, $B(n) \equiv \{m \mid H_{mn} = 1\}$, and the Gallager function is $f(x) \equiv \ln\frac{\exp(x)+1}{\exp(x)-1}$.

A partially-parallel LDPC decoder performs Phase 1 (row operations) and Phase 2 (column operations) partially in parallel. For the partially-parallel LDPC decoder, the parity check matrix has to be structured in order to reuse the parallel CFUs and BFUs. Figure 3 shows the block-structured parity check matrix for a $(w_c, w_r)$-regular LDPC code. The matrix is composed of $w_c \times w_r$ sub-blocks. The diagonal line in each sub-block in Fig. 3 represents the ones in the sub-block. Each sub-block is a $b \times b$ square matrix defined by cyclically shifting each row of the identity matrix $I_{b \times b}$ to the right. We determine the shift value by cyclotomic cosets as shown in Ref. [10]. If the partially-parallel LDPC decoder has $k$ CFUs for each of the $w_c$ row blocks and $k$ BFUs for each of the $w_r$ column blocks, the decoder can perform $k \times w_c$ row operations and $k \times w_r$ column operations in parallel.

In order to perform row and column operations concurrently, a dual memory architecture is used in the LDPC decoder. By using the dual memory architecture, messages $\beta$ obtained from the column operation can be stored in the memory of the row operation module while messages $\alpha$ obtained from the row operation are stored in the memory of the column operation module [9], [10].
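The block structure described above can be sketched in a few lines of NumPy. This is an illustrative sketch only: the toy sub-block size b = 7 and the values in `SHIFTS` are hypothetical placeholders, not the cyclotomic-coset shift values of Ref. [10], and the function names are ours.

```python
import numpy as np

def shifted_identity(b, shift):
    """b x b identity with every row cyclically shifted right by `shift`."""
    return np.roll(np.eye(b, dtype=np.uint8), shift, axis=1)

def block_structured_h(b, shifts):
    """Assemble H from a w_c x w_r grid of shift values (one per sub-block)."""
    return np.block([[shifted_identity(b, s) for s in row] for row in shifts])

# Hypothetical shift values for a toy (3,6)-regular code with b = 7.
SHIFTS = [[1, 2, 4, 3, 6, 5],
          [2, 4, 1, 6, 5, 3],
          [4, 1, 2, 5, 3, 6]]
H = block_structured_h(7, SHIFTS)
print(H.shape)                            # (w_c * b, w_r * b)
print(H.sum(axis=1).min(), H.sum(axis=0).max())
```

Because every sub-block is a permutation matrix, each row of H has exactly $w_r$ ones and each column exactly $w_c$ ones, which is what makes the code $(w_c, w_r)$-regular regardless of the shift values chosen.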
Initialization: Compute the log likelihood ratio (LLR) $\lambda_n$ for the bit nodes ($n = 1, 2, \ldots, N$), and set $\beta_{mn} = \lambda_n$ for each $(m, n)$ satisfying $H_{mn} = 1$.

Phase 1: For all the check nodes $c_m$ in the order corresponding to $m = 1, 2, \ldots, M$, compute the message $\alpha_{mn}$ with the following equation, where each pair $(m, n)$ satisfies $H_{mn} = 1$.

$$\alpha_{mn} = \Bigl(\prod_{n' \in A(m) \setminus n} \mathrm{sign}(\beta_{mn'})\Bigr) \cdot f\Bigl(\sum_{n' \in A(m) \setminus n} f(|\beta_{mn'}|)\Bigr) \tag{1}$$

Phase 2: For all the bit nodes $b_n$ in the order corresponding to $n = 1, 2, \ldots, N$, compute the message $\beta_{mn}$ with the following equation, where each pair $(m, n)$ satisfies $H_{mn} = 1$.

$$\beta_{mn} = \lambda_n + \sum_{m' \in B(n) \setminus m} \alpha_{m'n} \tag{2}$$

Tentative decision: Compute the tentative LDPC codeword $\hat{y}_n$ for $n = 1, 2, \ldots, N$ with the following equation.

$$\hat{y}_n = \begin{cases} 0, & \mathrm{sign}\bigl(\lambda_n + \sum_{m' \in B(n)} \alpha_{m'n}\bigr) = 1 \\ 1, & \mathrm{sign}\bigl(\lambda_n + \sum_{m' \in B(n)} \alpha_{m'n}\bigr) = -1 \end{cases} \tag{3}$$

Parity check: If the tentative LDPC codeword $\hat{y}_n$ satisfies the parity check equation shown in Eq. (4), or if the maximum number of iterations is reached, then stop the algorithm; otherwise go to Phase 1 and continue the iterations.

$$H \cdot (\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_N)^T = 0 \tag{4}$$

Fig. 1 Tanner graph of a parity check matrix.

Fig. 2 Message-passing algorithm.
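The algorithm of Fig. 2 can be written as a short floating-point reference model. The sketch below is an assumption-laden illustration rather than the decoder proposed in this paper: it runs all row operations and then all column operations in each iteration (the plain order of Fig. 2, not the proposed schedule), and the clipping bounds in `gallager_f` are our own choice to keep the function finite.

```python
import numpy as np

def gallager_f(x):
    """f(x) = ln((e^x + 1) / (e^x - 1)); input clipped (our choice) to stay finite."""
    x = np.clip(x, 1e-9, 30.0)
    return np.log((np.exp(x) + 1.0) / (np.exp(x) - 1.0))

def decode(H, llr, max_iter=8):
    """Message passing as in Fig. 2; returns (hard decision, iterations used)."""
    M, N = H.shape
    mask = H.astype(bool)
    beta = np.where(mask, llr[None, :], 0.0)        # initialization: beta_mn = lambda_n
    y_hat = (llr < 0).astype(np.uint8)
    for it in range(1, max_iter + 1):
        # Phase 1 (row operations), Eq. (1)
        alpha = np.zeros((M, N))
        for m in range(M):
            cols = np.flatnonzero(mask[m])
            sgn = np.where(beta[m, cols] < 0, -1.0, 1.0)
            fb = gallager_f(np.abs(beta[m, cols]))
            # Excluding edge n: drop its own sign (sgn * sgn = 1) and its own f term.
            alpha[m, cols] = (np.prod(sgn) * sgn) * gallager_f(fb.sum() - fb)
        # Phase 2 (column operations), Eq. (2)
        total = llr + alpha.sum(axis=0)
        beta = np.where(mask, total[None, :] - alpha, 0.0)
        # Tentative decision, Eq. (3), and parity check, Eq. (4)
        y_hat = (total < 0).astype(np.uint8)
        if not ((H @ y_hat) % 2).any():
            return y_hat, it
    return y_hat, max_iter
```

For example, with the two-check matrix `H = [[1, 1, 0], [0, 1, 1]]` and LLRs `[2.0, -0.5, 2.0]`, the weakly negative middle bit is pulled positive by its two neighbours and the model stops after the first iteration with the all-zero codeword.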

Fig. 3 Block-structured parity check matrix for a partially-parallel LDPC decoder.

3. Partially-Parallel LDPC Decoder Achieving High-Efficiency Message-Passing Schedule

In this section, we propose a novel high-efficiency message-passing schedule and its hardware architecture for the partially-parallel LDPC decoder.

3.1 High-Efficiency Message-Passing Schedule

In order to improve the decoding throughput and bit error performance, the message-passing schedule and its hardware architecture should meet requirements (1), (2) and (3) described in Section 1. In order to meet requirement (3), we base the proposed message-passing schedule and hardware architecture on the simple addressing and control logic shown in Refs. [9]-[11]. Figure 4 shows an example of the message-passing schedule shown in Refs. [9]-[11]. The message-passing schedule performs the $i_1$-, $i_2$-, $i_3$-th row operations, running from 1 to $b$, and the $j_1$-, $\ldots$, $j_6$-th column operations, running from 1 to $b$, in parallel. The row and column operations are performed independently. In the $q$-th column block, three messages $\alpha(i_{1(q)})$, $\alpha(i_{2(q)})$, $\alpha(i_{3(q)})$ are updated by the $i_1$-, $i_2$-, $i_3$-th row operations, and three messages $\beta(j_{q(1)})$, $\beta(j_{q(2)})$, $\beta(j_{q(3)})$ are updated by the $j_q$-th column operation. The row and column operations update the messages $\alpha$ and $\beta$ at the positions where the elements of the $i_1$-, $i_2$-, $i_3$-th rows and the $j_1$-, $\ldots$, $j_6$-th columns are one. Therefore, the position of the updated message $\alpha$ is different from that of the updated message $\beta$. The updated messages $\alpha$ are not used by the column operation until the column number $j_q$ reaches the column position of the updated message $\alpha$. In addition, the timing at which the column operation is performed with the updated messages $\alpha$ differs among sub-blocks.
This is because the shift value of the identity matrix $I_{b \times b}$ for each sub-block is different from that for the other sub-blocks (see Fig. 3). From this point of view, we propose a high-efficiency message-passing schedule that improves the timing at which the column operations are performed with the latest messages $\alpha$ updated by the row operations.

Fig. 4 Message-passing schedule shown in Refs. [9]-[11] in q-th column block.

In addition, we focus on the fact that the computational complexity of the row and column operations is proportional to the number of inputs. For $(w_c, w_r)$-regular LDPC codes, the number of inputs to a row operation is the degree $w_r$ of each row of the parity check matrix, and the number of inputs to a column operation is the degree $w_c$ of each column. The code rate $R$ of $(w_c, w_r)$-regular LDPC codes satisfies $0 < 1 - w_c/w_r \le R < 1$; accordingly, the degrees satisfy $w_c < w_r$. This indicates that the computational complexity of the column operation is less than that of the row operation. Therefore, the column operations can be performed more frequently than the row operations within the single iterative decoding delay. In order to meet requirement (2), we propose a high-efficiency message-passing schedule which is characterized as follows:
(i) First, the proposed schedule performs the row operations, which determine the positions for the column operations. The column operations are then performed at these positions.

(ii) The proposed schedule performs every column operation using every message $\alpha$ updated by the row operations.

The proposed schedule is shown in Figs. 5 and 6. In the proposed schedule, the column operations are always performed using the updated messages $\alpha$ immediately after the row operations are performed. In addition, three column operations are performed using the three messages $\alpha$ in each $q$-th column block. As shown in Figs. 4 and 5, the number of column operations in the proposed schedule is three times that in the schedule shown in Refs. [9]-[11].

Fig. 5 High-efficiency message-passing schedule in q-th column block.

Input: $\beta(i_{1(q)})$, $\beta(i_{2(q)})$, $\beta(i_{3(q)})$.
Step 1: The $i_1$-, $i_2$-, $i_3$-th row operations in each row block are performed in parallel. The $i_1$-th row operation updates the message $\alpha(i_{1(q)})$, the $i_2$-th row operation updates the message $\alpha(i_{2(q)})$, and the $i_3$-th row operation updates the message $\alpha(i_{3(q)})$.
Step 2: In each $q$-th column block, the $j_{q1}$-th column operation is performed by using the updated message $\alpha(i_{1(q)})$, the $j_{q2}$-th column operation by using the updated message $\alpha(i_{2(q)})$, and the $j_{q3}$-th column operation by using the updated message $\alpha(i_{3(q)})$.
Step 3: The $j_{q1}$-th column operation updates the messages $\beta(j_{q1(1)})$, $\beta(j_{q1(2)})$, and $\beta(j_{q1(3)})$. The $j_{q2}$-th column operation updates the messages $\beta(j_{q2(1)})$, $\beta(j_{q2(2)})$, and $\beta(j_{q2(3)})$. The $j_{q3}$-th column operation updates the messages $\beta(j_{q3(1)})$, $\beta(j_{q3(2)})$, and $\beta(j_{q3(3)})$.
Output: $\beta(j_{q1(1)})$, $\beta(j_{q1(2)})$, $\beta(j_{q1(3)})$, $\beta(j_{q2(1)})$, $\beta(j_{q2(2)})$, $\beta(j_{q2(3)})$, $\beta(j_{q3(1)})$, $\beta(j_{q3(2)})$, $\beta(j_{q3(3)})$.

Fig. 6 High-efficiency message-passing schedule in q-th column block.

Fig. 7 The average number of iterations ($l_{max} = 4$).

Fig. 8 Bit error performance ($l_{max} = 4$).
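The ordering of Steps 1 to 3 can be illustrated for a single column block with a small enumeration. This is a sketch under stated assumptions: the shift values in `shifts_q` and the toy size b = 7 are hypothetical, rows are indexed from 0, and only the order of operations is modelled, not the message arithmetic.

```python
from collections import Counter

def proposed_schedule(b, shifts_q):
    """Yield (row index i, column addresses j_q1..j_q3) for one column block q.

    shifts_q holds the (hypothetical) shift values of the w_c sub-blocks
    stacked in column block q; row i of a sub-block with shift s has its
    single one in column (i + s) mod b.
    """
    for i in range(b):
        # Step 1: the row operations at index i update one alpha per sub-block.
        # Steps 2-3: a column operation runs at each column just touched.
        yield i, [(i + s) % b for s in shifts_q]

b, shifts_q = 7, [1, 2, 4]
visits = Counter(j for _, cols in proposed_schedule(b, shifts_q) for j in cols)
print(sum(visits.values()), "column operations per iteration, versus", b, "in the baseline")
```

In this model every column of the block is visited exactly $w_c = 3$ times per iteration, once for each freshly updated $\alpha$, which matches the threefold increase in column operations noted in the text.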
Accordingly, the proposed schedule also advances the timing at which the row operations are performed with the latest messages $\beta$ updated by the column operations. By using the high-efficiency message-passing schedule, the number of iterations for decoding can be reduced. This allows the decoder not only to increase the decoding throughput but also to improve the bit error performance within a limited decoding delay time.

We evaluate the message-passing efficiency of the schedules in Refs. [9]-[13] and of the proposed schedule at the algorithm level. We employ (3,6)-regular LDPC codes whose codeword length is $b \times w_r = 556 \times 6 = 3336$ [bits], where each sub-block of the parity check matrix has size $b = 556$. The maximum number of iterations is set to $l_{max} = 4$ and $8$, respectively. In the simulations, we assume an AWGN (Additive White Gaussian Noise) channel. Figures 7 and 9 show the average number of iterations until the decoding convergence is reached. Figures 8 and 10 show the bit error performance based on each schedule. The results show that the proposed schedule reduces the average number of iterations by up to about 35% and obtains up to about 1.5 [dB] better coding gain compared to the schedule in Refs. [9]-[11]. The average number of iterations based on the schedule in Refs. [12], [13] is also better than that based on the schedule in Refs. [9]-[11]. This is because the schedule in Refs. [12], [13] performs the row operation following the column operation, which improves the timing at which the row operations are performed with the latest message $\beta$ updated by the column operations. However, the schedule in Ref. [12] approximates the column operations, and the schedule in Ref. [13] computes only a single message in association with each row operation. Compared with the schedule in Ref. [12], the proposed schedule reduces the average number of iterations by up to about 14% and obtains up to about 0.73 [dB] better coding gain. Compared with the schedule in Ref. [13], the proposed schedule reduces the average number of iterations by up to about 16% and obtains up to about 0.27 [dB] better coding gain.

The results show that the schedule in Refs. [9]-[11] degrades the bit error performance significantly when the maximum number of iterations ($l_{max}$) is small. On the other hand, even if the maximum number of iterations ($l_{max}$) is increased, the schedule in Ref. [12] shows only a slight improvement in the bit error performance because of the approximation of the column operations. The results show that the proposed schedule achieves the best performance among the schedules in Refs. [9]-[13]. This is because the proposed schedule performs the column operations more frequently than the other schedules, in addition to improving the timing of the column operations. In the next section, we propose an efficient hardware architecture for the proposed high-efficiency message-passing schedule to ensure that the row and column operations are performed concurrently.

Fig. 9 The average number of iterations ($l_{max} = 8$).

Fig. 10 Bit error performance ($l_{max} = 8$).

Fig. 11 Proposed row operation module.

3.2 Hardware Architecture

The main modules of the partially-parallel LDPC decoder are the row operation module and the column operation module.
In order to meet requirement (1), we propose a hardware architecture of the partially-parallel LDPC decoder based on the high-efficiency message-passing schedule, which is characterized as follows:
(i) The column operations follow the row operations in a pipelined architecture to ensure that the row and column operations are performed concurrently.
(ii) Our parallel pipelined bit functional unit enables the decoder to complete three column operations within a single row operation delay. Therefore, these column operations can be performed without extending the single iterative decoding delay.
In the following section, we design the partially-parallel LDPC decoder for (3,6)-regular LDPC codes.

3.2.1 Row Operation Module

The row operation module is shown in Fig. 11. Each row operation module has six memory banks ($\beta(i_{p(1)}), \ldots, \beta(i_{p(6)})$) for the messages $\beta$ which are updated by six column operations in parallel. When the row operation module has $k$ CFUs, each memory bank has $k$ sets of messages $\beta$ in a single word, and each memory bank is composed of $b/k$ words. The address translator (R2C) translates the row address to the column address corresponding to the position of the one in each sub-block. For the $i_p$-th row operation, the row operation module inputs the row address $i_p$ to the memory banks for the messages $\beta$ and to the address translator. The CFU computes Eq. (1) using the messages $\beta(i_{p(1)}), \ldots, \beta(i_{p(6)})$, where the function $f(x)$ is the Gallager function. Since the hardware for the Gallager function requires a large number of gates, an approximated minimum function can be applied to the parallel LDPC decoder [6]. The row operation module outputs six sets of the messages $\alpha(i_{p(1)}), \ldots, \alpha(i_{p(6)})$ and the corresponding column addresses $(j_{1p}, \ldots, j_{6p})$ to each of the six column operation modules.

Fig. 12 Proposed column operation module.

Fig. 13 Serial bit functional unit.

Fig. 14 Proposed parallel pipelined bit functional unit.

3.2.2 Column Operation Module

The proposed column operation module is shown in Fig. 12. Each column operation module has three memory banks for the messages $\alpha$ which are updated by the three row operation modules. The row operation module has the memory banks for the messages $\beta$ and the column operation module has the memory banks for the messages $\alpha$ (i.e., a dual memory architecture). Therefore, the row and column operations can be performed concurrently. Each column operation module receives three sets of the messages $\alpha$ and the corresponding column addresses from the three row operation modules. The addressing unit in the proposed column operation module stores the three column addresses. In the proposed column operation module, the BFU performs the three column operations for the three column addresses $j_{qx}$ ($x = 1, \ldots, 3$) with Eq. (2) sequentially. The address translator (C2R) translates the column address to the row address corresponding to the position of the one in each sub-block.
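For a sub-block built from a cyclically right-shifted identity matrix, the two address translators reduce to modular arithmetic. The sketch below is a minimal illustration with 0-based addresses; the function names are ours, and `word_and_lane` assumes that `k` consecutive messages are packed into one memory word, a packing the text does not spell out.

```python
def r2c(i, shift, b):
    """Row address -> column address of the one in row i of a shifted identity."""
    return (i + shift) % b

def c2r(j, shift, b):
    """Column address -> row address of the one in column j of the same sub-block."""
    return (j - shift) % b

def word_and_lane(addr, k):
    """Assumed packing: k consecutive messages share one memory word."""
    return divmod(addr, k)

b, k, shift = 556, 4, 3          # b and k as in Section 4; the shift is hypothetical
j = r2c(555, shift, b)
print(j, c2r(j, shift, b), word_and_lane(555, k))
```

With $b = 556$ and $k = 4$, each bank holds $b/k = 139$ words, so the last message address 555 falls in word 138 under this assumed packing.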
For a single $j_{qx}$-th column operation, the column operation module inputs the column address $j_{qx}$ to the memory banks for the messages $\alpha$ and to the address translator. The single column operation computes three messages $\beta(j_{qx(1)})$, $\beta(j_{qx(2)})$, $\beta(j_{qx(3)})$ by using the updated message $\alpha(i_{x(q)})$. In order to achieve an operating frequency as high as that of the CFU, the serial BFU architecture shown in Fig. 13 can be applied to the column operation module [10]. The serial BFU computes the additions shown in Eq. (2) serially. However, with the serial BFU, the three column operations of the proposed schedule take more clock cycles than a single row operation. Therefore, the three column operations of the proposed high-efficiency message-passing schedule become a bottleneck in a single iterative decoding when a single row operation and three column operations are performed concurrently. From this point of view, we propose a parallel pipelined BFU as shown in Fig. 14. The proposed parallel pipelined BFU computes the three sets of additions in parallel.

Figure 15 shows the timing diagram of the row and column operations. The upper half of Fig. 15 shows that the row operation module computes the absolute values $|\beta|$ and determines the first and second minimum values of $|\beta|$ in parallel using the pipeline architecture. In order to achieve a high operating frequency of the decoder, the CFU in the row operation module compares $|\beta|$ once in each cycle. It takes 7 clock cycles to determine the first and second minimum values. The row operation module takes 10 clock cycles in total to perform a single row operation, as shown in Fig. 15. The lower half of Fig. 15 shows that each column operation module performs three column operations after the row operations. The BFU with the pipeline architecture takes 6 clock cycles in total to perform three column operations. The number of clock cycles for three column operations is less than that for a single row operation.
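The role of the first and second minimum values can be seen in a software model of a minimum-based row operation. This sketch uses the common min-sum approximation of Eq. (1), which we assume is close to, but not necessarily identical with, the approximated minimum function used in the CFU; the function name is ours and no cycle timing is modelled.

```python
def min_sum_row(betas):
    """Approximate Eq. (1): alpha for each edge from the two smallest |beta|."""
    min1 = min2 = float("inf")
    idx1 = -1
    sign_prod = 1
    for n, beta in enumerate(betas):        # one comparison pass over the inputs
        mag = abs(beta)
        if beta < 0:
            sign_prod = -sign_prod
        if mag < min1:
            min1, min2, idx1 = mag, min1, n
        elif mag < min2:
            min2 = mag
    alphas = []
    for n, beta in enumerate(betas):
        mag = min2 if n == idx1 else min1   # exclude the edge's own input
        sign = sign_prod if beta >= 0 else -sign_prod
        alphas.append(sign * mag)
    return alphas

print(min_sum_row([1.5, -0.5, 2.0, -3.0, 0.75, 4.0]))
```

Only the edge that supplied the smallest magnitude needs the second minimum; every other edge receives the first minimum, so two stored values and one index are enough to produce all six outputs of a degree-6 row.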
In addition, the critical path delay of the proposed column operation module is expected to be smaller than that of the row operation module. Therefore, our partially-parallel LDPC decoder does not degrade the decoding throughput when a single row operation and three column operations are performed concurrently. In the proposed column operation module, the unit with the largest overhead is the addressing unit (shown in the gray area of Fig. 12), which is needed to perform three column operations after the row operations. In addition, the hardware size of the proposed parallel pipelined BFU is expected to be larger than that of the serial BFU.

Fig. 15 Timing diagram of the row and column operation.

Table 1 Comparison of the number of memory banks and words based on each schedule. Each entry is (banks, words); the total column is for $w_r = 6$, $w_c = 3$, $b = 556$.

Schedule | message ($\alpha_{mn}$) | message ($\beta_{mn}$) | input ($\lambda_n$) | output ($\hat{y}_n$) | total
Ref. [13] | ($w_r w_c$, $w_r w_c b$) | ($2 w_r w_c$, $2 w_r b$) | ($w_r w_c$, $w_r b$) | ($w_r w_c$, $w_r b$) |
Refs. [9]-[11], Proposed Schedule | ($w_r w_c$, $w_r w_c b$) | ($w_r w_c$, $w_r w_c b$) | ($w_r$, $w_r b$) | ($w_r$, $w_r b$) |
Ref. [12]: ($w_r$, $w_r b$), ($w_r$, $w_r b$), ($w_r$, $w_r b$) [three entries only; their columns are not identifiable]

4. Implementation Results

Firstly, we evaluate the number of memory banks and words for the LDPC decoder. Memories are required for the messages $\alpha_{mn}$, $\beta_{mn}$, the input values $\lambda_n$, and the tentative output values $\hat{y}_n$. The required numbers of memory banks and words based on the schedules shown in Refs. [9]-[13] and on the proposed schedule are shown in Table 1. In the table, the numbers of memory banks and words are calculated based on the parity check matrix shown in Fig. 3. The total numbers of memory banks and words are obtained from the parity check matrix with $w_r = 6$, $w_c = 3$, $b = 556$.

The schedule in Ref. [12] is designed based on a single memory architecture, and the row operations follow the column operations. Therefore, the decoder based on this schedule does not perform the row and column operations concurrently. By approximating the column operation, the single iterative decoding delay can be reduced. In the approximation of the column operation, the $w_c$ messages $\beta$ in each bit node take the same value. Therefore, the required number of memory banks and words can be reduced significantly. However, as shown in Figs. 8 and 10, this approximation degrades the bit error performance significantly.

In the schedule shown in Ref. [13], each column operation computes only a single message in association with each row operation. Therefore, the required number of memory words for the messages $\beta$ based on this schedule is less than that based on the proposed schedule. Correspondingly, the number of messages passed from the bit nodes under this schedule is less than that under the proposed schedule. The schedule in Ref. [13] partitions the memory into a large number of memory banks in order to compute the single message in the column operation and perform the row operation concurrently. The number of memory banks based on this schedule is about twice that based on the proposed schedule. A large number of memory banks increases the hardware overhead caused by the duplication of addressing and control logic and by the wires required to exchange the messages. This makes the layout of the VLSI circuit difficult [14].

The proposed schedule is designed based on the schedule in Refs. [9]-[11]. The schedule enables the decoder to perform the row and column operations concurrently using the dual memory architecture. The number of memory banks and words using the dual memory architecture based on the proposed schedule is the same as that based on the schedule in Refs. [9]-[11].

We evaluate the hardware overhead of the proposed schedule in terms of the logic parts of the decoder, compared to the decoder based on the schedule in Refs. [9]-[11]. We design a partially-parallel LDPC decoder for (3,6)-regular LDPC codes according to the parity check matrix shown in Fig. 3. We design the decoder with $k = 4$, and each sub-block in the parity check matrix has size $b = 556$. Therefore, this decoder performs $4 \times 3 = 12$ row operations and $4 \times 6 = 24$ column operations in parallel, and decodes LDPC codewords whose total number of codeword bits is $556 \times 6 = 3336$ bits.
All intermediate messages are quantized, since the messages are obtained from fixed-point computations in the row and column operations. We set the quantization to 8 bits, divided into one sign bit, four integer bits and three fractional bits. The proposed partially-parallel LDPC decoder is composed of an input module, an output and parity check module, row operation modules, column operation modules, and a controller. Computation of the tentative decision in Eq. (3) is almost the same as the column operation. Therefore, the column operation module also computes the tentative output values.

To compare the hardware implementations based on the schedule in Refs. [9]-[11] and on the proposed schedule, we apply an approximated minimum function to each row operation module. The serial BFU shown in Fig. 13 is applied to the column operation module based on the schedule in Refs. [9]-[11], and the parallel pipelined BFU shown in Fig. 14 is applied to the column operation module based on the proposed schedule. We implement the design on the Xilinx Virtex-II xc2v2000-5bf957 FPGA. Synthesis and implementation are carried out using ISE Ver.6.2 provided by Xilinx. The synthesis and implementation results are shown in Table 2. The value in parentheses in the table represents the utilization of the FPGA resource.

Table 2 Comparisons of FPGA synthesis and implementation results based on the schedule in Refs. [9]-[11] and the proposed schedule (k = 4) (Virtex-II xc2v2000-5bf957).

Module Name | Schedule | Slice F/F | 4-input LUT | Total Slices | Block RAM | Delay [ns]
Input Module for $\lambda$ | | 0 (0%) | 0 (0%) | 0 (0%) | 6 (10%) |
Output & Parity Check Module | | 233 (1%) | 688 (3%) | 651 (6%) | 6 (10%) |
Row Operation Module | | 1676 (7%) | 3030 (14%) | 2063 (19%) | 18 (32%) |
Column Operation Module | Refs. [9]-[11] | 3134 (14%) | 3888 (18%) | 3046 (28%) | 18 (32%) |
Column Operation Module | Proposed Schedule | 3933 (18%) | 5132 (23%) | 3754 (34%) | 18 (32%) |
Controller | Refs. [9]-[11] | 249 (1%) | 155 (0%) | 178 (1%) | 0 (0%) |
Controller | Proposed Schedule | 253 (1%) | 160 (0%) | 181 (1%) | 0 (0%) |
LDPC Decoder (Place & Route) | Refs. [9]-[11] | 4812 (22%) | 7487 (34%) | 7408 (68%) | 48 (85%) |
LDPC Decoder (Place & Route) | Proposed Schedule | 5552 (25%) | 8680 (40%) | 7808 (72%) | 48 (85%) |
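The 8-bit message format used in this section (one sign bit, four integer bits, three fractional bits) can be modelled in software as follows. This is a sketch under assumptions: the text does not state the rounding mode or whether the hardware uses sign-magnitude or two's complement, so the symmetric saturation at ±15.875 and round-to-nearest behaviour below are our choices.

```python
def quantize(x, int_bits=4, frac_bits=3):
    """Round to a multiple of 2**-frac_bits and saturate symmetrically (assumed)."""
    scale = 1 << frac_bits
    max_code = (1 << (int_bits + frac_bits)) - 1    # 127 for the 4.3 format
    code = max(-max_code, min(max_code, round(x * scale)))
    return code / scale

print(quantize(0.3), quantize(100.0), quantize(-100.0))
```

The resolution is 1/8 and magnitudes above 15.875 saturate, which bounds how confident any single message can become in this model.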
A single Slice of the FPGA has two 4-input LUTs and two Slice Flip Flops. The implementation results (i.e., Place & Route results) show that the number of occupied Slices for the partially-parallel LDPC decoder based on the proposed schedule increases by about 5% compared to that based on the schedule in Refs. [9]-[11]. Within the Slices, the number of occupied Slice Flip Flops, which indicates the number of registers, increases by about 15%, and the number of occupied 4-input LUTs increases by about 16%. The implementation results show that the proposed schedule can be implemented with a small hardware overhead. The critical path delay of the decoder based on the proposed schedule is almost the same as that based on the schedule in Refs. [9]-[11]. The implementation result shows that the operating frequency of the partially-parallel LDPC decoder based on the proposed schedule exceeds 100 [MHz].

We evaluate the LDPC decoding performance of the partially-parallel LDPC decoder based on the schedules in Refs. [9]-[13] and on the proposed schedule. The decoding throughput is determined by the decoding delay time, which is the product of the single iterative decoding delay and the number of iterations. A single iterative decoding for the parity check matrix shown in [...] When each of the $w_c$ row operation modules has $k$ CFUs and each of the $w_r$ column operation modules has $k$ BFUs, the row operations delay in a single iterative decoding is $\frac{b \times w_c}{k \times w_c} \times d_r$, where $d_r$ denotes a single row operation delay, and the column operations delay in a single iterative decoding is $\frac{b \times w_r}{k \times w_r} \times d_c$, where $d_c$ denotes a single column operation delay. When the row and column operations are performed concurrently, the row operations delay is larger than the column operations delay (i.e., $\frac{b}{k} \times d_r > \frac{b}{k} \times d_c$). In this case, the single iterative decoding delay is $\frac{b}{k} \times d_r$. The schedules in Refs. [9]-[11], [13] and the proposed schedule perform the row and column operations concurrently.
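As a worked instance of the delay expressions above, substituting the design values used in this section ($b = 556$, $k = 4$) gives the number of sequential operation slots per iteration. The cancellation of $w_c$ and $w_r$ is simply the algebra of the two expressions; no values beyond those stated in the text are assumed.

```latex
\begin{align*}
\text{row operations delay} &= \frac{b \times w_c}{k \times w_c} \times d_r
  = \frac{b}{k}\, d_r = \frac{556}{4}\, d_r = 139\, d_r, \\
\text{column operations delay} &= \frac{b \times w_r}{k \times w_r} \times d_c
  = \frac{b}{k}\, d_c = 139\, d_c, \\
\text{single iterative decoding delay} &= \max\left(139\, d_r,\ 139\, d_c\right)
  = 139\, d_r \quad (d_r > d_c).
\end{align*}
```

The per-iteration delay therefore depends only on $b/k$ and the row operation delay, which is why extra column operations that fit inside one row operation slot do not lengthen an iteration.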
The decoder based on the schedule in Ref. [12] does not perform the row and column operations concurrently, since the schedule is designed for a single memory architecture. However, the schedule in Ref. [12] approximates the column operation and reduces the column operation delay significantly. Accordingly, the single iterative decoding delay based on this schedule is almost the same as when the row and column operations are performed concurrently. Therefore, the decoding throughput based on each schedule is calculated from the following equation, where $b\,w_r$ represents the codeword length and $l$ denotes the number of iterations.

$$\mathrm{Throughput} = \frac{b\,w_r\ [\mathrm{bit}]}{l \cdot (b/k) \cdot d_r\ [\mathrm{sec}]} \qquad (5)$$

When the row operation module shown in Fig. 11 is applied to each schedule, a single row operation takes 10 clock cycles. However, the row operation module with the pipelined architecture performs the row operations consecutively, and the following row operation starts at the 8th clock cycle of the previous row operation. Therefore, a single row operation delay is $d_r = 7 \times 10$ [ns] when the operating frequency is 100 [MHz].

Tables 3 and 4 show the comparison of the number of iterations, decoding throughput, and bit error performance, where $l_{\max} = 4$, SNR=5.0, and $l_{\max} = 8$, SNR=4.5 respectively. The decoding throughput can be obtained from Eq. (5) and the average number of iterations. The average number of iterations for decoding is simulated at the algorithm level as shown in Figs. 7 and 9. These results show that the proposed schedule achieves the best decoding throughput. In Table 4, the proposed decoder achieves about 52% higher decoding throughput compared to that based on the schedule in Refs. [9]–[11]. Compared with the schedules in Ref. [12] and Ref. [13], the proposed schedule achieves about 13% and 14% higher decoding throughput respectively. In addition, the bit error performance based on the proposed schedule reaches the value of zero in each result. This is because the proposed schedule accelerates the decoding convergence within a limited decoding delay time.

Table 3  Comparison of the decoding performance based on each schedule. ($l_{\max} = 4$, SNR=5.0)

Schedule            Iterations  Throughput [Mbps]  BER
Refs. [9]–[11]
Ref. [12]
Ref. [13]
Proposed Schedule

Table 4  Comparison of the decoding performance based on each schedule. ($l_{\max} = 8$, SNR=4.5)

Schedule            Iterations  Throughput [Mbps]  BER
Refs. [9]–[11]
Ref. [12]
Ref. [13]
Proposed Schedule

From a hardware implementation point of view, the proposed schedule can be implemented with 5% FPGA resource overhead and the same number of memory banks as that based on the schedule in Refs. [9]–[11], where the number of memory banks is about half of that in the schedule in Ref. [13]. The number of memory banks and words based on the proposed schedule is larger than that based on the schedule in Ref. [12]. However, the proposed schedule reduces the single iterative decoding delay without approximating the column operation. Therefore, the proposed schedule achieves much better bit error performance compared to the schedule in Ref. [12].

5. Conclusion

In this paper, we propose a partially-parallel LDPC decoder which achieves a high-efficiency message-passing schedule. In the proposed schedule, accelerating the decoding convergence without extending the single iterative decoding delay enables the decoder not only to increase the decoding throughput but also to improve the bit error performance within a limited decoding delay time.
Hardware implementation and simulation results show that the proposed decoder achieves up to about 54% higher decoding throughput, and obtains up to about 1.5 [dB] better coding gain with 5% FPGA resource overhead compared to that based on the schedule shown in Refs. [9]–[11]. Compared with the schedule shown in Ref. [12], the proposed schedule achieves up to about 16% higher decoding throughput, and obtains up to about 0.73 [dB] better coding gain. In addition, the proposed schedule does not degrade the decoding performance even if the number of iterations for decoding is increased. Compared with the schedule shown in Ref. [13], the proposed schedule achieves up to about 19% higher decoding throughput, and obtains up to about 0.27 [dB] better coding gain with about half the number of memory banks. These results show that the proposed schedule achieves a high-efficiency message-passing schedule and is efficient for practical hardware implementations.

Acknowledgements

This work was supported by a fund from MEXT via the Kitakyushu innovative cluster project.

References

[1] D.J.C. MacKay and R.M. Neal, "Near Shannon limit performance of low density parity check codes," IEEE Electron. Lett., vol.32, no.18, Aug.
[2] D.J.C. MacKay, "Good error-correcting codes based on very sparse matrices," IEEE Trans. Inf. Theory, vol.47, no.2.
[3] S.Y. Chung, G.D. Forney, Jr., T.J. Richardson, and R.L. Urbanke, "On the design of low-density parity-check codes within dB of the Shannon limit," IEEE Commun. Lett., vol.5, no.2, pp.58–60, Feb.
[4] T.J. Richardson and R.L. Urbanke, "The capacity of low-density parity-check codes under message-passing decoding," IEEE Trans. Inf. Theory, vol.42, no.2.
[5] M. Fossorier, M. Mihaljevic, and H. Imai, "Reduced complexity iterative decoding of low density parity check codes based on belief propagation," IEEE Trans. Commun., vol.47, no.5, May.
[6] R.G. Gallager, Low-Density Parity-Check Codes, MIT Press, Cambridge, MA.
[7] A. Blanksby and C. Howland, "A 690-mW 1-Gbps 1024-b, rate-1/2 low-density parity-check code decoder," IEEE J. Solid State Circuits, vol.37, no.3, March.
[8] E. Liao, E. Yeo, and B. Nikolic, "Low-density parity-check code constructions for hardware implementation," Proc. IEEE International Conference on Communications, ICC'04, Paris, June.
[9] Y. Chen and D. Hocevar, "A FPGA and ASIC implementation of rate 1/ b irregular low density parity check decoder," IEEE Global Telecommunications Conference, GLOBECOM'03.
[10] M. Mansour and N. Shanbhag, "Low power VLSI decoder architectures for LDPC codes," Proc. International Symposium on Low Power Electronics and Design.
[11] M. Karkooti and J.R. Cavallaro, "Semi-parallel reconfigurable architectures for real-time LDPC decoding," IEEE Proc. International Conference on Information Technology: Coding and Computing, ITCC'04, vol.1, April.
[12] E. Yeo, P. Pakzad, B. Nikolic, and V. Anantharam, "High throughput low-density parity-check decoder architectures," GLOBECOM 2001 IEEE Global Telecommunications Conference, no.1, Nov.
[13] E. Boutillon, J. Castura, and F.R. Kschischang, "Decoder-first code design," 2nd International Symposium on Turbo Codes and Related Topics, Sept.
[14] L. Benini, L. Macchiarulo, A. Macii, and M. Poncino, "Layout-driven memory synthesis for embedded systems-on-chip," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol.10, no.2, April.
[15] Xilinx, Inc., "Virtex-II Platform FPGAs: Complete Data Sheet," DS031 (v3.3), pp.21–22, June 2004.

Kazunori Shimizu received the B.Eng. and M.Eng. degrees from Waseda University in 2002 and 2004 respectively, all in electronics, information and communication engineering. He is currently working towards the Dr. Eng. degree. His research interests are design and verification of VLSIs, especially reconfigurable hardware systems.

Tatsuyuki Ishikawa received the B.Eng. in Electronic Engineering from Gunma University. He joined Toshiba Microelectronics in 1999, where he has been undertaking design and implementation of ASICs. He is currently working towards the M.Eng. degree at Waseda University. His research interests are design and implementation of VLSIs.

Nozomu Togawa received the B.Eng., M.Eng., and Dr.Eng. degrees from Waseda University in 1992, 1994, and 1997, respectively, all in electrical engineering. He is presently an Associate Professor in the Department of Computer Science, Waseda University. His research interests are VLSI design, graph theory, and computational geometry. He is a member of IEEE and the Information Processing Society of Japan.

Takeshi Ikenaga received the B.E. and M.E. degrees in electrical engineering and the Ph.D degree in information & computer science from Waseda University, Tokyo, Japan, in 1988, 1990, and 2002, respectively. He joined LSI Laboratories, Nippon Telegraph and Telephone Corporation (NTT) in 1990, where he has been undertaking research on the design and test methodologies for high-performance ASICs, a real-time MPEG2 encoder chip set, and a highly parallel LSI & system design for image-understanding processing. He is presently an associate professor in the system LSI field of the Graduate School of Information, Production and Systems, Waseda University. His current interests are application SoCs for image, security and network processing. Dr. Ikenaga is a member of the IPSJ and the IEEE. He received the IEICE Research Encouragement Award.

Satoshi Goto received the B.Eng. and the M.Eng. degrees in Electronics and Communication Engineering from Waseda University in 1968 and 1970, respectively. He also received the Dr. degree of Engineering from the same university. He joined Central Research Laboratory of NEC in 1970, and is now a professor at Waseda University. He is an IEEE Fellow and a member of Academy Engineering Society of Japan. His research interests include LSI systems and multimedia systems.


A Software LDPC Decoder Implemented on a Many-Core Array of Programmable Processors A Software LDPC Decoder Implemented on a Many-Core Array of Programmable Processors Brent Bohnenstiehl and Bevan Baas Department of Electrical and Computer Engineering University of California, Davis {bvbohnen,

More information

TURBO codes, [1], [2], have attracted much interest due

TURBO codes, [1], [2], have attracted much interest due 800 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 2, FEBRUARY 2001 Zigzag Codes and Concatenated Zigzag Codes Li Ping, Member, IEEE, Xiaoling Huang, and Nam Phamdo, Senior Member, IEEE Abstract

More information

LOW-POWER IMPLEMENTATION OF A HIGH-THROUGHPUT LDPC DECODER FOR IEEE N STANDARD. Naresh R. Shanbhag

LOW-POWER IMPLEMENTATION OF A HIGH-THROUGHPUT LDPC DECODER FOR IEEE N STANDARD. Naresh R. Shanbhag LOW-POWER IMPLEMENTATION OF A HIGH-THROUGHPUT LDPC DECODER FOR IEEE 802.11N STANDARD Junho Cho Department of Electrical Engineering, Seoul National University, Seoul, 151-744, Korea Naresh R. Shanbhag

More information

REVIEW ON CONSTRUCTION OF PARITY CHECK MATRIX FOR LDPC CODE

REVIEW ON CONSTRUCTION OF PARITY CHECK MATRIX FOR LDPC CODE REVIEW ON CONSTRUCTION OF PARITY CHECK MATRIX FOR LDPC CODE Seema S. Gumbade 1, Anirudhha S. Wagh 2, Dr.D.P.Rathod 3 1,2 M. Tech Scholar, Veermata Jijabai Technological Institute (VJTI), Electrical Engineering

More information

< Irregular Repeat-Accumulate LDPC Code Proposal Technology Overview

<  Irregular Repeat-Accumulate LDPC Code Proposal Technology Overview Project IEEE 802.20 Working Group on Mobile Broadband Wireless Access Title Irregular Repeat-Accumulate LDPC Code Proposal Technology Overview Date Submitted Source(s):

More information

Iterative Decoder Architectures

Iterative Decoder Architectures Iterative Decoder Architectures Submitted to IEEE Communications Magazine Engling Yeo, Borivoje Nikolic, and Venkat Anantharam Department of Electrical Engineering and Computer Sciences University of California,

More information

Piecewise Linear Approximation Based on Taylor Series of LDPC Codes Decoding Algorithm and Implemented in FPGA

Piecewise Linear Approximation Based on Taylor Series of LDPC Codes Decoding Algorithm and Implemented in FPGA Journal of Information Hiding and Multimedia Signal Processing c 2018 ISSN 2073-4212 Ubiquitous International Volume 9, Number 3, May 2018 Piecewise Linear Approximation Based on Taylor Series of LDPC

More information

98 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: REGULAR PAPERS, VOL. 58, NO. 1, JANUARY 2011

98 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: REGULAR PAPERS, VOL. 58, NO. 1, JANUARY 2011 98 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: REGULAR PAPERS, VOL. 58, NO. 1, JANUARY 2011 Memory System Optimization for FPGA- Based Implementation of Quasi-Cyclic LDPC Codes Decoders Xiaoheng Chen,

More information

Optimal Overlapped Message Passing Decoding of Quasi-Cyclic LDPC Codes

Optimal Overlapped Message Passing Decoding of Quasi-Cyclic LDPC Codes Optimal Overlapped Message Passing Decoding of Quasi-Cyclic LDPC Codes Yongmei Dai and Zhiyuan Yan Department of Electrical and Computer Engineering Lehigh University, PA 18015, USA E-mails: {yod304, yan}@lehigh.edu

More information

New LDPC code design scheme combining differential evolution and simplex algorithm Min Kyu Song

New LDPC code design scheme combining differential evolution and simplex algorithm Min Kyu Song New LDPC code design scheme combining differential evolution and simplex algorithm Min Kyu Song The Graduate School Yonsei University Department of Electrical and Electronic Engineering New LDPC code design

More information

System Verification of Hardware Optimization Based on Edge Detection

System Verification of Hardware Optimization Based on Edge Detection Circuits and Systems, 2013, 4, 293-298 http://dx.doi.org/10.4236/cs.2013.43040 Published Online July 2013 (http://www.scirp.org/journal/cs) System Verification of Hardware Optimization Based on Edge Detection

More information

LDPC Codes a brief Tutorial

LDPC Codes a brief Tutorial LDPC Codes a brief Tutorial Bernhard M.J. Leiner, Stud.ID.: 53418L bleiner@gmail.com April 8, 2005 1 Introduction Low-density parity-check (LDPC) codes are a class of linear block LDPC codes. The name

More information

A Fast Systematic Optimized Comparison Algorithm for CNU Design of LDPC Decoders

A Fast Systematic Optimized Comparison Algorithm for CNU Design of LDPC Decoders 2246 IEICE TRANS. FUNDAMENTALS, VOL.E94 A, NO.11 NOVEMBER 2011 PAPER Special Section on Smart Multimedia & Communication Systems A Fast Systematic Optimized Comparison Algorithm for CNU Design of LDPC

More information

A CORDIC Algorithm with Improved Rotation Strategy for Embedded Applications

A CORDIC Algorithm with Improved Rotation Strategy for Embedded Applications A CORDIC Algorithm with Improved Rotation Strategy for Embedded Applications Kui-Ting Chen Research Center of Information, Production and Systems, Waseda University, Fukuoka, Japan Email: nore@aoni.waseda.jp

More information

ISSN Vol.05,Issue.09, September-2017, Pages:

ISSN Vol.05,Issue.09, September-2017, Pages: WWW.IJITECH.ORG ISSN 2321-8665 Vol.05,Issue.09, September-2017, Pages:1693-1697 AJJAM PUSHPA 1, C. H. RAMA MOHAN 2 1 PG Scholar, Dept of ECE(DECS), Shirdi Sai Institute of Science and Technology, Anantapuramu,

More information

A Low Power Asynchronous FPGA with Autonomous Fine Grain Power Gating and LEDR Encoding

A Low Power Asynchronous FPGA with Autonomous Fine Grain Power Gating and LEDR Encoding A Low Power Asynchronous FPGA with Autonomous Fine Grain Power Gating and LEDR Encoding N.Rajagopala krishnan, k.sivasuparamanyan, G.Ramadoss Abstract Field Programmable Gate Arrays (FPGAs) are widely

More information

Chip Design for Turbo Encoder Module for In-Vehicle System

Chip Design for Turbo Encoder Module for In-Vehicle System Chip Design for Turbo Encoder Module for In-Vehicle System Majeed Nader Email: majeed@wayneedu Yunrui Li Email: yunruili@wayneedu John Liu Email: johnliu@wayneedu Abstract This paper studies design and

More information

Novel Low-Density Signature Structure for Synchronous DS-CDMA Systems

Novel Low-Density Signature Structure for Synchronous DS-CDMA Systems Novel Low-Density Signature Structure for Synchronous DS-CDMA Systems Reza Hoshyar Email: R.Hoshyar@surrey.ac.uk Ferry P. Wathan Email: F.Wathan@surrey.ac.uk Rahim Tafazolli Email: R.Tafazolli@surrey.ac.uk

More information

Performance study and synthesis of new Error Correcting Codes RS, BCH and LDPC Using the Bit Error Rate (BER) and Field-Programmable Gate Array (FPGA)

Performance study and synthesis of new Error Correcting Codes RS, BCH and LDPC Using the Bit Error Rate (BER) and Field-Programmable Gate Array (FPGA) IJCSNS International Journal of Computer Science and Network Security, VOL.16 No.5, May 2016 21 Performance study and synthesis of new Error Correcting Codes RS, BCH and LDPC Using the Bit Error Rate (BER)

More information

Error Control Coding for MLC Flash Memories

Error Control Coding for MLC Flash Memories Error Control Coding for MLC Flash Memories Ying Y. Tai, Ph.D. Cadence Design Systems, Inc. ytai@cadence.com August 19, 2010 Santa Clara, CA 1 Outline The Challenges on Error Control Coding (ECC) for MLC

More information

DESIGN OF FAULT SECURE ENCODER FOR MEMORY APPLICATIONS IN SOC TECHNOLOGY

DESIGN OF FAULT SECURE ENCODER FOR MEMORY APPLICATIONS IN SOC TECHNOLOGY DESIGN OF FAULT SECURE ENCODER FOR MEMORY APPLICATIONS IN SOC TECHNOLOGY K.Maheshwari M.Tech VLSI, Aurora scientific technological and research academy, Bandlaguda, Hyderabad. k.sandeep kumar Asst.prof,

More information

PRACTICAL communication systems often need to operate

PRACTICAL communication systems often need to operate IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 57, NO. 1, JANUARY 2009 75 Transactions Papers Multiple-Rate Low-Density Parity-Check Codes with Constant Blocklength Andres I. Vila Casado, Wen-Yen Weng, Stefano

More information

Multi-Rate Reconfigurable LDPC Decoder Architectures for QC-LDPC codes in High Throughput Applications

Multi-Rate Reconfigurable LDPC Decoder Architectures for QC-LDPC codes in High Throughput Applications Multi-Rate Reconfigurable LDPC Decoder Architectures for QC-LDPC codes in High Throughput Applications A thesis submitted in partial fulfillment of the requirements for the degree of Bachelor of Technology

More information

On the Performance Evaluation of Quasi-Cyclic LDPC Codes with Arbitrary Puncturing

On the Performance Evaluation of Quasi-Cyclic LDPC Codes with Arbitrary Puncturing On the Performance Evaluation of Quasi-Cyclic LDPC Codes with Arbitrary Puncturing Ying Xu, and Yueun Wei Department of Wireless Research Huawei Technologies Co., Ltd, Shanghai, 6, China Email: {eaglexu,

More information

An FPGA Implementation of (3, 6)-Regular Low-Density Parity-Check Code Decoder

An FPGA Implementation of (3, 6)-Regular Low-Density Parity-Check Code Decoder EURASIP Journal on Applied Signal Processing 2003:, 30 42 c 2003 Hindawi Publishing Corporation An FPGA Implementation of (3, )-Regular Low-Density Parity-Check Code Decoder Tong Zhang Department of Electrical,

More information

ISSCC 2003 / SESSION 8 / COMMUNICATIONS SIGNAL PROCESSING / PAPER 8.7

ISSCC 2003 / SESSION 8 / COMMUNICATIONS SIGNAL PROCESSING / PAPER 8.7 ISSCC 2003 / SESSION 8 / COMMUNICATIONS SIGNAL PROCESSING / PAPER 8.7 8.7 A Programmable Turbo Decoder for Multiple 3G Wireless Standards Myoung-Cheol Shin, In-Cheol Park KAIST, Daejeon, Republic of Korea

More information

Hardware Description of Multi-Directional Fast Sobel Edge Detection Processor by VHDL for Implementing on FPGA

Hardware Description of Multi-Directional Fast Sobel Edge Detection Processor by VHDL for Implementing on FPGA Hardware Description of Multi-Directional Fast Sobel Edge Detection Processor by VHDL for Implementing on FPGA Arash Nosrat Faculty of Engineering Shahid Chamran University Ahvaz, Iran Yousef S. Kavian

More information

Assertion Checker Synthesis for FPGA Emulation

Assertion Checker Synthesis for FPGA Emulation Assertion Checker Synthesis for FPGA Emulation Chengjie Zang, Qixin Wei and Shinji Kimura Graduate School of Information, Production and Systems, Waseda University, 2-7 Hibikino, Kitakyushu, 808-0135,

More information

Reduced Latency Majority Logic Decoding for Error Detection and Correction

Reduced Latency Majority Logic Decoding for Error Detection and Correction Reduced Latency Majority Logic Decoding for Error Detection and Correction D.K.Monisa 1, N.Sathiya 2 1 Department of Electronics and Communication Engineering, Mahendra Engineering College, Namakkal, Tamilnadu,

More information

A VLSI Architecture for H.264/AVC Variable Block Size Motion Estimation

A VLSI Architecture for H.264/AVC Variable Block Size Motion Estimation Journal of Automation and Control Engineering Vol. 3, No. 1, February 20 A VLSI Architecture for H.264/AVC Variable Block Size Motion Estimation Dam. Minh Tung and Tran. Le Thang Dong Center of Electrical

More information

Non-recursive complexity reduction encoding scheme for performance enhancement of polar codes

Non-recursive complexity reduction encoding scheme for performance enhancement of polar codes Non-recursive complexity reduction encoding scheme for performance enhancement of polar codes 1 Prakash K M, 2 Dr. G S Sunitha 1 Assistant Professor, Dept. of E&C, Bapuji Institute of Engineering and Technology,

More information

New Code Construction Method and High-Speed VLSI Codec Architecture for Repeat-Accumulate Codes

New Code Construction Method and High-Speed VLSI Codec Architecture for Repeat-Accumulate Codes New Code Construction Method and High-Speed VLSI Codec Architecture for Repeat-Accumulate Codes Kaibin Zhang*, Liuguo Yin**, Jianhua Lu* *Department of Electronic Engineering, Tsinghua University, Beijing,

More information

Improving Min-sum LDPC Decoding Throughput by Exploiting Intra-cell Bit Error Characteristic in MLC NAND Flash Memory

Improving Min-sum LDPC Decoding Throughput by Exploiting Intra-cell Bit Error Characteristic in MLC NAND Flash Memory Improving Min-sum LDPC Decoding Throughput by Exploiting Intra-cell Bit Error Characteristic in MLC NAND Flash Memory Wenzhe Zhao, Hongbin Sun, Minjie Lv, Guiqiang Dong, Nanning Zheng, and Tong Zhang Institute

More information

Implementation of A Optimized Systolic Array Architecture for FSBMA using FPGA for Real-time Applications

Implementation of A Optimized Systolic Array Architecture for FSBMA using FPGA for Real-time Applications 46 IJCSNS International Journal of Computer Science and Network Security, VOL.8 No.3, March 2008 Implementation of A Optimized Systolic Array Architecture for FSBMA using FPGA for Real-time Applications

More information