/$ IEEE

Size: px
Start display at page:

Download "/$ IEEE"

Transcription

1 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: REGULAR PAPERS, VOL. 56, NO. 5, MAY Low-Power Memory-Reduced Traceback MAP Decoding for Double-Binary Convolutional Turbo Decoder Cheng-Hung Lin, Student Member, IEEE, Chun-Yu Chen, Student Member, IEEE, An-Yeu (Andy) Wu, Member, IEEE, and Tsung-Han Tsai, Member, IEEE Abstract Iterative decoding of convolutional turbo code (CTC) has a large memory power consumption. To reduce the power consumption of the state metrics cache (SMC), low-power memory-reduced traceback maximum a posteriori algorithm (MAP) decoding is proposed. Instead of storing all state metrics, the traceback MAP decoding reduces the size of the SMC by accessing difference metrics. The proposed traceback computation requires no complicated reversion checker, path selection, and reversion flag cache. For double-binary (DB) MAP decoding, radix-2 2 and radix-4 traceback structures are introduced to provide a tradeoff between power consumption and operating frequency. These two traceback structures achieve an around 20% power reduction of the SMC, and around 7% power reduction of the DB MAP decoders. In addition, a high-throughput 12-mode WiMAX CTC decoder applying the proposed radix-2 2 traceback structure is implemented by using a m CMOS process in a core area of 7.16 mm 2. Based on postlayout simulation results, the proposed decoder achieves a maximum throughput rate of Mbps and an energy efficiency of 0.43 nj/bit per iteration. Index Terms Low-power design, maximum a posteriori (MAP) algorithm, turbo decoder. I. INTRODUCTION SINGLE-BINARY (SB) convolutional turbo code (CTC), proposed in 1993 [1], has been proven to provide a high coding gain near the Shannon capacity limit. The SB-CTC has been adopted in the forward-error-control (FEC) scheme for wideband code-division multiple access (WCDMA) [2] and cdma2000 [3]. In 1999, the nonbinary CTC [4] was introduced, which has a superior coding performance compared with the SB CTC [5]. In recent years, double-binary (DB) CTC has been adopted in the FEC coding of advanced wireless communication standards, such as digital video broadcasting-return channel over satellite and terrestrial (DVB-RCS Manuscript received September 28, 2008; revised December 23, First published March 10, 2009; current version published May 20, This work was supported in part by Industrial Technology Research Institute, Taiwan, under the WiMAX FEC Project 8362B51310, and by the National Science Council, Taiwan, under Grant NSC E This paper was presented in part at the ISCAS, Kobe, Japan, May 2005, VLSI-DAT, Hsinchu, Taiwan, April 2008, and ISCAS, Seattle, WA, May This paper was recommended by Guest Editor H. Schmid. C.-H. Lin, C.-Y. Chen, and A.-Y. Wu are with the Graduate Institute of Electronics Engineering and Department of Electrical Engineering, National Taiwan University, Taipei 10617, Taiwan ( andywu@cc.ee.ntu.edu.tw). T.-H. Tasi is with the Department of Electrical Engineering, National Central University, Jhongli City 32001, Taiwan. Color version of one or more of the figures in this paper are available online at Digital Object Identifier /TCSI Fig. 1. Decoding path of the state metrics access: the (a) conventional path and (b) traceback path. and DVB-RCT) [6], and worldwide interoperability for microwave access (WiMAX) [7]. Powerful soft-input soft-output (SISO) algorithms for CTC decoding are the maximum a posteriori algorithm (MAP) [8], and its derivatives, such as the log-map (L-MAP) [9], Max-log-MAP (ML-MAP) [9], combinations of the L-MAP and ML-MAP [11], and enhanced Max-log-MAP (EML-MAP) [12]. In this paper, we use the term MAP for an abbreviation of L-MAP and (E)ML-MAP. The memory organization of MAP decoding is an important issue in facilitating the hardware implementation of CTC decoders [13]. In particular, the power reduction of state metrics cache (SMC) is critical for MAP decoders [14]. With regard to SB CTC decoding, some researches have been proposed to reduce the power consumption of the SMC [14] [21]. The reverse computations [20], [21] significantly reduce SMC power consumption with reversion checkers and reversion flag caches. However, the reversion checker and reversion flag cache prolong the critical path or decoding cycles. In addition, the computational complexities of the reverse computations are increased dramatically when the reverse computations are extended from the SB to the DB MAP. Although some researchers [22], [23] have proposed the VLSI designs of DB CTC decoders, there are presently very few papers discussing an efficient SMC power reduction of DB CTC decoders. In this paper, the traceback MAP decoding is proposed to trace the state metrics back by accessing the difference metrics. Fig. 1 illustrates the decoding paths of the conventional computation and proposed traceback computation. In the conventional path, the state metrics computed by the natural recursion processor (NRP) in the natural order are stored in the SMC /$ IEEE

2 1006 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: REGULAR PAPERS, VOL. 56, NO. 5, MAY 2009 Powerful SISO algorithms for DB CTC decoding are the DB MAP. First, the arithmetic operations of the DB EML-MAP [24] are described as follows: (1a) Fig. 2. DB RSC encoder of WiMAX CTC scheme. (1b) Then, the state metrics are read out to compute the a posteriori log-likelihood ratio (LLR) by the log-a posteriori module (LAPO) in the reverse order. In the traceback path, the difference metrics computed by the NRP are stored in the SMC. Then, the state metrics are traced back with the stored difference metrics by the traceback recursion processor (TRP) in the reverse order. The power consumption of the SMC can be reduced by accessing the difference metrics because the number of stored metrics is lower. The computational power of TRP is small, and the overall power consumption of the traceback path is reduced. For the traceback DB MAP decoding, two traceback structures are demonstrated. The radix-2 2 traceback structure has low hardware costs, and the radix-4 traceback structure has short path delays. Experimental results show that these two traceback structures achieve an around 20% power reduction of the SMC, and around 7% power reduction of the DB MAP decoder. In addition, an application of the proposed traceback DB MAP decoding for the DB CTC is to deploy the radix-2 2 traceback structure to a high-throughput WiMAX CTC decoder. We emphasize the 12 general transmissions of the WiMAX CTC scheme without the optional hybrid automatic repeat request (HARQ) transmissions. The 12-mode CTC decoder was implemented by using a TSMC 0.13 m CMOS process. The prototyping chip in a core area of 7.16 mm achieves Mbps, with an energy efficiency of 0.43 nj/bit per iteration. The remainder of this paper is as follows. Section II provides reviews of the DB MAP decoding. Section III introduces some background information regarding SMC power reductions. Section IV describes the proposed memory-reduced traceback MAP decoding. Section V demonstrates the radix-4 traceback DB MAP decoding and the corresponding computational units. Section VI demonstrates the comparisons and experimental results of the proposed radix-4 DB traceback structures. The prototyping chip of the 12-mode WiMAX CTC decoder is described in Section VII. Finally, Section VIII concludes this paper. where branch metrics; forward recursion state metrics; backward recursion state metrics; a priori LLR; a posteriori LLR; intrinsic values; extrinsic values; two binary bits at time ; transmitted codewords BPSK; for (1c) (1d) (1e) (1f) II. REVIEWS OF DB MAP DECODING A CTC encoder is composed of a CTC interleaver and two parallel or serial concatenated recursive systematic convolutional (RSC) encoders. Fig. 2 shows the DB RSC encoder of a WiMAX CTC scheme. The constraint length of this DB RSC encoder is 4. The transmitted codewords are denoted by and. soft received codewords; number of parity bit; ; state index; scaling factor.

3 LIN et al.: LOW-POWER MEMORY-REDUCED TRACEBACK MAP DECODING FOR DB CONVOLUTIONAL TURBO DECODER 1007 Decisions are based on (2) Note that the values of, and are always equal to zero. The differences between the DB L-MAP and DB (E)ML-MAP are that a priori LLR multiplies the channel value, and MAX operations are replaced by in the DB L-MAP. The [9] is defined as Fig. 3. Conventional decoding procedure. (3) A lookup table (LUT) can implement the corrective term. The difference between the DB EML-MAP and DB ML-MAP is that the extrinsic values in the DB ML-MAP do not multiply a scaling factor. Despite the SB and DB MAP decoding, the L-MAP has the significant coding gain and the EML-MAP achieves better coding gain than the ML-MAP. Without specifying which algorithm is used, we use the term MAP for an abbreviation of the L-MAP and (E)ML-MAP in the following sections. The details of SB MAP algorithms can be referred to in [10] for the SB CTC. III. POWER REDUCTIONS OF STATE METRICS CACHE Before we introduce the proposed memory-reduced traceback MAP decoding, the background information of SMC power reduction for the MAP decoding is demonstrated in this section. A. Conventional (Windowing) Decoding Procedure Despite the SB or DB MAP decoding, forward (1b) and backward (1c) recursion states metrics are computed in chronologically reverse order, where is the constraint length of a RSC encoder. Both forward and backward recursion state metrics are required for the computation of a posteriori LLR (1d). Thus, a large SMC stores the forward (or backward) recursion state metrics to compute the a posteriori LLR until the backward (or forward) recursion state metrics are generated. The conventional decoding procedure employing the windowing technique [25] was proposed to reduce the depth of SMC from a block size to a window size, where is. (The well-known sliding window (SW) and parallel window (PW) MAP architectures can be referred to in [13] and [26], respectively.) Fig. 3 shows the conventional decoding procedure. In the natural order, the NRP, composed of add-compare-select units (ACSUs), is used to recursively compute the forward (or backward) recursion state metrics in cycles. The obtained state metrics in each cycle are immediately stored into the SMC and recursively fed back to the NRP to calculate the next state metrics. To compute the a posteriori LLR, the state metrics are read out from the SMC. The SMC of the conventional decoding procedure still accounts for more than 50% of the entire power consumption [14]. Fig. 4. Reverse decoding procedure. B. Reverse Decoding Procedure To further reduce the access power of SMC, the reverse computations [20], [21] modified the conventional decoding procedure for the radix-2 SB MAP decoding. The decoding procedure of the reverse computations is illustrated in Fig. 4. Compared with the conventional decoding procedure shown in Fig. 3, the reverse decoding procedure adds a reversion checker, a reversion flag cache, and a reverse recursion processor (RRP). The reversion checker decides whether the state metrics computed by the NRP are reversible or not. If a state metric is not reversible, this state metric is stored in one subbank of the SMC. Meanwhile, the reversion flag cache stores the path information of this state metric. To compute the a posteriori LLR, the irreversible state metric is read out from the SMC according to the path information stored in the reversion flag cache. Otherwise, the reversible state metric is computed by the RRP, composed of reverse units. The recovered state metrics are recursively fed back to the RRP to calculate the next reversible state metrics. The reverse computations reduce the power consumption of the radix-2 SB MAP decoder because the subbanks of SMC are dynamically accessed, and the computational power of the reversion checker and RRP is small. The reverse computations for the radix-2 SB L-MAP decoding achieve an around 30% power reduction with an over 20% logic overhead. However, the reversion checker and reversion flag cache prolong the critical path or decoding cycles. In addition, dividing the SMC into subbanks increases the silicon area of the SMC and consumes more overall power of the SMC if all subbanks are accessed. For the radix-2 SB MAP decoding, a -state trellis structure can be decomposed into radix-2 butterfly structures.

4 1008 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: REGULAR PAPERS, VOL. 56, NO. 5, MAY 2009 Fig. 6. Traceback decoding procedure. Fig. 5. Natural, reverse, and traceback recursions in the (a) (c) radix-2 butterfly structures and (d) (f) radix-4 butterfly structures. The symbol index is denoted by k. For the conversional decoding procedure, the computational units in (a) are the ACSUs. For the reverse decoding procedure, the computational units in (a) and (d) are the ACSUs and reversion checkers, and the computational units in (b) and (e) are the reverse units. For the traceback decoding procedure, the computational units in (a) and (d) are ACSUs only, and the computational units in (c) and (f) are the TBUs. Fig. 5(a) and (b) illustrates an example of the SB reverse computation in a radix-2 butterfly structure. In Fig. 5(a), the reversion checker checks all fixed paths and determines the reversible paths. In Fig. 5(b), the dashed lines denote the reversible (selective) paths determined by the reversion flag cache. One radix-2 butterfly structure has four paths and four cases to be checked. The essential DB MAP decoding is radix-4 trellis decoding. For the radix-4 DB MAP decoding, a -state trellis structure can be decomposed into radix-4 butterfly structures. Fig. 5(d) and (e) illustrates an example of the DB reverse computation in a radix-4 butterfly structure. One radix-4 butterfly structure has 16 paths and 256 cases to be checked in the reverse computations. Hence, the reverse computations for the radix-4 trellis require more complicated reversion checkers and larger reversion flag caches. The computational complexities are dramatically increased when the reverse computations are extended from the SB to the DB MAP. IV. PROPOSED MEMORY-REDUCED TRACEBACK MAP DECODING Here, the traceback MAP decoding is proposed to reduce the power consumption of the SMC. The traceback MAP decoding has five major stages/phases, given here. 1) The branch metrics are computed with the received codewords and the a priori LLR in the natural order. 2) The forward (or backward) state metrics are recursively computed by the NRP with the branch metrics in the natural order, and the difference metrics are stored into the SMC. Note that the difference metric is the difference between two state metrics and has the same bit-length of the state metric. 3) The forward (or backward) state metrics are recursively traced back by the TRP with the stored difference metrics in the reverse order. Concurrently, the backward (or forward) state metrics are recursively computed with the branch metrics. 4) The a posteriori LLR is computed with regenerated forward (or backward) state metrics, the backward (or forward) state metrics, and branch metrics by the LAPO in the reverse order. 5) The extrinsic values and hard bits are computed in the reverse order with the a posteriori LLR. In contrast to the conventional MAP decoding, the traceback MAP decoding reduces the number of stored metrics by accessing the difference metrics. Hence, the SMC power consumption is reduced. In addition, the computational power overhead of tracing the state metrics back is much smaller than the SMC power consumption. Thus, the overall power consumption of the MAP decoding is reduced. The work in [18] has introduced that not absolute values but differences between the state metrics are important for the a posteriori LLR. In the proposed traceback computation, the differences between state metrics are kept by storing the difference metrics. Hence, the traceback MAP decoding performs without losing correction ability. In addition, the proposed traceback computation works in the L-MAP and (E)ML-MAP. Fig. 6 shows the proposed traceback decoding procedure, which is a modification of the conventional decoding procedure shown in Fig. 3. Compared with the conventional decoding procedure, only an additional TRP is added. Fig. 5(a) and (c) illustrates an example of the radix-2 SB traceback computation in a radix-2 butterfly structure. In Fig. 5(a), only one difference metric (black arc) computed by a radix-2 ACSU is stored in the SMC. In Fig. 5(c), the two state metrics (white nodes) are traced back in the traceback recursion by one traceback unit (TBU) with the stored difference metric (black arc). The recovered two state metrics are recursively fed back to the TRP to calculate the next two state metrics. The stored metrics are reduced from two state metrics to one difference metric. The bit-lengths of the state metric and difference metric are the same. Thus, the radix-2 traceback SB MAP decoding requires half the SMC size of the conventional decoding procedure. Compared with the reverse computations shown in Figs. 4 and

5 LIN et al.: LOW-POWER MEMORY-REDUCED TRACEBACK MAP DECODING FOR DB CONVOLUTIONAL TURBO DECODER (b), the traceback computation has fixed paths and requires no complicated checker and path selection. The design details of the radix-2 traceback SB MAP decoding, including the radix-2 ACSU, radix-2 TBU, and overall architecture of the WCDMA CTC decoder, can be referred to in [19]. Based on the experimental results using a m CMOS process, the radix-2 SB traceback MAP decoding achieves a 21.4% power reduction of the SW L-MAP decoder with a 3.2% logic overhead. V. TRACEBACK COMPUTATION AND COMPUTATIONAL UNITS FOR THE RADIX-4 DB MAP DECODING In this paper, we emphasize the design of the radix-4 traceback DB MAP decoding and the corresponding computational units. Here, we demonstrate how the radix-4 traceback DB MAP decoding works with accessing the difference metrics. Two widely used ACSUs are then described, and the corresponding TBUs are proposed for the radix-4 traceback DB MAP decoding. A. Traceback Computation for the DB MAP Decoding The radix-4 traceback DB MAP decoding still employs the aforementioned five major stages/phases in Section IV and the traceback decoding procedure shown in Fig. 6. For the DB MAP decoding, the decoding trellis changes to the radix-4 trellis. Fig. 5(d) and (f) illustrates an example of the traceback computation in a radix-4 butterfly structure. In Fig. 5(d), the natural recursion in a radix-4 butterfly structure shows that three difference metrics (black arcs) are generated by a 4-input 1-output ACSU. These three difference metrics are stored into the SMC for the traceback computation. In Fig. 5(f), one 1-input 4-output TBU then traces the four state metrics (white nodes) back, with the three stored difference metrics (black arcs) in the traceback recursion. Taking the DB RCS encoder shown in Fig. 2, for instance, Fig. 7(a) illustrates the corresponding eight-state radix-4 trellis diagram of the natural recursion for the DB MAP decoding, where denotes the time index in the natural order, and denotes the symbol index. Fig. 7(b) illustrates the radix-4 trellis of the traceback recursion for the DB MAP decoding, where denotes the time index in the reverse order. In the conventional decoding procedure (see Fig. 3), the eight states metrics are recursively computed by the NRP. The SMC stores these eight state metrics to compute the a posteriori LLR. In the traceback decoding procedure (see Fig. 6), the six difference metrics [black arcs in Fig. 7(a)] are computed by the NRP and stored into the SMC. The TRP, composed of 2 TBUs [black nodes in Fig. 7(b)], traces the eight state metrics back with the stored six difference metrics [black arcs in Fig. 7(b)]. Compared with the conventional decoding procedure, the stored metrics are reduced from eight state metrics to six difference metrics. Thus, the traceback computation reduces the SMC power consumption for the DB MAP decoding. B. Radix-2 2 Traceback Pair For the radix-4 MAP decoding, two types of ACSUs, the radix-2 2 ACSU and the radix-4 ACSU, are widely used to constitute the NRP. Fig. 8(a) shows an example of the Fig state radix-4 trellis diagrams of the (a) natural recursion and (b) traceback recursion. The symbol index is denoted by k. The time index in the natural recursion is denoted by t, and the time index in the traceback recursion is denoted by t. The computational units in (a) are the ACSUs, and the computational units in (b) are the TBUs. radix-2 2 ACSU. The radix-2 2 ACSU consists of four front adders, three radix-2 compare-select units (CSUs), and an LUT. In the proposed traceback computation, three difference metrics (Diff 0, Diff 1, and Diff 2) generated by three radix-2 CSUs are stored in the SMC. Fig. 8(b) shows the corresponding TBU of the radix-2 2 ACSU. Fig. 8(c) illustrates a computational example of the radix-2 2 ACSU and TBU. The radix-2 2 ACSU obtains the maximal state metric B based on Diff 0, Diff 1, and Diff 2. Subsequently, the radix-2 2 TBU regenerates A, B, C, and D based on Diff 0, Diff 1, and Diff 2, since B can be initially achieved. Hence, the four state metrics can be recomputed by the radix-2 2 TBU with the difference metrics stored in the SMC. In the radix-2 2 TBU, the sign bit of the difference metric decides the paths of the two multiplexers and the operation of the binary adder/subtracter. Taking the trellis diagram shown in Fig. 7(b), for instance, six difference metrics are stored in the SMC because two current states and two TBUs can trace eight next states back. The storage of the SMC is reduced from eight state metrics to six difference metrics at each stage. Note that the values A, B, C, and D in the output end of the radix-2 2 TBU can be the input values of the LAPO to compute. This approach reduces eight adders ( in (1d)) in the LAPO. C. Radix-4 Traceback Pair Fig. 9(a) shows an example of the radix-4 ACSU. The radix-4 ACSU has a comparator to select the maximal state metric quickly. Unlike the radix-2 2 ACSU, three difference metrics related to A (Diff 0, Diff 1, and Diff 2) and two selective bits (S0 and S1) of the radix-4 ACSU are stored in the SMC. The or

6 1010 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: REGULAR PAPERS, VOL. 56, NO. 5, MAY 2009 Fig. 8. Radix-222 traceback pair: (a) the ACSU, (b) the TBU, and (c) a computational example. corresponding TBU of the radix-4 ACSU is shown in Fig. 9(b). Fig. 9(c) illustrates a computational example of the radix-4 ACSU and TBU. The radix-4 ACSU obtains the maximal state metric B based on two selective bits generated by the six parallel subtractions in the comparator. The three difference metrics related to A are stored in the SMC. Since B can be initially achieved, A can be regenerated by the radix-4 TBU based on the stored S0 and S1. Subsequently, the radix-4 TBU regenerates A, B, C, and D based on Diff 0, Diff 1, and Diff 2. Taking the trellis diagram shown in Fig. 7(b), for instance, six difference metrics and four extra selective bits are stored in the SMC because two current states and two TBUs can trace eight next states back. The storage of the SMC is reduced from eight state metrics to six difference metrics and four extra bits at each stage. The values A, B, C, and D in the output end of the radix-4 TBU can be the input values of the LAPO to compute in. This approach reduces eight adders [ or in (1d)] in the LAPO. These two traceback pairs perform the (E)ML-MAP if the LUT of the corrective term in (3) is not implemented. Otherwise, these two traceback pairs perform the L-MAP. The proposed traceback computation and traceback pairs of the radix-4 DB MAP decoding can be simply applied to the radix-4 SB MAP decoding with a simple trellis modification.

7 LIN et al.: LOW-POWER MEMORY-REDUCED TRACEBACK MAP DECODING FOR DB CONVOLUTIONAL TURBO DECODER 1011 Fig. 9. Radix-4 traceback pair: (a) the ACSU, (b) the TBU, and (c) a computational example. VI. COMPARISONS AND EXPERIMENTAL RESULTS Here, accurate silicon area and power evaluations are obtained by using Verilog HDL codes synthesized with the standard cell library of TSMC m CMOS process. The decoding procedures of the conventional and proposed traceback structures are only considered and evaluated in the radix-4 DB EML-MAP. A. Area and Power Evaluations and Comparisons Table I lists a summary of the conventional and traceback structures for the radix-4 DB EML-MAP decoding, where denotes the bit-length of a state metric or a difference metric. The constraint length cannot be less than 3 for exploiting the radix-4 trellis. Taking and, for instance, eight radix-2 2 (or radix-4) ACSUs generate eight state metrics at each time stage. Hence, the SMC of the conventional structures has 80 bit-lengths. However, the traceback structure composed of eight radix-2 2 ACSUs and two radix-2 2 TBUs reduces the SMC bit-lengths from 80 to 60. The traceback structure composed of eight radix-4 ACSUs and two radix-4 TBUs reduces the SMC bit-lengths from 80 to 64. For the comparison of the different structures, the practical evaluations are under the WiMAX CTC decoding. The -type TABLE I SUMMARY OF THE CONVENTIONAL AND TRACEBACK STRUCTURES parallel-window (PW) MAP decoding in [26] is adopted to achieve a high throughput rate because the decoding latency is 1. Table II lists evaluation parameters under the specification of WiMAX CTC. Ten parallel windows are used to decode an CTC block, and we achieves. The size of information bits is a double of a block size because of the DB CTC. The bit-length of a state metric ( or )or a difference metric is 10. Table III shows the evaluated results of computational units of the different structures. The silicon area and path delay are reported by Synopsis Design Vision. The power consumption on 2-dB SNR noisy data is estimated by Synopsis PrimePower at 100 MHz operating frequency. For the eight-state radix-4 trellis, eight ACSUs (as an NRP) are grouped as two examples of the conventional structures. Eight ACSUs (as an NRP) and two

8 1012 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: REGULAR PAPERS, VOL. 56, NO. 5, MAY 2009 TABLE II EVALUATION SPECIFICATION OF WIMAX CTC DECODING TABLE IV SMC COMPARISON OF THE DIFFERENT STRUCTURES TABLE III COMPUTATIONAL UNIT COMPARISON OF THE DIFFERENT STRUCTURES TBUs (as a TRP) are grouped as two examples of the traceback structures. The radix-2 2 ACSU has the longest path delay, because the values of the latter radix-2 CSU in the radix-2 2 ACSU are not correct until the sign (most significant) bits of the two former radix-2 CSU are stable. The radix-2 2 TBU does not suffer from this problem because the sign bits of the difference metrics are initially known. In addition, the radix-4 ACSU selects the maximal state metric quickly because the comparator generates selective bits based on the six parallel subtractions at the first level. Compared with the radix-2 2 traceback structure, the radix-4 traceback structure has shorter and more balanced path delays. However, the radix-2 2 traceback structure has less area costs and power consumptions. Because the ACSU dominate the operating frequency of a MAP decoder, the radix-4 traceback structure can derive a higher operating frequency than the radix-2 2 traceback structure. Operating at a higher operating frequency consumes more power for the radix-4 traceback structure. Thus, it is a tradeoff between the power consumption and maximum operating frequency for the radix-4 traceback DB MAP decoding. In Table IV, the SMCs composed of single-port (SP) RAMs are generated by using the TSMC m process for the comparison of the different structures. Because of the -type PW MAP decoding, the SMC can be implemented by a SP RAM with depth. Note that eight state metrics have to be stored in the SMC, despite the conventional structure composed of the radix-2 2 or radix-4 ACSUs. For the radix-2 2 traceback structure, only six difference metrics are stored in the SMC. For the radix-4 traceback structure, six difference metrics and four extra selective bits are stored in the SMC. Fig. 10 shows comparisons of silicon area and power consumption of the SMC and TRP. The radix-2 2 traceback structure achieves a 24.9% area reduction and 25% power reduction of the SMC. On the other hand, the radix-4 traceback structure achieves a 19.6% area reduction of the SMC and 19.5% power reduction of the SMC. The power overhead of the TRPs is less than 4%, and the area overhead of the TRPs is less than 19%. Thus, the radix-4 and radix-2 2 traceback structure achieve the area and power reduction of the SMC of the radix-4 and radix-2 2 conventional structure. B. Area and Power Evaluations of PW EML-MAP Decoders Fig. 11 illustrates a processing element (PE) employing the traceback -type PW EML-MAP decoding. The architecture can be divided into the forward recursion path (upper path) and backward recursion path (lower path). One branch metrics unit (BMU) for (1a), one NRP for (1b) or (1c), one SMC with depth, one proposed TRP, one LAPO for (1d), one log-extrinsic module (LEX) for (1e), and one hard decision module (HD) for (2) are used for each path. When the TRPs are removed, the PE performs the conventional -type PW EML-MAP decoding. The branch metrics are directly calculated without buffering in a branch metrics cache because of the -type PW decoding. To meet the parameters in Table II, ten PEs are constructed as a DB MAP kernel. Fig. 12 shows comparisons of silicon area and power consumption of the ten PEs composed of the conventional and traceback structures. The power consumption on 2-dB SNR noisy data is estimated at 100 MHz operating frequency. Compared with the radix-2 2 conventional structure, the radix-2 2 traceback structure achieves a 9.7% power reduction and a 2.1% area reduction. The logic overhead of radix-2 2 TRP is 9.8%. Similarly, the radix-4 traceback structure achieves a 7.9% power reduction and a 2.1% area reduction compared with the radix-4 conventional structure. The logic overhead of radix-4 TRP is 6.6%. The silicon areas of the traceback structures are not significantly reduced because the radix-4 DB MAP decoding complicates the hardware complexities of all computational units. Compared with the conventional structures, however, the power consumptions of the traceback structures are noticeably lower. The logic overhead of the radix-4 DB traceback computation is less than 10%. VII. APPLICATION TO 12-MODE WIMAX CTC DECODER Here, the 12-mode WiMAX CTC decoder for is presented. The design parameters of the CTC decoding are

9 LIN et al.: LOW-POWER MEMORY-REDUCED TRACEBACK MAP DECODING FOR DB CONVOLUTIONAL TURBO DECODER 1013 Fig. 10. (a) Power consumption and (b) silicon area comparisons of the SMC and TRP using the conventional and traceback structures. Fig. 11. Architecture of the processing element (PE) employing traceback PW EML-MAP decoding. Fig. 12. (a) Power consumption and (b) silicon area comparisons of 10 PW EML-MAP PEs composed of the conventional and traceback structures. shown in Table II. A prototyping CTC decoder has been implemented by using a TSMC m CMOS process. A. Overall Architecture A VLSI architecture design of the 12-mode WiMAX CTC decoder has been presented in [27] to achieve the features of high throughput and efficient reconfigurability. The overall architecture of the implemented 12-mode WiMAX CTC decoder is shown in Fig. 13. The decoder is composed of three input buffers, an internal buffer, an output buffer, a global controller, and a DB MAP kernel. The DB MAP kernel is composed of ten PW EML-MAP PEs and the conjoint WiMAX CTC interleavers (CIs). Because the radix-2 2 traceback structure has less power consumption (see Fig. 12), the radix-2 2 traceback structure is adopted in the PEs. Each buffer is divided into ten banks to be accessed simultaneously by the PE. To satisfy the

10 1014 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: REGULAR PAPERS, VOL. 56, NO. 5, MAY 2009 TABLE V CHIP SUMMARY OF THE 12-MODE WIMAX CTC DECODER Fig. 13. Block diagram of the 12-mode WiMAX CTC decoder. TABLE VI COMPARISONS OF DIFFERENT CTC DECODER CHIPS Fig. 14. Chip layout of the proposed 12-mode WiMAX CTC decoder. (+ represents the SMC composed of single-port RAM.) power-efficient 12-mode CTC decoding, the global controller activates the appropriate PEs, CIs, and buffer banks, depending on the block size. B. Prototyping Chip Implementation The chip layout and summary of the prototyping chip implemented by using the TSMC m CMOS process within a core size of 7.16 mm are shown in Fig. 14 and Table V, respectively. The prototyping chip contains 27.4 Kb RAM and integrates 635 Kgates of logic. Each PE accounts for 52.3 Kgates of logic, and each CI accounts for 0.51 Kgates of logic. Based on worst-case static timing analysis and post-layout simulation results, the decoder achieves a maximum operating frequency of 100 MHz. The post-layout, gate-level, and typical-case (@1.2 V, 25 C) power consumption of this CTC decoder is mw, which is estimated at 100 MHz operating frequency on 2-dB SNR noisy data. For a single WiMAX FEC block within the 12 general transmissions, this CTC decoder can achieve a higher throughput rate of Mbps at 100 MHz than that of Mbps at 214 MHz of the Xilinx CTC decoder implemented in a FPGA [28]. C. Comparisons of CTC Decoders In Table VI, the proposed 12-mode WiMAX CTC decoder is compared with other chip designs that support multiple block sizes. The work in [29] is designed for the HSDPA standard with one radix-4 SB SW L-MAP PE and block sizes from 40 to The work in [30] is designed with seven radix-2 SB SW L-MAP PEs and block sizes from It is hard to compare these chips since the coding parameters are different from each other. However, we use normalized energy efficiency (NEE) as one performance index. The NEE indicates how much energy a decoder chip consumes to process a hard bit at an iteration. In addition, we use normalized area efficiency (NAE) as the other performance index. The NAE indicates how many hard bits per one mm for a single CTC block a decoder chip decodes. When we normalize the works from m to m technology, the normalized energy factor is V V m m. Similarly, the normalized area factor is m m. Our proposed decoder achieves a lower NEE (0.43 nj/bit/iteration) and higher NAE (0.68 bits/mm ). Note that the NEE and NAE are not used to justify which design is superior to the others, but to provide an evaluative method for reference. (4) (5)

11 LIN et al.: LOW-POWER MEMORY-REDUCED TRACEBACK MAP DECODING FOR DB CONVOLUTIONAL TURBO DECODER 1015 VIII. CONCLUSION To reduce the power consumption of SMC, the traceback MAP decoding has been presented with a low logic overhead. The proposed traceback MAP decoding performs in the L-MAP and (E)ML-MAP without losing correction ability. Two pairs of the traceback computation are introduced for the radix-4 DB MAP decoding. The traceback radix-2 2 pair has low hardware costs, and the radix-4 traceback pair has short path delays. In addition, the proposed traceback pairs of the radix-4 DB MAP decoding can be simply applied to the radix-4 SB MAP decoding. The experimental result shows that the two traceback structures achieve an around 20% power reduction of the SMC, and around 7% power reduction of ten DB MAP decoders for the WiMAX CTC. Compared with other designs, the 12-mode WiMAX CTC decoder employing the radix-2 2 traceback structure achieves a low energy efficiency of 0.43 nj/bit per iteration. ACKNOWLEDGMENT The authors would like to thank National Chip Implementation Center (CIC) for technical support. REFERENCES [1] C. Berrou, A. Glavieux, and P. Thitimajshima, Near Shannon limit error-correcting coding and decoding: Turbo codes, in Proc. IEEE Int. Conf. Commun. (ICC), 1993, pp [2] 3rd Generation Partnership Project (3GPP). [Online]. Available: [3] 3rd Generation Partnership Project 2 (3GPP2). [Online]. Available: [4] C. Berrou and M. Jezequel, Non-binary convolutional codes for turbo coding, Electron. Lett., vol. 35, no. 1, pp , Jan [5] C. Berrou et al., The advantages of non-binary turbo codes, in Proc. IEEE Inf. Theory Workshop (ITW), 2001, pp [6] Digital Video Broadcasting (DVB). [Online]. Available: dvb.org/ [7] Worldwide Interoperability for Microwave Access (WiMAX). [Online]. Available: [8] L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv, Optimal decoding of linear codes for minimizing symbol error rate, IEEE Trans. Inf. Theory, vol. IT-20, no. 2, pp , Mar [9] P. Robertson, E. Villebrun, and P. Hoeher, A comparison of optimal and sub-optimal MAP decoding algorithms operating in the log domain, in Proc. IEEE Int. Conf. Commun. (ICC), 1995, pp [10] J. P. Woodard and L. Hanzo, Comparative study of turbo decoding techniques: An overview, IEEE Trans. Veh. Technol., vol. 49, no. 6, pp , Nov [11] S. Papaharalabos, P. Sweeney, and B. G. Evans, SISO algorithms based on combined max/max* operations for turbo decoding, Electron. Lett., vol. 41, no. 3, pp , Feb [12] J. Vogt and A. Finger, Improving the max-log-map turbo decoder, Electron. Lett., vol. 36, no. 23, pp , Nov [13] G. Masera et al., VLSI architecture for turbo codes, IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 7, no. 3, pp , Aug [14] C. Schurgers, F. Catthoor, and M. Engels, Memory optimization of MAP turbo decoder algorithms, IEEE Trans. Very Large Scale Integr. (VLSI) Sys., vol. 9, no. 3, pp , Sep [15] H. Liu et al., Energy efficient turbo decoder by reducing the state metric quantization, in Proc. IEEE Workshop Signal Processing Syst. (SiPS), 2007, pp [16] Z. Wang, Z. Chi, and K. K. Parhi, Area-efficient high-speed decoding schemes for turbo decoders, IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 10, no. 6, pp , Aug [17] M. M. Mansour and N. R. Shanbhag, VLSI architectures for SISO-APP decoders, IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 11, no. 4, pp , Aug [18] E. Boutillon, W. J. Gross, and P. G. Gulak, VLSI architectures for the MAP algorithm, IEEE Trans. Commun., vol. 51, pp , Feb [19] T.-H. Tsai, C.-H. Lin, and A.-Y. Wu, A memory-reduced log-map kernel for turbo decoder, in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), 2005, pp [20] D.-S. Lee and I.-C. Park, Low-power log-map decoding based on reduced metric memory access, IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 53, no. 6, pp , Jun [21] H.-M. Choi, J.-H. Kim, and I.-C. Park, Low-power hybrid turbo decoding based on reverse calculation, IEICE Trans. Fund. Electron. Comm. Comput. Sci., vol. E89-A, no. 3, pp , Mar [22] J.-H. Kim and I.-C. Park, Double-binary circular turbo decoding based on border metric encoding, IEEE Trans. Circuits. Syst. II, Exp. Briefs., vol. 55, no. 1, pp , Jan [23] J.-H. Kim and I.-C. Park, Bit-level extrinsic information exchange method for double-binary turbo codes, IEEE Trans. Circuits. Syst. II, Exp. Briefs, vol. 56, no. 1, pp , Jan [24] Y. Ould-Cheikh-Mouhamedou, P. Guinand, and P. Kabal, Enhanced max-log-app and enhanced log-app decoding for DVB-RCS, in Proc. 3rd Int. Symp. Turbo Codes, Sep. 2003, pp [25] A. J. Viterbi, An intuitive justification and simplified implementation of the MAP decoder for convolutional codes, IEEE J. Sel. Areas Commun., vol. 16, no. 2, pp , Feb [26] A. Worm, H. Lamm, and N. Wehn, A high-speed MAP architectures with optimized memory and power consumption, in Proc. IEEE Workshop Signal Processing Syst. (SiPS), 2000, pp [27] C.-H. Lin, C.-Y. Chen, and A.-Y. Wu, High-throughput 12-mode CTC decoder for WiMAX standard, in Proc. IEEE Int. Symp. VLSI Design, Automation, and Test (VLSI-DAT), 2008, pp [28] IEEE e CTC Decoder Core, DS137 (V2.2), XiLinx IP Center, Jun. 5, [29] M. Bickerstaff et al., A 24Mb/s radix-4 logmap turbo decoder for 3GPP-HSDPA mobile wireless, in Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, 2003, pp [30] B. Bougard et al., A scalable 8.7nj/bit 75.6Mb/s parallel concatenated convolutional (turbo-)codec, in Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, 2003, pp Cheng-Hung Lin (S 05) received the B.S. degree in electronic engineering from Fu Jen Catholic University, Taipei, Taiwan, in 2002, and the M.S. degree in electrical engineering from National Central University, Taoyuan, Taiwan, in He is currently working toward the Ph.D. degree in electronic engineering at National Taiwan University, Taipei. His research interests include the design of very large-scale integration architectures and circuits for digital signal processing and communication systems. He is currently working on the hardware design for coding systems. Chun-Yu Chen (S 08) received the B.S. degree from National Chiao Tung University, Hsinchu, Taiwan, in 2007 and the M.S. degree from National Taiwan University, Taipei, in 2009, both in electronic engineering. He is currently an engineer with Silicon Motion Technology Corporation, Taipei. His research interests include the design of very large-scale integration architectures and circuits for digital signal processing and communication systems.

12 1016 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: REGULAR PAPERS, VOL. 56, NO. 5, MAY 2009 An-Yeu (Andy) Wu (S 91 M 96) received the B.S. degree from National Taiwan University, Taipei, in 1987, and the M.S. and Ph.D. degrees from the University of Maryland, College Park, in 1992 and 1995, respectively, all in electrical engineering. From August 1995 to July 1996, he was a Member of Technical Staff (MTS) with AT&T Bell Laboratories, Murray Hill, NJ, working on high-speed transmission IC designs. From 1996 to July 2000, he was with the Electrical Engineering Department of National Central University, Taiwan. In August 2000, he joined the faculty of the Department of Electrical Engineering and the Graduate Institute of Electronics Engineering, National Taiwan University, where he is currently a Professor. His research interests include low-power/high-performance VLSI architectures for DSP and communication applications, adaptive/multirate signal processing, reconfigurable broadband access systems and architectures, and SoC platform for software/hardware co-design. Dr. Wu served as an Associate Editor for EURASIP Journal of Applied Signal Processing from 2001 to 2004 and acted as the leading Guest Editor for a special issue on Signal Processing for Broadband Access Systems: Techniques and Implementations of the same journal (published in December 2003). He also served as the Associate Editor of the IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS from 2003 to 2005 and the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: REGULAR PAPERS from 2007 to He is now an Associate Editor of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS and the IEEE TRANSACTIONS ON SIGNAL PROCESSING. He has served on the technical program committees of many major IEEE International Conferences, such as ICIP, SiPS, AP-ASIC, ISCAS, ISPACS, ICME, SOC, and A-SSCC. He was the recipient of the A-class Research Award from the National Science Council, Taiwan, four times from 1997 to He received the Macronix International Corporation (MXIC) Young Chair Professor Award in 2003, the Distinguished Young Engineer Award from The Chinese Institute of Electrical Engineering, Taiwan, in 2004, and, in 2005, the Dr. Wu Ta-you Award (young scholar award) and the President Fu Si-nien Award, from the National Science Council and National Taiwan University, respectively, for his research work on VLSI system designs. Tsung-Han Tsai (S 96 M 98) received the B.S., M.S., and Ph.D. degrees in electrical engineering from the National Taiwan University, Taipei, in 1990, 1994, and 1998, respectively. From 1999 to 2000, he was a Professor of Electronic Engineering with Fu Jen Catholic University, Taipei, Taiwan. In 2000, he joined the Department of Electrical Engineering, National Central University, Taipei, where he is currently a Professor. He has been awarded 14 patents and has published more than 120 referred papers in international journals and conferences. His research interests include VLSI signal processing, video/audio coding algorithms, DSP architecture design, wireless communication, and system-on-chip design. Dr. Tsai is a member of the Audio Engineering Society (AES) and the Institute of Electronics, Information and Communication Engineers (IEICE). He was the recipient of the Industrial Cooperation Award in 2003 from the Ministry of Education, Taiwan. He is a member of the Technical Committee of the IEEE Circuits and Systems Society and serves as a Technical Program Committee member or Session Chair of several international conferences.

/$ IEEE

/$ IEEE IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 56, NO. 1, JANUARY 2009 81 Bit-Level Extrinsic Information Exchange Method for Double-Binary Turbo Codes Ji-Hoon Kim, Student Member,

More information

High Speed Downlink Packet Access efficient turbo decoder architecture: 3GPP Advanced Turbo Decoder

High Speed Downlink Packet Access efficient turbo decoder architecture: 3GPP Advanced Turbo Decoder I J C T A, 9(24), 2016, pp. 291-298 International Science Press High Speed Downlink Packet Access efficient turbo decoder architecture: 3GPP Advanced Turbo Decoder Parvathy M.*, Ganesan R.*,** and Tefera

More information

High Throughput Radix-4 SISO Decoding Architecture with Reduced Memory Requirement

High Throughput Radix-4 SISO Decoding Architecture with Reduced Memory Requirement JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, VOL.14, NO.4, AUGUST, 2014 http://dx.doi.org/10.5573/jsts.2014.14.4.407 High Throughput Radix-4 SISO Decoding Architecture with Reduced Memory Requirement

More information

RECENTLY, researches on gigabit wireless personal area

RECENTLY, researches on gigabit wireless personal area 146 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 55, NO. 2, FEBRUARY 2008 An Indexed-Scaling Pipelined FFT Processor for OFDM-Based WPAN Applications Yuan Chen, Student Member, IEEE,

More information

ISSCC 2003 / SESSION 8 / COMMUNICATIONS SIGNAL PROCESSING / PAPER 8.7

ISSCC 2003 / SESSION 8 / COMMUNICATIONS SIGNAL PROCESSING / PAPER 8.7 ISSCC 2003 / SESSION 8 / COMMUNICATIONS SIGNAL PROCESSING / PAPER 8.7 8.7 A Programmable Turbo Decoder for Multiple 3G Wireless Standards Myoung-Cheol Shin, In-Cheol Park KAIST, Daejeon, Republic of Korea

More information

THE turbo code is one of the most attractive forward error

THE turbo code is one of the most attractive forward error IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 63, NO. 2, FEBRUARY 2016 211 Memory-Reduced Turbo Decoding Architecture Using NII Metric Compression Youngjoo Lee, Member, IEEE, Meng

More information

Low Complexity Architecture for Max* Operator of Log-MAP Turbo Decoder

Low Complexity Architecture for Max* Operator of Log-MAP Turbo Decoder International Journal of Current Engineering and Technology E-ISSN 2277 4106, P-ISSN 2347 5161 2015 INPRESSCO, All Rights Reserved Available at http://inpressco.com/category/ijcet Research Article Low

More information

DUE to the high computational complexity and real-time

DUE to the high computational complexity and real-time IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 15, NO. 3, MARCH 2005 445 A Memory-Efficient Realization of Cyclic Convolution and Its Application to Discrete Cosine Transform Hun-Chen

More information

THE orthogonal frequency-division multiplex (OFDM)

THE orthogonal frequency-division multiplex (OFDM) 26 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 57, NO. 1, JANUARY 2010 A Generalized Mixed-Radix Algorithm for Memory-Based FFT Processors Chen-Fong Hsiao, Yuan Chen, Member, IEEE,

More information

/$ IEEE

/$ IEEE IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 14, NO. 10, OCTOBER 2006 1147 Transactions Briefs Highly-Parallel Decoding Architectures for Convolutional Turbo Codes Zhiyong He,

More information

Memory-Reduced Turbo Decoding Architecture Using NII Metric Compression

Memory-Reduced Turbo Decoding Architecture Using NII Metric Compression Memory-Reduced Turbo Decoding Architecture Using NII Metric Compression Syed kareem saheb, Research scholar, Dept. of ECE, ANU, GUNTUR,A.P, INDIA. E-mail:sd_kareem@yahoo.com A. Srihitha PG student dept.

More information

FAST FOURIER TRANSFORM (FFT) and inverse fast

FAST FOURIER TRANSFORM (FFT) and inverse fast IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 39, NO. 11, NOVEMBER 2004 2005 A Dynamic Scaling FFT Processor for DVB-T Applications Yu-Wei Lin, Hsuan-Yu Liu, and Chen-Yi Lee Abstract This paper presents an

More information

VHDL Implementation of different Turbo Encoder using Log-MAP Decoder

VHDL Implementation of different Turbo Encoder using Log-MAP Decoder 49 VHDL Implementation of different Turbo Encoder using Log-MAP Decoder Akash Kumar Gupta and Sanjeet Kumar Abstract Turbo code is a great achievement in the field of communication system. It can be created

More information

TURBO codes, [1], [2], have attracted much interest due

TURBO codes, [1], [2], have attracted much interest due 800 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 2, FEBRUARY 2001 Zigzag Codes and Concatenated Zigzag Codes Li Ping, Member, IEEE, Xiaoling Huang, and Nam Phamdo, Senior Member, IEEE Abstract

More information

Programmable Turbo Decoder Supporting Multiple Third-Generation Wireless Standards

Programmable Turbo Decoder Supporting Multiple Third-Generation Wireless Standards Programmable Turbo Decoder Supporting Multiple Third-eneration Wireless Standards Myoung-Cheol Shin and In-Cheol Park Department of Electrical Engineering and Computer Science, KAIST Yuseong-gu Daejeon,

More information

EFFICIENT RECURSIVE IMPLEMENTATION OF A QUADRATIC PERMUTATION POLYNOMIAL INTERLEAVER FOR LONG TERM EVOLUTION SYSTEMS

EFFICIENT RECURSIVE IMPLEMENTATION OF A QUADRATIC PERMUTATION POLYNOMIAL INTERLEAVER FOR LONG TERM EVOLUTION SYSTEMS Rev. Roum. Sci. Techn. Électrotechn. et Énerg. Vol. 61, 1, pp. 53 57, Bucarest, 016 Électronique et transmission de l information EFFICIENT RECURSIVE IMPLEMENTATION OF A QUADRATIC PERMUTATION POLYNOMIAL

More information

422 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 45, NO. 2, FEBRUARY 2010

422 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 45, NO. 2, FEBRUARY 2010 422 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 45, NO. 2, FEBRUARY 2010 Turbo Decoder Using Contention-Free Interleaver and Parallel Architecture Cheng-Chi Wong, Ming-Wei Lai, Chien-Ching Lin, Hsie-Chia

More information

LOW-DENSITY parity-check (LDPC) codes, which are defined

LOW-DENSITY parity-check (LDPC) codes, which are defined 734 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 56, NO. 9, SEPTEMBER 2009 Design of a Multimode QC-LDPC Decoder Based on Shift-Routing Network Chih-Hao Liu, Chien-Ching Lin, Shau-Wei

More information

AS TURBO codes [1], or parallel concatenated convolutional

AS TURBO codes [1], or parallel concatenated convolutional IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 15, NO. 7, JULY 2007 801 SIMD Processor-Based Turbo Decoder Supporting Multiple Third-Generation Wireless Standards Myoung-Cheol Shin,

More information

A VLSI Architecture for H.264/AVC Variable Block Size Motion Estimation

A VLSI Architecture for H.264/AVC Variable Block Size Motion Estimation Journal of Automation and Control Engineering Vol. 3, No. 1, February 20 A VLSI Architecture for H.264/AVC Variable Block Size Motion Estimation Dam. Minh Tung and Tran. Le Thang Dong Center of Electrical

More information

Exploring Parallel Processing Levels for Convolutional Turbo Decoding

Exploring Parallel Processing Levels for Convolutional Turbo Decoding Exploring Parallel Processing Levels for Convolutional Turbo Decoding Olivier Muller Electronics Department, GET/EST Bretagne Technopôle Brest Iroise, 29238 Brest, France olivier.muller@enst-bretagne.fr

More information

OPTIMIZED MAP TURBO DECODER

OPTIMIZED MAP TURBO DECODER OPTIMIZED MAP TURBO DECODER Curt Schurgers Francy Catthoor Marc Engels EE Department IMEC/KUL IMEC UCLA Kapeldreef 75 Kapeldreef 75 Los Angeles, CA 90024 3001 Heverlee 3001 Heverlee USA Belgium Belgium

More information

EFFICIENT PARALLEL MEMORY ORGANIZATION FOR TURBO DECODERS

EFFICIENT PARALLEL MEMORY ORGANIZATION FOR TURBO DECODERS In Proceedings of the European Signal Processing Conference, pages 831-83, Poznan, Poland, September 27. EFFICIENT PARALLEL MEMORY ORGANIZATION FOR TURBO DECODERS Perttu Salmela, Ruirui Gu*, Shuvra S.

More information

A New List Decoding Algorithm for Short-Length TBCCs With CRC

A New List Decoding Algorithm for Short-Length TBCCs With CRC Received May 15, 2018, accepted June 11, 2018, date of publication June 14, 2018, date of current version July 12, 2018. Digital Object Identifier 10.1109/ACCESS.2018.2847348 A New List Decoding Algorithm

More information

SINCE its introduction in 1993 [1], turbo codes have found

SINCE its introduction in 1993 [1], turbo codes have found 1718 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 40, NO. 8, AUGUST 2005 A 285-MHz Pipelined MAP Decoder in 0.18-m CMOS Seok-Jun Lee, Member, IEEE, Naresh R. Shanbhag, Senior Member, IEEE, and Andrew C.

More information

The Serial Commutator FFT

The Serial Commutator FFT The Serial Commutator FFT Mario Garrido Gálvez, Shen-Jui Huang, Sau-Gee Chen and Oscar Gustafsson Journal Article N.B.: When citing this work, cite the original article. 2016 IEEE. Personal use of this

More information

Non-recursive complexity reduction encoding scheme for performance enhancement of polar codes

Non-recursive complexity reduction encoding scheme for performance enhancement of polar codes Non-recursive complexity reduction encoding scheme for performance enhancement of polar codes 1 Prakash K M, 2 Dr. G S Sunitha 1 Assistant Professor, Dept. of E&C, Bapuji Institute of Engineering and Technology,

More information

Reduced complexity Log-MAP algorithm with Jensen inequality based non-recursive max operator for turbo TCM decoding

Reduced complexity Log-MAP algorithm with Jensen inequality based non-recursive max operator for turbo TCM decoding Sybis and Tyczka EURASIP Journal on Wireless Communications and Networking 2013, 2013:238 RESEARCH Open Access Reduced complexity Log-MAP algorithm with Jensen inequality based non-recursive max operator

More information

VLSI Architectures for SISO-APP Decoders

VLSI Architectures for SISO-APP Decoders IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 11, NO. 4, AUGUST 2003 627 VLSI Architectures for SISO-APP Decoders Mohammad M. Mansour, Student Member, IEEE, and Naresh R. Shanbhag,

More information

Area-Efficient High-Throughput MAP Decoder Architectures

Area-Efficient High-Throughput MAP Decoder Architectures IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 13, NO. 8, AUGUST 2005 921 Area-Efficient High-Throughput MAP Decoder Architectures Seok-Jun Lee, Member, IEEE, Naresh R. Shanbhag,

More information

BER Guaranteed Optimization and Implementation of Parallel Turbo Decoding on GPU

BER Guaranteed Optimization and Implementation of Parallel Turbo Decoding on GPU 2013 8th International Conference on Communications and Networking in China (CHINACOM) BER Guaranteed Optimization and Implementation of Parallel Turbo Decoding on GPU Xiang Chen 1,2, Ji Zhu, Ziyu Wen,

More information

LLR-based Successive-Cancellation List Decoder for Polar Codes with Multi-bit Decision

LLR-based Successive-Cancellation List Decoder for Polar Codes with Multi-bit Decision > REPLACE THIS LINE WITH YOUR PAPER IDENTIFICATION NUMBER (DOUBLE-CLIC HERE TO EDIT < LLR-based Successive-Cancellation List Decoder for Polar Codes with Multi-bit Decision Bo Yuan and eshab. Parhi, Fellow,

More information

An Implementation of a Soft-Input Stack Decoder For Tailbiting Convolutional Codes

An Implementation of a Soft-Input Stack Decoder For Tailbiting Convolutional Codes An Implementation of a Soft-Input Stack Decoder For Tailbiting Convolutional Codes Mukundan Madhavan School Of ECE SRM Institute Of Science And Technology Kancheepuram 603203 mukundanm@gmail.com Andrew

More information

The Lekha 3GPP LTE Turbo Decoder IP Core meets 3GPP LTE specification 3GPP TS V Release 10[1].

The Lekha 3GPP LTE Turbo Decoder IP Core meets 3GPP LTE specification 3GPP TS V Release 10[1]. Lekha IP Core: LW RI 1002 3GPP LTE Turbo Decoder IP Core V1.0 The Lekha 3GPP LTE Turbo Decoder IP Core meets 3GPP LTE specification 3GPP TS 36.212 V 10.5.0 Release 10[1]. Introduction The Lekha IP 3GPP

More information

THERE has been great interest in recent years in coding

THERE has been great interest in recent years in coding 186 IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 16, NO. 2, FEBRUARY 1998 Concatenated Decoding with a Reduced-Search BCJR Algorithm Volker Franz and John B. Anderson, Fellow, IEEE Abstract We

More information

Comparison of Decoding Algorithms for Concatenated Turbo Codes

Comparison of Decoding Algorithms for Concatenated Turbo Codes Comparison of Decoding Algorithms for Concatenated Turbo Codes Drago Žagar, Nenad Falamić and Snježana Rimac-Drlje University of Osijek Faculty of Electrical Engineering Kneza Trpimira 2b, HR-31000 Osijek,

More information

A Normal I/O Order Radix-2 FFT Architecture to Process Twin Data Streams for MIMO

A Normal I/O Order Radix-2 FFT Architecture to Process Twin Data Streams for MIMO 2402 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 24, NO. 6, JUNE 2016 A Normal I/O Order Radix-2 FFT Architecture to Process Twin Data Streams for MIMO Antony Xavier Glittas,

More information

ERROR correcting codes are used to increase the bandwidth

ERROR correcting codes are used to increase the bandwidth 404 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 37, NO. 3, MARCH 2002 A 690-mW 1-Gb/s 1024-b, Rate-1/2 Low-Density Parity-Check Code Decoder Andrew J. Blanksby and Chris J. Howland Abstract A 1024-b, rate-1/2,

More information

Efficient VLSI Huffman encoder implementation and its application in high rate serial data encoding

Efficient VLSI Huffman encoder implementation and its application in high rate serial data encoding LETTER IEICE Electronics Express, Vol.14, No.21, 1 11 Efficient VLSI Huffman encoder implementation and its application in high rate serial data encoding Rongshan Wei a) and Xingang Zhang College of Physics

More information

A MULTIBANK MEMORY-BASED VLSI ARCHITECTURE OF DIGITAL VIDEO BROADCASTING SYMBOL DEINTERLEAVER

A MULTIBANK MEMORY-BASED VLSI ARCHITECTURE OF DIGITAL VIDEO BROADCASTING SYMBOL DEINTERLEAVER A MULTIBANK MEMORY-BASED VLSI ARCHITECTURE OF DIGITAL VIDEO BROADCASTING SYMBOL DEINTERLEAVER D.HARI BABU 1, B.NEELIMA DEVI 2 1,2 Noble college of engineering and technology, Nadergul, Hyderabad, Abstract-

More information

An Area-Efficient BIRA With 1-D Spare Segments

An Area-Efficient BIRA With 1-D Spare Segments 206 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 26, NO. 1, JANUARY 2018 An Area-Efficient BIRA With 1-D Spare Segments Donghyun Kim, Hayoung Lee, and Sungho Kang Abstract The

More information

Design and Implementation of Signed, Rounded and Truncated Multipliers using Modified Booth Algorithm for Dsp Systems.

Design and Implementation of Signed, Rounded and Truncated Multipliers using Modified Booth Algorithm for Dsp Systems. Design and Implementation of Signed, Rounded and Truncated Multipliers using Modified Booth Algorithm for Dsp Systems. K. Ram Prakash 1, A.V.Sanju 2 1 Professor, 2 PG scholar, Department of Electronics

More information

ISSN Vol.05,Issue.09, September-2017, Pages:

ISSN Vol.05,Issue.09, September-2017, Pages: WWW.IJITECH.ORG ISSN 2321-8665 Vol.05,Issue.09, September-2017, Pages:1693-1697 AJJAM PUSHPA 1, C. H. RAMA MOHAN 2 1 PG Scholar, Dept of ECE(DECS), Shirdi Sai Institute of Science and Technology, Anantapuramu,

More information

Design of Viterbi Decoder for Noisy Channel on FPGA Ms. M.B.Mulik, Prof. U.L.Bombale, Prof. P.C.Bhaskar

Design of Viterbi Decoder for Noisy Channel on FPGA Ms. M.B.Mulik, Prof. U.L.Bombale, Prof. P.C.Bhaskar International Journal of Scientific & Engineering Research Volume 2, Issue 6, June-2011 1 Design of Viterbi Decoder for Noisy Channel on FPGA Ms. M.B.Mulik, Prof. U.L.Bombale, Prof. P.C.Bhaskar Abstract

More information

Implementation Aspects of Turbo-Decoders for Future Radio Applications

Implementation Aspects of Turbo-Decoders for Future Radio Applications Implementation Aspects of Turbo-Decoders for Future Radio Applications Friedbert Berens STMicroelectronics Advanced System Technology Geneva Applications Laboratory CH-1215 Geneva 15, Switzerland e-mail:

More information

Low-Power Adaptive Viterbi Decoder for TCM Using T-Algorithm

Low-Power Adaptive Viterbi Decoder for TCM Using T-Algorithm International Journal of Scientific and Research Publications, Volume 3, Issue 8, August 2013 1 Low-Power Adaptive Viterbi Decoder for TCM Using T-Algorithm MUCHHUMARRI SANTHI LATHA*, Smt. D.LALITHA KUMARI**

More information

Real-time and smooth scalable video streaming system with bitstream extractor intellectual property implementation

Real-time and smooth scalable video streaming system with bitstream extractor intellectual property implementation LETTER IEICE Electronics Express, Vol.11, No.5, 1 6 Real-time and smooth scalable video streaming system with bitstream extractor intellectual property implementation Liang-Hung Wang 1a), Yi-Mao Hsiao

More information

High speed low complexity radix-16 Max-Log-MAP SISO decoder

High speed low complexity radix-16 Max-Log-MAP SISO decoder High speed low complexity radix-16 Max-Log-MAP SISO decoder Oscar David Sanchez Gonzalez, Christophe Jego, Michel Jezequel, Yannick Saouter To cite this version: Oscar David Sanchez Gonzalez, Christophe

More information

High Speed ACSU Architecture for Viterbi Decoder Using T-Algorithm

High Speed ACSU Architecture for Viterbi Decoder Using T-Algorithm High Speed ACSU Architecture for Viterbi Decoder Using T-Algorithm Atish A. Peshattiwar & Tejaswini G. Panse Department of Electronics Engineering, Yeshwantrao Chavan College of Engineering, E-mail : atishp32@gmail.com,

More information

Interlaced Column-Row Message-Passing Schedule for Decoding LDPC Codes

Interlaced Column-Row Message-Passing Schedule for Decoding LDPC Codes Interlaced Column-Row Message-Passing Schedule for Decoding LDPC Codes Saleh Usman, Mohammad M. Mansour, Ali Chehab Department of Electrical and Computer Engineering American University of Beirut Beirut

More information

A New MIMO Detector Architecture Based on A Forward-Backward Trellis Algorithm

A New MIMO Detector Architecture Based on A Forward-Backward Trellis Algorithm A New MIMO etector Architecture Based on A Forward-Backward Trellis Algorithm Yang Sun and Joseph R Cavallaro epartment of Electrical and Computer Engineering Rice University, Houston, TX 775 Email: {ysun,

More information

Power and Area Efficient Implementation for Parallel FIR Filters Using FFAs and DA

Power and Area Efficient Implementation for Parallel FIR Filters Using FFAs and DA Power and Area Efficient Implementation for Parallel FIR Filters Using FFAs and DA Krishnapriya P.N 1, Arathy Iyer 2 M.Tech Student [VLSI & Embedded Systems], Sree Narayana Gurukulam College of Engineering,

More information

ISSCC 2006 / SESSION 22 / LOW POWER MULTIMEDIA / 22.1

ISSCC 2006 / SESSION 22 / LOW POWER MULTIMEDIA / 22.1 ISSCC 26 / SESSION 22 / LOW POWER MULTIMEDIA / 22.1 22.1 A 125µW, Fully Scalable MPEG-2 and H.264/AVC Video Decoder for Mobile Applications Tsu-Ming Liu 1, Ting-An Lin 2, Sheng-Zen Wang 2, Wen-Ping Lee

More information

Implementation of Convolution Encoder and Viterbi Decoder Using Verilog

Implementation of Convolution Encoder and Viterbi Decoder Using Verilog International Journal of Electronics and Communication Engineering. ISSN 0974-2166 Volume 11, Number 1 (2018), pp. 13-21 International Research Publication House http://www.irphouse.com Implementation

More information

Chip Design for Turbo Encoder Module for In-Vehicle System

Chip Design for Turbo Encoder Module for In-Vehicle System Chip Design for Turbo Encoder Module for In-Vehicle System Majeed Nader Email: majeed@wayneedu Yunrui Li Email: yunruili@wayneedu John Liu Email: johnliu@wayneedu Abstract This paper studies design and

More information

A Low Power Asynchronous FPGA with Autonomous Fine Grain Power Gating and LEDR Encoding

A Low Power Asynchronous FPGA with Autonomous Fine Grain Power Gating and LEDR Encoding A Low Power Asynchronous FPGA with Autonomous Fine Grain Power Gating and LEDR Encoding N.Rajagopala krishnan, k.sivasuparamanyan, G.Ramadoss Abstract Field Programmable Gate Arrays (FPGAs) are widely

More information

A CORDIC Algorithm with Improved Rotation Strategy for Embedded Applications

A CORDIC Algorithm with Improved Rotation Strategy for Embedded Applications A CORDIC Algorithm with Improved Rotation Strategy for Embedded Applications Kui-Ting Chen Research Center of Information, Production and Systems, Waseda University, Fukuoka, Japan Email: nore@aoni.waseda.jp

More information

Available online at ScienceDirect. Procedia Technology 25 (2016 )

Available online at  ScienceDirect. Procedia Technology 25 (2016 ) Available online at www.sciencedirect.com ScienceDirect Procedia Technology 25 (2016 ) 544 551 Global Colloquium in Recent Advancement and Effectual Researches in Engineering, Science and Technology (RAEREST

More information

A Novel Area Efficient Folded Modified Convolutional Interleaving Architecture for MAP Decoder

A Novel Area Efficient Folded Modified Convolutional Interleaving Architecture for MAP Decoder A Novel Area Efficient Folded Modified Convolutional Interleaving Architecture for Decoder S.Shiyamala Department of ECE SSCET Palani, India. Dr.V.Rajamani Principal IGCET Trichy,India ABSTRACT This paper

More information

Non-Binary Turbo Codes Interleavers

Non-Binary Turbo Codes Interleavers Non-Binary Turbo Codes Interleavers Maria KOVACI, Horia BALTA University Polytechnic of Timişoara, Faculty of Electronics and Telecommunications, Postal Address, 30223 Timişoara, ROMANIA, E-Mail: mariakovaci@etcuttro,

More information

AN EFFICIENT DESIGN OF VLSI ARCHITECTURE FOR FAULT DETECTION USING ORTHOGONAL LATIN SQUARES (OLS) CODES

AN EFFICIENT DESIGN OF VLSI ARCHITECTURE FOR FAULT DETECTION USING ORTHOGONAL LATIN SQUARES (OLS) CODES AN EFFICIENT DESIGN OF VLSI ARCHITECTURE FOR FAULT DETECTION USING ORTHOGONAL LATIN SQUARES (OLS) CODES S. SRINIVAS KUMAR *, R.BASAVARAJU ** * PG Scholar, Electronics and Communication Engineering, CRIT

More information

Filter-Based Dual-Voltage Architecture for Low-Power Long-Word TCAM Design

Filter-Based Dual-Voltage Architecture for Low-Power Long-Word TCAM Design Filter-Based Dual-Voltage Architecture for Low-Power Long-Word TCAM Design Ting-Sheng Chen, Ding-Yuan Lee, Tsung-Te Liu, and An-Yeu (Andy) Wu Graduate Institute of Electronics Engineering, National Taiwan

More information

IEEE Proof Web Version

IEEE Proof Web Version IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS 1 From Parallelism Levels to a Multi-ASIP Architecture for Turbo Decoding Olivier Muller, Member, IEEE, Amer Baghdadi, and Michel Jézéquel,

More information

IN RECENT years, multimedia application has become more

IN RECENT years, multimedia application has become more 578 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 17, NO. 5, MAY 2007 A Fast Algorithm and Its VLSI Architecture for Fractional Motion Estimation for H.264/MPEG-4 AVC Video Coding

More information

Research Article Cooperative Signaling with Soft Information Combining

Research Article Cooperative Signaling with Soft Information Combining Electrical and Computer Engineering Volume 2010, Article ID 530190, 5 pages doi:10.1155/2010/530190 Research Article Cooperative Signaling with Soft Information Combining Rui Lin, Philippa A. Martin, and

More information

Design of Convolution Encoder and Reconfigurable Viterbi Decoder

Design of Convolution Encoder and Reconfigurable Viterbi Decoder RESEARCH INVENTY: International Journal of Engineering and Science ISSN: 2278-4721, Vol. 1, Issue 3 (Sept 2012), PP 15-21 www.researchinventy.com Design of Convolution Encoder and Reconfigurable Viterbi

More information

Carry-Free Radix-2 Subtractive Division Algorithm and Implementation of the Divider

Carry-Free Radix-2 Subtractive Division Algorithm and Implementation of the Divider Tamkang Journal of Science and Engineering, Vol. 3, No., pp. 29-255 (2000) 29 Carry-Free Radix-2 Subtractive Division Algorithm and Implementation of the Divider Jen-Shiun Chiang, Hung-Da Chung and Min-Show

More information

VLSI Implementation of Low Power Area Efficient FIR Digital Filter Structures Shaila Khan 1 Uma Sharma 2

VLSI Implementation of Low Power Area Efficient FIR Digital Filter Structures Shaila Khan 1 Uma Sharma 2 IJSRD - International Journal for Scientific Research & Development Vol. 3, Issue 05, 2015 ISSN (online): 2321-0613 VLSI Implementation of Low Power Area Efficient FIR Digital Filter Structures Shaila

More information

Implementation of a Turbo Encoder and Turbo Decoder on DSP Processor-TMS320C6713

Implementation of a Turbo Encoder and Turbo Decoder on DSP Processor-TMS320C6713 International Journal of Engineering Research and Development e-issn : 2278-067X, p-issn : 2278-800X,.ijerd.com Volume 2, Issue 5 (July 2012), PP. 37-41 Implementation of a Turbo Encoder and Turbo Decoder

More information

Memory-Efficient and High-Speed Line-Based Architecture for 2-D Discrete Wavelet Transform with Lifting Scheme

Memory-Efficient and High-Speed Line-Based Architecture for 2-D Discrete Wavelet Transform with Lifting Scheme Proceedings of the 7th WSEAS International Conference on Multimedia Systems & Signal Processing, Hangzhou, China, April 5-7, 007 3 Memory-Efficient and High-Speed Line-Based Architecture for -D Discrete

More information

Parallel-computing approach for FFT implementation on digital signal processor (DSP)

Parallel-computing approach for FFT implementation on digital signal processor (DSP) Parallel-computing approach for FFT implementation on digital signal processor (DSP) Yi-Pin Hsu and Shin-Yu Lin Abstract An efficient parallel form in digital signal processor can improve the algorithm

More information

POWER REDUCTION IN CONTENT ADDRESSABLE MEMORY

POWER REDUCTION IN CONTENT ADDRESSABLE MEMORY POWER REDUCTION IN CONTENT ADDRESSABLE MEMORY Latha A 1, Saranya G 2, Marutharaj T 3 1, 2 PG Scholar, Department of VLSI Design, 3 Assistant Professor Theni Kammavar Sangam College Of Technology, Theni,

More information

Reliability of Memory Storage System Using Decimal Matrix Code and Meta-Cure

Reliability of Memory Storage System Using Decimal Matrix Code and Meta-Cure Reliability of Memory Storage System Using Decimal Matrix Code and Meta-Cure Iswarya Gopal, Rajasekar.T, PG Scholar, Sri Shakthi Institute of Engineering and Technology, Coimbatore, Tamil Nadu, India Assistant

More information

PERFORMANCE ANALYSIS OF HIGH EFFICIENCY LOW DENSITY PARITY-CHECK CODE DECODER FOR LOW POWER APPLICATIONS

PERFORMANCE ANALYSIS OF HIGH EFFICIENCY LOW DENSITY PARITY-CHECK CODE DECODER FOR LOW POWER APPLICATIONS American Journal of Applied Sciences 11 (4): 558-563, 2014 ISSN: 1546-9239 2014 Science Publication doi:10.3844/ajassp.2014.558.563 Published Online 11 (4) 2014 (http://www.thescipub.com/ajas.toc) PERFORMANCE

More information

DESIGN OF PARAMETER EXTRACTOR IN LOW POWER PRECOMPUTATION BASED CONTENT ADDRESSABLE MEMORY

DESIGN OF PARAMETER EXTRACTOR IN LOW POWER PRECOMPUTATION BASED CONTENT ADDRESSABLE MEMORY DESIGN OF PARAMETER EXTRACTOR IN LOW POWER PRECOMPUTATION BASED CONTENT ADDRESSABLE MEMORY Saroja pasumarti, Asst.professor, Department Of Electronics and Communication Engineering, Chaitanya Engineering

More information

TURBO CODES with performance near the Shannon

TURBO CODES with performance near the Shannon IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 13, NO. 4, APRIL 2005 427 Parallel Interleaver Design and VLSI Architecture for Low-Latency MAP Turbo Decoders Rostislav (Reuven)

More information

A Review on Analysis on Codes using Different Algorithms

A Review on Analysis on Codes using Different Algorithms A Review on Analysis on Codes using Different Algorithms Devansh Vats Gaurav Kochar Rakesh Joon (ECE/GITAM/MDU) (ECE/GITAM/MDU) (HOD-ECE/GITAM/MDU) Abstract-Turbo codes are a new class of forward error

More information

HIGH-PERFORMANCE RECONFIGURABLE FIR FILTER USING PIPELINE TECHNIQUE

HIGH-PERFORMANCE RECONFIGURABLE FIR FILTER USING PIPELINE TECHNIQUE HIGH-PERFORMANCE RECONFIGURABLE FIR FILTER USING PIPELINE TECHNIQUE Anni Benitta.M #1 and Felcy Jeba Malar.M *2 1# Centre for excellence in VLSI Design, ECE, KCG College of Technology, Chennai, Tamilnadu

More information

BER Evaluation of LDPC Decoder with BPSK Scheme in AWGN Fading Channel

BER Evaluation of LDPC Decoder with BPSK Scheme in AWGN Fading Channel I J C T A, 9(40), 2016, pp. 397-404 International Science Press ISSN: 0974-5572 BER Evaluation of LDPC Decoder with BPSK Scheme in AWGN Fading Channel Neha Mahankal*, Sandeep Kakde* and Atish Khobragade**

More information

Implementation of SCN Based Content Addressable Memory

Implementation of SCN Based Content Addressable Memory IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) e-issn: 2278-2834,p- ISSN: 2278-8735.Volume 12, Issue 4, Ver. II (Jul.-Aug. 2017), PP 48-52 www.iosrjournals.org Implementation of

More information

High Performance and Area Efficient DSP Architecture using Dadda Multiplier

High Performance and Area Efficient DSP Architecture using Dadda Multiplier 2017 IJSRST Volume 3 Issue 6 Print ISSN: 2395-6011 Online ISSN: 2395-602X Themed Section: Science and Technology High Performance and Area Efficient DSP Architecture using Dadda Multiplier V.Kiran Kumar

More information

A Universal Test Pattern Generator for DDR SDRAM *

A Universal Test Pattern Generator for DDR SDRAM * A Universal Test Pattern Generator for DDR SDRAM * Wei-Lun Wang ( ) Department of Electronic Engineering Cheng Shiu Institute of Technology Kaohsiung, Taiwan, R.O.C. wlwang@cc.csit.edu.tw used to detect

More information

Viterbi Algorithm for error detection and correction

Viterbi Algorithm for error detection and correction IOSR Journal of Electronicsl and Communication Engineering (IOSR-JECE) ISSN: 2278-2834-, ISBN: 2278-8735, PP: 60-65 www.iosrjournals.org Viterbi Algorithm for error detection and correction Varsha P. Patil

More information

{ rizwan.rasheed, aawatif.menouni eurecom.fr,

{ rizwan.rasheed, aawatif.menouni eurecom.fr, Reconfigurable Viterbi Decoder for Mobile Platform Rizwan RASHEED, Mobile Communications Department, Institut Eurecom, Sophia Antipolis, France Aawatif MENOUNI HAYAR, Mobile Communications Department,

More information

On combining chase-2 and sum-product algorithms for LDPC codes

On combining chase-2 and sum-product algorithms for LDPC codes University of Wollongong Research Online Faculty of Engineering and Information Sciences - Papers: Part A Faculty of Engineering and Information Sciences 2012 On combining chase-2 and sum-product algorithms

More information

Resource Efficient Multi Ported Sram Based Ternary Content Addressable Memory

Resource Efficient Multi Ported Sram Based Ternary Content Addressable Memory IOSR Journal of Engineering (IOSRJEN) ISSN (e): 2250-3021, ISSN (p): 2278-8719 PP 11-18 www.iosrjen.org Resource Efficient Multi Ported Sram Based Ternary Content Addressable Memory S.Parkavi (1) And S.Bharath

More information

A Ripple Carry Adder based Low Power Architecture of LMS Adaptive Filter

A Ripple Carry Adder based Low Power Architecture of LMS Adaptive Filter A Ripple Carry Adder based Low Power Architecture of LMS Adaptive Filter A.S. Sneka Priyaa PG Scholar Government College of Technology Coimbatore ABSTRACT The Least Mean Square Adaptive Filter is frequently

More information

[Kalyani*, 4.(9): September, 2015] ISSN: (I2OR), Publication Impact Factor: 3.785

[Kalyani*, 4.(9): September, 2015] ISSN: (I2OR), Publication Impact Factor: 3.785 IJESRT INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY SYSTEMATIC ERROR-CORRECTING CODES IMPLEMENTATION FOR MATCHING OF DATA ENCODED M.Naga Kalyani*, K.Priyanka * PG Student [VLSID]

More information

FPGA Implementation of Multiplierless 2D DWT Architecture for Image Compression

FPGA Implementation of Multiplierless 2D DWT Architecture for Image Compression FPGA Implementation of Multiplierless 2D DWT Architecture for Image Compression Divakara.S.S, Research Scholar, J.S.S. Research Foundation, Mysore Cyril Prasanna Raj P Dean(R&D), MSEC, Bangalore Thejas

More information

Design of Convolutional Codes for varying Constraint Lengths

Design of Convolutional Codes for varying Constraint Lengths Design of Convolutional Codes for varying Constraint Lengths S VikramaNarasimhaReddy 1, Charan Kumar K 2, Neelima Koppala 3 1,2 MTech(VLSI) Student, 3 Assistant Professor, ECE Department, SreeVidyanikethan

More information

Designing and Characterization of koggestone, Sparse Kogge stone, Spanning tree and Brentkung Adders

Designing and Characterization of koggestone, Sparse Kogge stone, Spanning tree and Brentkung Adders Vol. 3, Issue. 4, July-august. 2013 pp-2266-2270 ISSN: 2249-6645 Designing and Characterization of koggestone, Sparse Kogge stone, Spanning tree and Brentkung Adders V.Krishna Kumari (1), Y.Sri Chakrapani

More information

Efficient Design of Radix Booth Multiplier

Efficient Design of Radix Booth Multiplier Efficient Design of Radix Booth Multiplier 1Head and Associate professor E&TC Department, Pravara Rural Engineering College Loni 2ME E&TC Engg, Pravara Rural Engineering College Loni --------------------------------------------------------------------------***----------------------------------------------------------------------------

More information

DETECTION AND CORRECTION OF CELL UPSETS USING MODIFIED DECIMAL MATRIX

DETECTION AND CORRECTION OF CELL UPSETS USING MODIFIED DECIMAL MATRIX DETECTION AND CORRECTION OF CELL UPSETS USING MODIFIED DECIMAL MATRIX ENDREDDY PRAVEENA 1 M.T.ech Scholar ( VLSID), Universal College Of Engineering & Technology, Guntur, A.P M. VENKATA SREERAJ 2 Associate

More information

Design and Implementation of 3-D DWT for Video Processing Applications

Design and Implementation of 3-D DWT for Video Processing Applications Design and Implementation of 3-D DWT for Video Processing Applications P. Mohaniah 1, P. Sathyanarayana 2, A. S. Ram Kumar Reddy 3 & A. Vijayalakshmi 4 1 E.C.E, N.B.K.R.IST, Vidyanagar, 2 E.C.E, S.V University

More information

Implementation of Efficient Modified Booth Recoder for Fused Sum-Product Operator

Implementation of Efficient Modified Booth Recoder for Fused Sum-Product Operator Implementation of Efficient Modified Booth Recoder for Fused Sum-Product Operator A.Sindhu 1, K.PriyaMeenakshi 2 PG Student [VLSI], Dept. of ECE, Muthayammal Engineering College, Rasipuram, Tamil Nadu,

More information

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK LOW POWER VITERBI DECODER FOR TCM USING T-ALGORITHM CHETAN K. SHENDURKAR 1, MRS.

More information

Fault Tolerant Parallel Filters Based on ECC Codes

Fault Tolerant Parallel Filters Based on ECC Codes Advances in Computational Sciences and Technology ISSN 0973-6107 Volume 11, Number 7 (2018) pp. 597-605 Research India Publications http://www.ripublication.com Fault Tolerant Parallel Filters Based on

More information

DISCRETE COSINE TRANSFORM (DCT) is a widely

DISCRETE COSINE TRANSFORM (DCT) is a widely IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL 20, NO 4, APRIL 2012 655 A High Performance Video Transform Engine by Using Space-Time Scheduling Strategy Yuan-Ho Chen, Student Member,

More information

A 256-Radix Crossbar Switch Using Mux-Matrix-Mux Folded-Clos Topology

A 256-Radix Crossbar Switch Using Mux-Matrix-Mux Folded-Clos Topology http://dx.doi.org/10.5573/jsts.014.14.6.760 JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, VOL.14, NO.6, DECEMBER, 014 A 56-Radix Crossbar Switch Using Mux-Matrix-Mux Folded-Clos Topology Sung-Joon Lee

More information

MULTI-RATE HIGH-THROUGHPUT LDPC DECODER: TRADEOFF ANALYSIS BETWEEN DECODING THROUGHPUT AND AREA

MULTI-RATE HIGH-THROUGHPUT LDPC DECODER: TRADEOFF ANALYSIS BETWEEN DECODING THROUGHPUT AND AREA MULTI-RATE HIGH-THROUGHPUT LDPC DECODER: TRADEOFF ANALYSIS BETWEEN DECODING THROUGHPUT AND AREA Predrag Radosavljevic, Alexandre de Baynast, Marjan Karkooti, and Joseph R. Cavallaro Department of Electrical

More information