A New Approach to Pipeline FFT Processor


Shousheng He and Mats Torkelson
Department of Applied Electronics, Lund University
S-22100 Lund, SWEDEN
email: he@tde.lth.se; torkel@tde.lth.se

Abstract

A new VLSI architecture for real-time pipeline FFT processors is proposed. A hardware-oriented radix-2² algorithm is derived by integrating a twiddle factor decomposition technique in the divide and conquer approach. The radix-2² algorithm has the same multiplicative complexity as the radix-4 algorithm, but retains the butterfly structure of the radix-2 algorithm. The single-path delay-feedback architecture is used to exploit the spatial regularity in the signal flow graph of the algorithm. For length-$N$ DFT computation, the hardware requirement of the proposed architecture is minimal on both dominant components: $\log_4 N - 1$ complex multipliers and $N - 1$ complex data memory. The validity and efficiency of the architecture have been verified by simulation in the hardware description language VHDL.

I. INTRODUCTION

The pipeline FFT processor is a specified class of processors for DFT computation utilizing fast algorithms. It is characterized by real-time, non-stopping processing as the data sequence passes through the processor. It is an $AT^2$ non-optimal approach with $AT^2 = O(N^3)$, since the area lower bound is $O(N)$. However, it has been speculated [1] that a new metric should be introduced for real-time processing, since such processing is necessarily non-optimal given the time complexity of $O(N)$. Although asymptotically almost all the feasible architectures have reached the area lower bound [2], the class of pipeline FFT processors has probably the smallest "constant factor" among the approaches that meet the time requirement, due to its least number, $O(\log N)$, of Arithmetic Elements (AE). The difference comes from the fact that an AE, especially the multiplier, takes much larger area than a register in digital VLSI implementation. It is also interesting to note that at least $\Omega(\log N)$ AEs are necessary to meet the real-time processing requirement, due to the computational complexity of $\Omega(N \log N)$ for the FFT algorithm.
Thus it has the nature of a "lower bound" for the AE requirement. Any optimal architecture for real-time processing will likely have $\Omega(\log N)$ AEs.

Another major area/energy consumption of the FFT processor comes from the memory required to buffer the input data and the intermediate results of the computation. For large transforms, this turns out to be dominating [3, 4]. Although there is no formal proof, the area lower bound indicates that the lower bound for the number of registers is likely to be $\Omega(N)$. This is obviously true for any architecture implementing an FFT-based algorithm, since the butterfly at the first stage has to take data elements separated by a distance of $N/r$ in the input sequence, where $r$ is a small constant integer, the "radix".

Putting the above arguments together, a pipeline FFT processor necessarily has $\Omega(\log_r N)$ AEs and $\Omega(N)$ complex word registers. The optimal architecture has to be the one that reduces the "constant factor", that is, the absolute number of AEs (multipliers and adders) and the memory size, to the minimum.

In this paper a new approach for real-time pipeline FFT processors, the Radix-2² Single-path Delay Feedback, or R2²SDF, architecture will be presented. We begin with a brief review of previous approaches. A hardware-oriented radix-2² algorithm is then developed by integrating a twiddle factor decomposition technique in the divide and conquer approach to form a spatially regular signal flow graph (SFG). Mapping the algorithm to the cascading delay feedback structure leads to the proposed architecture. Finally we conclude with a comparison of the hardware requirements of R2²SDF and several other popular pipeline architectures.

II. PIPELINE FFT PROCESSOR ARCHITECTURES

Before going into details of the new approach, it is beneficial to have a brief review of the various architectures for pipeline FFT processors.
To avoid being influenced by the sequence order, we assume that the real-time processing task only requires the input sequence to be in normal order, and the output is allowed to be in digit-reversed (radix-2 or radix-4) order, which is permissible in applications such as DFT-based communication systems [5]. We also stick to the Decimation-In-Frequency (DIF) type of decomposition throughout the discussion.

The architecture design for pipeline FFT processors had been the subject of intensive research as early as the 70's, when real-time processing was demanded in such applications as radar signal processing [6], well before VLSI technology had advanced to the level of system integration. Several architectures have been proposed over the last 2 decades since then, along with the increasing interest and the leap forward of the technology. Here the different approaches are put into functional blocks with unified terminology, where the additive butterfly has been separated from the multiplier to show the hardware requirement distinctively, as in Fig. 1. The control and twiddle factor reading mechanisms have also been omitted for clarity. All data and arithmetic operations are complex, and the constraint that $N$ is a power of 4 applies.

[Figure 1: Various schemes for pipeline FFT processors. Panels: (i) R2MDC (N = 16); (ii) R2SDF (N = 16); (iii) R4SDF (N = 256); (iv) R4MDC (N = 256); (v) R4SDC (N = 256).]

R2MDC: Radix-2 Multi-path Delay Commutator [6] was probably the most straightforward approach for pipeline implementation of the radix-2 FFT algorithm. The input sequence is broken into two parallel data streams flowing forward, with the correct "distance" between data elements entering the butterfly scheduled by proper delays. Both butterflies and multipliers are in 50% utilization. $\log_2 N - 2$ multipliers, $\log_2 N$ radix-2 butterflies and $3N/2 - 2$ registers (delay elements) are required.

R2SDF: Radix-2 Single-path Delay Feedback [7] uses the registers more efficiently by storing the butterfly output in feedback shift registers. A single data stream goes through the multiplier at every stage. It has the same number of butterfly units and multipliers as the R2MDC approach, but with a much reduced memory requirement: $N - 1$ registers. Its memory requirement is minimal.

R4SDF: Radix-4 Single-path Delay Feedback [8] was proposed as a radix-4 version of R2SDF, employing CORDIC iterations. The utilization of multipliers is increased to 75% due to the storage of 3 out of 4 radix-4 butterfly outputs. However, the utilization of the radix-4 butterfly, which is fairly complicated and contains at least 8 complex adders, drops to only 25%. It requires $\log_4 N - 1$ multipliers, $\log_4 N$ full radix-4 butterflies and storage of size $N - 1$.

R4MDC: Radix-4 Multi-path Delay Commutator [6] is a radix-4 version of R2MDC. It has been used as the architecture for the initial VLSI implementation of a pipeline FFT processor [3] and for massive wafer scale integration [9]. However, it suffers from low, 25%, utilization of all components, which can be compensated only in some special applications where four FFTs are being processed simultaneously. It requires $3 \log_4 N$ multipliers, $\log_4 N$ full radix-4 butterflies and $5N/2 - 4$ registers.

R4SDC: Radix-4 Single-path Delay Commutator [10] uses a modified radix-4 algorithm with programmable 1/4 radix-4 butterflies to achieve higher, 75%, utilization of multipliers. A combined Delay-Commutator also reduces the memory requirement to $2N - 2$ from the $5N/2 - 4$ of R4MDC. The butterfly and delay-commutator become relatively complicated due to the programmability requirement. R4SDC has been used recently in building the largest ever single-chip pipeline FFT processor for HDTV application [4].

A swift skimming through of the architectures listed above reveals the distinctive merits of the different approaches. First, the delay-feedback approaches are always more efficient than the corresponding delay-commutator approaches in terms of memory utilization, since the stored butterfly output can be directly used by the multipliers. Second, radix-4 algorithm based single-path architectures have higher multiplier utilization; however, radix-2 algorithm based architectures have simpler butterflies which are better utilized. The new approach developed in the following sections is highly motivated by these observations.
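To make the memory comparison in this review concrete, the register counts quoted above for the five reviewed architectures can be evaluated for a given transform length. The following is only a small sketch: the function name `register_counts` is hypothetical, and the formulas are simply those stated in the descriptions above.

```python
def register_counts(n):
    """Complex-word register counts for a length-n pipeline FFT.

    The formulas are the ones quoted in the architecture review above;
    the function name and dictionary layout are illustrative only.
    """
    # The review assumes that N is a power of 4.
    is_power_of_two = n > 0 and (n & (n - 1)) == 0
    if not is_power_of_two or (n.bit_length() - 1) % 2 != 0 or n < 4:
        raise ValueError("n must be a power of 4")
    return {
        "R2MDC": 3 * n // 2 - 2,   # two parallel delay-commutator paths
        "R2SDF": n - 1,            # feedback shift registers, minimal
        "R4SDF": n - 1,            # feedback shift registers, minimal
        "R4MDC": 5 * n // 2 - 4,   # four parallel delay-commutator paths
        "R4SDC": 2 * n - 2,        # combined delay-commutator
    }


if __name__ == "__main__":
    counts = register_counts(256)
    minimum = min(counts.values())
    for name, regs in counts.items():
        print(f"{name}: {regs} registers ({regs / minimum:.2f}x the minimum)")
```

For N = 256 the delay-feedback schemes need 255 registers, while the delay-commutator schemes need between 382 and 636, which is the first observation made above.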
III. RADIX-2² DIF FFT ALGORITHM

By the observations made in the last section, the most desirable hardware-oriented algorithm is one that has the same number of non-trivial multiplications at the same positions in the SFG as the radix-4 algorithms, but has the same butterfly structure as the radix-2 algorithms. Strictly speaking, algorithms with this feature are not completely new. An SFG with a complex "bias" factor had been obtained implicitly as the result of a constant-rotation/compensation procedure using restricted CORDIC (Coordinate Rotational Digital Computer) operations [11]. Another algorithm combining radix-4 and radix-"4+2" in DIT form has been used to decrease the scaling error in the R2MDC architecture, without altering the multiplier requirement [12]. A clear derivation of the algorithm in DIF form, with the perspective of reducing the hardware requirement in the context of pipeline FFT processors, is however yet to be developed.

To avoid confusion with the well-known radix-2/4 split-radix algorithm and the mixed radix-"4+2" algorithm, the notion of radix-2² algorithm is used to clearly reflect the structural relation with the radix-2 algorithm and the identical computational requirement with the radix-4 algorithm.

The DFT of size $N$ is defined by

$$X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{nk}, \qquad 0 \le k < N \tag{1}$$

where $W_N$ denotes the $N$th primitive root of unity, with its exponent evaluated modulo $N$. To make the derivation of the new algorithm clearer, consider the first 2 steps of decomposition in the radix-2 DIF FFT together. Applying a 3-dimensional linear index map,

$$n = \left\langle \tfrac{N}{2} n_1 + \tfrac{N}{4} n_2 + n_3 \right\rangle_N, \qquad k = \left\langle k_1 + 2k_2 + 4k_3 \right\rangle_N \tag{2}$$

the Common Factor Algorithm (CFA) has the form of

$$\begin{aligned}
X(k_1 + 2k_2 + 4k_3)
&= \sum_{n_3=0}^{N/4-1} \sum_{n_2=0}^{1} \sum_{n_1=0}^{1} x\!\left(\tfrac{N}{2} n_1 + \tfrac{N}{4} n_2 + n_3\right) W_N^{\left(\frac{N}{2} n_1 + \frac{N}{4} n_2 + n_3\right)(k_1 + 2k_2 + 4k_3)} \\
&= \sum_{n_3=0}^{N/4-1} \sum_{n_2=0}^{1} \left\{ B_{N/2}^{k_1}\!\left(\tfrac{N}{4} n_2 + n_3\right) W_N^{\left(\frac{N}{4} n_2 + n_3\right) k_1} \right\} W_N^{\left(\frac{N}{4} n_2 + n_3\right)(2k_2 + 4k_3)}
\end{aligned} \tag{3}$$

where the butterfly structure has the form of

$$B_{N/2}^{k_1}\!\left(\tfrac{N}{4} n_2 + n_3\right) = x\!\left(\tfrac{N}{4} n_2 + n_3\right) + (-1)^{k_1}\, x\!\left(\tfrac{N}{4} n_2 + n_3 + \tfrac{N}{2}\right)$$

If the expression within the braces of eqn. (3) is computed before further decomposition, an ordinary radix-2 DIF FFT results. The key idea of the new algorithm is to proceed with the second step of decomposition on the remaining DFT coefficients, including the "twiddle factor" $W_N^{\left(\frac{N}{4} n_2 + n_3\right) k_1}$, to exploit the exceptional values in multiplication before the next butterfly is constructed. Decomposing the composite twiddle factor, observe that

$$\begin{aligned}
W_N^{\left(\frac{N}{4} n_2 + n_3\right)(k_1 + 2k_2 + 4k_3)}
&= W_N^{N n_2 k_3}\, W_N^{\frac{N}{4} n_2 (k_1 + 2k_2)}\, W_N^{n_3 (k_1 + 2k_2)}\, W_N^{4 n_3 k_3} \\
&= (-j)^{n_2 (k_1 + 2k_2)}\, W_N^{n_3 (k_1 + 2k_2)}\, W_N^{4 n_3 k_3}
\end{aligned} \tag{4}$$

Substitute eqn. (4) in eqn. (3) and expand the summation over index $n_2$.
After simplification we have a set of 4 DFTs of length $N/4$,

$$X(k_1 + 2k_2 + 4k_3) = \sum_{n_3=0}^{N/4-1} \left[ H(k_1, k_2, n_3)\, W_N^{n_3 (k_1 + 2k_2)} \right] W_{N/4}^{n_3 k_3} \tag{5}$$

where $H(k_1, k_2, n_3)$ is expressed in eqn. (6).

[Figure 2: Butterfly with decomposed twiddle factors (N = 16 signal flow graph feeding four N/4-point DFTs, one for each pair (k1, k2)).]

Eqn. (6) represents the first two stages of butterflies with only trivial multiplications in the SFG, as BF I and BF II in Fig. 2. After these two stages, full multipliers are required to compute the product of the decomposed twiddle factor $W_N^{n_3 (k_1 + 2k_2)}$ in eqn. (5), as shown in Fig. 2. Note that the order of the twiddle factors is different from that of the radix-4 algorithm.

Applying this CFA procedure recursively to the remaining DFTs of length $N/4$ in eqn. (5), the complete radix-2² DIF FFT algorithm is obtained. An $N = 16$ example is shown in Fig. 3, where small diamonds represent trivial multiplication by $W_N^{N/4} = -j$, which involves only real-imaginary swapping and sign inversion.

The radix-2² algorithm has the feature that it has the same multiplicative complexity as the radix-4 algorithms, but still retains the radix-2 butterfly structures. The multiplicative operations are arranged such that only every other stage has non-trivial multiplications. This is a great structural advantage over other algorithms when a pipeline/cascade FFT architecture is under consideration.

IV. R2²SDF ARCHITECTURE

Mapping the radix-2² DIF FFT algorithm derived in the last section to the R2SDF architecture discussed in section II, a new architecture, the Radix-2² Single-path Delay Feedback (R2²SDF) approach, is obtained.
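As a numerical illustration of the decomposition in eqns. (5) and (6), the recursion can be sketched in software and checked against the defining sum of eqn. (1). This is only a behavioral sketch under stated assumptions, not the hardware architecture: the function names `radix22_dif_fft` and `direct_dft` are hypothetical, the output is written back in natural order for easy comparison, and floating-point arithmetic stands in for the fixed-point datapath.

```python
import cmath


def direct_dft(x):
    """Reference DFT evaluated directly from the defining sum, eqn. (1)."""
    n = len(x)
    return [
        sum(x[i] * cmath.exp(-2j * cmath.pi * i * k / n) for i in range(n))
        for k in range(n)
    ]


def radix22_dif_fft(x):
    """Radix-2^2 DIF FFT following eqns. (5) and (6).

    len(x) must be a power of 4. Returns X(k) in natural order.
    """
    n = len(x)
    if n == 1:
        return list(x)
    if n % 4 != 0:
        raise ValueError("length must be a power of 4")
    q = n // 4
    out = [0j] * n
    for k1 in (0, 1):
        for k2 in (0, 1):
            h = []
            for n3 in range(q):
                # First butterfly (BF I): inputs N/2 apart, no multiplication.
                a = x[n3] + (-1) ** k1 * x[n3 + 2 * q]
                b = x[n3 + q] + (-1) ** k1 * x[n3 + 3 * q]
                # Second butterfly (BF II): trivial factor (-j)^(k1 + 2*k2).
                h_val = a + (-1j) ** (k1 + 2 * k2) * b
                # The only non-trivial multiplication: W_N^(n3 * (k1 + 2*k2)).
                twiddle = cmath.exp(-2j * cmath.pi * n3 * (k1 + 2 * k2) / n)
                h.append(h_val * twiddle)
            # Remaining length-N/4 DFT over n3, handled recursively.
            sub = radix22_dif_fft(h)
            for k3 in range(q):
                out[k1 + 2 * k2 + 4 * k3] = sub[k3]
    return out


if __name__ == "__main__":
    data = [complex(i, -i % 3) for i in range(16)]
    error = max(abs(a - b) for a, b in zip(radix22_dif_fft(data), direct_dft(data)))
    print(f"max deviation from direct DFT: {error:.2e}")
```

In this sketch a full complex multiplication appears only once per pair of butterfly stages, which mirrors the property that only every other stage has non-trivial multiplications.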

$$H(k_1, k_2, n_3) = \overbrace{\underbrace{\left[ x(n_3) + (-1)^{k_1}\, x\!\left(n_3 + \tfrac{N}{2}\right) \right]}_{\text{BF I}} + (-j)^{(k_1 + 2k_2)} \underbrace{\left[ x\!\left(n_3 + \tfrac{N}{4}\right) + (-1)^{k_1}\, x\!\left(n_3 + \tfrac{3N}{4}\right) \right]}_{\text{BF I}}}^{\text{BF II}} \tag{6}$$

[Figure 3: Radix-2² DIF FFT flow graph for N = 16.]

Fig. 5 outlines an implementation of the R2²SDF architecture for $N = 256$; note the similarity of the data-path to R2SDF and the reduced number of multipliers. The implementation uses two types of butterflies: one is identical to that in R2SDF, the other also contains the logic to implement the trivial twiddle factor multiplication, as shown in Fig. 4-(i) and (ii) respectively. Due to the spatial regularity of the radix-2² algorithm, the synchronization control of the processor is very simple. A $(\log_2 N)$-bit binary counter serves two purposes: synchronization controller and address counter for twiddle factor reading in each stage.

With the help of the butterfly structures shown in Fig. 4, the scheduled operation of the R2²SDF processor in Fig. 5 is as follows. On the first $N/2$ cycles, the 2-to-1 multiplexors in the first butterfly module switch to position "0", and the butterfly is idle. The input data from the left is directed to the shift registers until they are filled. On the next $N/2$ cycles, the multiplexors turn to position "1", and the butterfly computes a 2-point DFT with the incoming data and the data stored in the shift registers:

$$\begin{aligned}
Z(n) &= x(n) + x(n + N/2) \\
Z(n + N/2) &= x(n) - x(n + N/2)
\end{aligned}, \qquad 0 \le n < N/2 \tag{7}$$

The butterfly output $Z(n)$ is sent to apply the twiddle factor, and $Z(n + N/2)$ is sent back to the shift registers to be "multiplied" in the next $N/2$ cycles, when the first half of the next frame of the time sequence is loaded in. The operation of the second butterfly is similar to that of the first one, except that the "distance" of the butterfly input sequence is just $N/4$ and the trivial twiddle factor multiplication is implemented by real-imaginary swapping with a commutator and controlled add/subtract operations, as in Fig. 4-(ii), which requires a two-bit control signal from the synchronizing counter.
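The schedule of the first butterfly described above can be mimicked with a small cycle-level model of one delay-feedback stage. This is a minimal sketch under stated assumptions: the class name `SdfStage` and its `clock` method are hypothetical, the model covers only the plain butterfly of eqn. (7) without the trivial or full twiddle multiplication, and the shift register is assumed to start filled with zeros.

```python
from collections import deque


class SdfStage:
    """Behavioral model of one single-path delay-feedback butterfly stage.

    `delay` is the feedback shift register length (N/2 for the first stage).
    Hypothetical illustration of eqn. (7); not the authors' VHDL model.
    """

    def __init__(self, delay):
        self.delay = delay
        self.shift_register = deque([0] * delay)
        self.cycle = 0  # position within the current frame of 2 * delay samples

    def clock(self, sample):
        """Advance one clock cycle: take one input sample, return one output."""
        stored = self.shift_register.popleft()
        if self.cycle < self.delay:
            # Butterfly idle: load the input into the shift register while
            # the differences stored during the previous frame stream out.
            output = stored
            self.shift_register.append(sample)
        else:
            # Butterfly active: the sum leaves the stage and the difference
            # is fed back into the shift register.
            output = stored + sample
            self.shift_register.append(stored - sample)
        self.cycle = (self.cycle + 1) % (2 * self.delay)
        return output


if __name__ == "__main__":
    stage = SdfStage(delay=4)
    frame = [1, 2, 3, 4, 5, 6, 7, 8]
    flush = [0, 0, 0, 0]  # stands in for the first half of the next frame
    print([stage.clock(v) for v in frame + flush])
```

With an 8-sample frame, the sums Z(n) appear during the second half of the frame and the differences Z(n + N/2) appear while the following frame is being loaded, so the stage accepts one sample per cycle without stalling.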
The data then goes through a full complex multiplier, working at 75% utilization, which accomplishes the result of the first level of radix-4 DFT word by word. Further processing repeats this pattern, with the distance of the input data decreasing by half at each consecutive butterfly stage. After $N - 1$ clock cycles, the complete DFT transform result streams out to the right, in bit-reversed order. The next frame of the transform can be computed without pausing, due to the pipelined processing of each stage.

[Figure 4: Butterfly structure for R2²SDF FFT processor. Panels: (i) BF2I; (ii) BF2II.]

In a practical implementation, pipeline registers should be inserted between each multiplier and butterfly stage to improve the performance. Shimming registers are also needed for the control signals to comply with the revised timing. The latency of the output is then increased to $N - 1 + 3(\log_4 N - 1)$ without affecting the throughput rate.

[Figure 5: R2²SDF pipeline FFT architecture for N = 256.]

V. CONCLUSION

In this paper, a hardware-oriented radix-2² algorithm is derived which has the radix-4 multiplicative complexity but retains the radix-2 butterfly structure in the SFG. Based on this algorithm, a new, efficient pipeline FFT architecture, the R2²SDF architecture, is put forward. The hardware requirement of the proposed architecture as compared with various approaches is shown in Table 1, where not only the number of complex multipliers, adders and memory size but also the control complexity are listed for comparison. For easy reading, the base-4 logarithm is used whenever applicable. The table shows that R2²SDF has reached the minimum requirement for both multipliers and storage, and is second only to R4SDC for adders. This makes it an ideal architecture for VLSI implementation of pipeline FFT processors.

Table 1: Hardware requirement comparison

            multiplier #      adder #      memory size   control
R2MDC       2(log_4 N - 1)    4 log_4 N    3N/2 - 2      simple
R2SDF       2(log_4 N - 1)    4 log_4 N    N - 1         simple
R4SDF       log_4 N - 1       8 log_4 N    N - 1         medium
R4MDC       3(log_4 N - 1)    8 log_4 N    5N/2 - 4      simple
R4SDC       log_4 N - 1       3 log_4 N    2N - 2        complex
R2²SDF      log_4 N - 1       4 log_4 N    N - 1         simple

The architecture has been modeled in the hardware description language VHDL with generic parameters for transform size and word-length, using fixed point arithmetic and a complex array multiplier implemented with distributed arithmetic. The validity and efficiency of the proposed architecture have been verified by extensive simulation.

REFERENCES

[1] C. D. Thompson. Fourier transform in VLSI. IEEE Trans. Comput., C-32(11):1047-1057, Nov. 1983.
[2] S. He and M. Torkelson. A new expandable 2D systolic array for DFT computation based on symbiosis of 1D arrays. In Proc. ICA3PP'95, pages 12-19, Brisbane, Australia, Apr. 1995.
[3] E. E. Swartzlander, W. K. W. Young, and S. J. Joseph. A radix 4 delay commutator for fast Fourier transform processor implementation. IEEE J. Solid-State Circuits, SC-19(5):702-709, Oct. 1984.
[4] E. Bidet, D. Castelain, C. Joanblanq, and P. Stenn. A fast single-chip implementation of 8192 complex point FFT. IEEE J. Solid-State Circuits, 30(3):300-305, Mar. 1995.
[5] M. Alard and R. Lassalle. Principles of modulation and channel coding for digital broadcasting for mobile receivers. EBU Review, (224):47-69, Aug. 1987.
[6] L. R. Rabiner and B. Gold. Theory and Application of Digital Signal Processing. Prentice-Hall, Inc., 1975.
[7] E. H. Wold and A. M. Despain. Pipeline and parallel-pipeline FFT processors for VLSI implementation. IEEE Trans. Comput., C-33(5):414-426, May 1984.
[8] A. M. Despain. Fourier transform computers using CORDIC iterations. IEEE Trans. Comput., C-23(10):993-1001, Oct. 1974.
[9] E. E. Swartzlander, V. K. Jain, and H. Hikawa. A radix 8 wafer scale FFT processor. J. VLSI Signal Processing, 4(2,3):165-176, May 1992.
[10] G. Bi and E. V. Jones. A pipelined FFT processor for word-sequential data. IEEE Trans. Acoust., Speech, Signal Processing, 37(12):1982-1985, Dec. 1989.
[11] A. M. Despain. Very fast Fourier transform algorithms hardware for implementation. IEEE Trans. Comput., C-28(5):333-341, May 1979.
[12] R. Storn. Radix-2 FFT-pipeline architecture with reduced noise-to-signal ratio. IEE Proc.-Vis. Image Signal Process., 141(2):81-86, Apr. 1994.