Discrete Fourier Transform Compiler: From Mathematical Representation to Efficient Hardware


Peter A. Milder, Franz Franchetti, James C. Hoe, and Markus Püschel
Electrical and Computer Engineering Department, Carnegie Mellon University, Pittsburgh, PA, U.S.A.
{pam, franzf, jhoe,

Abstract

A wide range of hardware implementations are possible for the discrete Fourier transform (DFT), offering different tradeoffs in throughput, latency and cost. The well-understood structure of DFT algorithms makes possible a fully automatic synthesis framework that can span the viable interesting design choices. In this paper, we present such a synthesis framework that starts from formal mathematical formulas of a general class of fast DFT algorithms and produces performance and cost efficient sequential hardware implementations, making design decisions and tradeoffs according to user specified high-level preferences. We present evaluations to demonstrate the variety of supported implementations and the cost/performance tradeoffs they allow.

I. INTRODUCTION

The discrete Fourier transform (DFT) is one of the ubiquitous building blocks in signal processing and other embedded processing applications. Its computation exhibits a high degree of regularity in structure, comprising recurring basic kernels. On one hand, the theories behind efficient hardware implementations have been studied extensively and are very well understood [1]. On the other, creating practical implementations remains challenging because it requires combined sophistication in the mathematics of transforms as well as in digital design. When a design calls for a discrete Fourier transform, designers most often resort to instantiating pre-designed library implementations. Ready-to-use DFT modules are in the repertoire of nearly every technology vendor library, whether ASIC or FPGA. These library modules are designed by specialists and generally attain optimum performance for the amount of resources they consume.
However optimized, these static library modules can fall far short of optimum in a given application context due to a mismatch in objectives. For example, a static library module would offer exactly the same level of performance regardless of how much surplus logic resources are available. To address this limitation, Nordin et al. [2] have made available a parameterized DFT module generator that allows control over the level of hardware parallelism such that the designer can make custom tradeoffs between the performance desired and the resources consumed in the generated module.

Given the well-understood regular structure of the DFT (and other linear DSP transforms in general), one should be able to fully capture the available design space in a synthesis system to fully automate the generation of high-quality hardware implementations. The parameterized generation engine in [2] is a step in the right direction. However, this technology is limited in that the engine is hard-coded for a specific DFT algorithm (Pease [3]) and only exploits one particular restructuring to derive the tradeoff in one specific dimension of the overall design space.

In this paper, we present a formula-to-hardware synthesis flow that accepts as input the mathematical representation of a general class of DFT transform algorithms and is capable of producing a wide range of correct hardware implementations (in synthesizable RTL Verilog), including latency-efficient iterative microarchitectures and throughput-efficient streaming microarchitectures. The input representation is based on a sparse-matrix formula language. Starting from pure mathematical formulas, the synthesis process, comprising a set of formal formula-level rewrite rules, makes hardware implementation decisions and tradeoffs according to user specified high-level preferences. The outcome is a set of fully annotated formulas that can be straightforwardly reduced to their corresponding sequential hardware implementations.
The wealth of initial DFT formula choices and the rich combinations of structural rewrite rules together yield the large space of implementations attainable by this DFT synthesis framework.

Paper outline. Section II introduces the DFT transform and the formula language and sketches a rudimentary synthesis algorithm for combinational implementations. Section III presents the flow from formula to hardware synthesis in two parts: first, from formula to annotated hardware formula; second, from hardware formula to sequential datapath. Section IV reviews the synthesis flow by demonstrating two working examples. Section V evaluates a wide range of DFT design instances produced by our framework. Section VI discusses prior work in hardware DFT implementations. Finally, Section VII offers our discussion and conclusions.

II. BACKGROUND

Discrete Fourier transform. The discrete Fourier transform (DFT) of size $n$ is the matrix-vector multiplication

$$y = \mathrm{DFT}_n\, x,$$

where $x$ and $y$ are the input and output vectors of length $n$, and

$$\mathrm{DFT}_n = [\omega_n^{kl}]_{0 \le k,l < n}, \qquad \omega_n = \exp(-2\pi i/n), \qquad i = \sqrt{-1}.$$

In this paper we only consider two-power sizes $n$.

Fast Fourier transforms. Computing the DFT of $x$ by matrix-vector multiply requires $O(n^2)$ operations. However, this can be reduced to $O(n \log(n))$ using well-known fast algorithms (fast Fourier transforms or FFTs). An FFT can be viewed as a factorization of $\mathrm{DFT}_n$ into a product of sparse matrices. For example (omitted entries are zero):

$$\mathrm{DFT}_4 =
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -i & -1 & i \\ 1 & -1 & 1 & -1 \\ 1 & i & -1 & -i \end{bmatrix}
=
\begin{bmatrix} 1 & & 1 & \\ & 1 & & 1 \\ 1 & & -1 & \\ & 1 & & -1 \end{bmatrix}
\begin{bmatrix} 1 & & & \\ & 1 & & \\ & & 1 & \\ & & & -i \end{bmatrix}
\begin{bmatrix} 1 & 1 & & \\ 1 & -1 & & \\ & & 1 & 1 \\ & & 1 & -1 \end{bmatrix}
\begin{bmatrix} 1 & & & \\ & & 1 & \\ & 1 & & \\ & & & 1 \end{bmatrix}$$

Computing $\mathrm{DFT}_4\, x$ by multiplying the input $x$ from right to left with the four sparse matrices has a lower arithmetic cost than multiplying by the dense transform matrix. Each of the occurring sparse matrices has structure, which can be used to express the algorithm using the Kronecker or tensor product formalism [1]. For example, the factorization above becomes

$$\mathrm{DFT}_4 = (\mathrm{DFT}_2 \otimes I_2)\, D_4\, (I_2 \otimes \mathrm{DFT}_2)\, L_{4,2}. \qquad (1)$$

Here $D_4 = \operatorname{diag}(1, 1, 1, -i)$ is a diagonal matrix (called twiddle matrix); $I_n$ is the $n \times n$ identity; and $L_{n,m}$ (for $m$ divides $n$) is the stride permutation, which can be viewed as transposing an $n/m \times m$ matrix stored in a vector in row-major order, formally:

$$i\,(n/m) + j \mapsto j\,m + i, \qquad 0 \le i < m,\ 0 \le j < n/m.$$

Finally, $\otimes$ is the Kronecker or tensor product defined as

$$A_n \otimes B_m = [a_{k,l} B_m]_{0 \le k,l < n}, \qquad \text{for } A = [a_{k,l}]_{0 \le k,l < n}.$$

Generally, we write $A_n$ to denote that $A$ is $n \times n$.

Formula language for FFTs. The above formalism can be captured in a formal language that can be used to represent FFTs using formulas. In Backus-Naur form, the language is defined as follows (non-terminals are bold-faced):

$$\begin{aligned}
\mathbf{formula}_n ::=\ & \mathbf{formula}_n \cdot \mathbf{formula}_n \\
\mid\ & I_k \otimes \mathbf{formula}_m \quad \text{where } n = km \\
\mid\ & \mathbf{formula}_m \otimes I_k \quad \text{where } n = km \\
\mid\ & \mathbf{base}_n \\
\mathbf{base}_n ::=\ & D_n = \operatorname{diag}(d_0, \dots, d_{n-1}) \ \mid\ L_{n,m} \ \mid\ I_n \ \mid\ \mathrm{DFT}_2
\end{aligned}$$

This language is a subset of the signal processing language (SPL) used in Spiral, a program generator for linear transforms [4]. We will also refer to it as SPL in this paper. Even though the language is small, a large class of different FFTs can be expressed with it.
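
The factorization (1) can be checked numerically. The following is a minimal sketch in plain Python, not part of the synthesis flow; the helper names (`kron`, `stride_perm`, `matmul`, and so on) are hypothetical. It assumes the stride permutation is read in its transposition sense, so that output position $i(n/m)+j$ takes input element $jm+i$, and it uses $\omega_n = \exp(-2\pi i/n)$.

```python
import cmath

def dft_matrix(n):
    """DFT_n = [w^(k*l)] with w = exp(-2*pi*i/n)."""
    w = cmath.exp(-2j * cmath.pi / n)
    return [[w ** (k * l) for l in range(n)] for k in range(n)]

def identity(n):
    return [[1 if r == c else 0 for c in range(n)] for r in range(n)]

def diag(values):
    n = len(values)
    return [[values[r] if r == c else 0 for c in range(n)] for r in range(n)]

def kron(a, b):
    """Kronecker product A (x) B = [a[k][l] * B]."""
    nb = len(b)
    size = len(a) * nb
    return [[a[r // nb][c // nb] * b[r % nb][c % nb] for c in range(size)]
            for r in range(size)]

def stride_perm(n, m):
    """L_{n,m} as a 0/1 matrix: output i*(n/m)+j takes input j*m+i (assumed reading)."""
    p = [[0] * n for _ in range(n)]
    for i in range(m):
        for j in range(n // m):
            p[i * (n // m) + j][j * m + i] = 1
    return p

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]

def close(a, b, tol=1e-9):
    """Entrywise comparison that tolerates floating-point rounding."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Right-hand side of (1): (DFT_2 (x) I_2) D_4 (I_2 (x) DFT_2) L_{4,2}
dft2, i2 = dft_matrix(2), identity(2)
d4 = diag([1, 1, 1, -1j])
rhs = matmul(matmul(matmul(kron(dft2, i2), d4), kron(i2, dft2)), stride_perm(4, 2))
print(close(rhs, dft_matrix(4)))
```

Dense matrices are used here only because the example is tiny; the point of the formula language is that the factors never need to be multiplied out.
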
We provide a few examples:

$$\mathrm{DFT}_{nm} = (\mathrm{DFT}_n \otimes I_m)\, D_{n,m}\, (I_n \otimes \mathrm{DFT}_m)\, L_{nm,n} \qquad (2)$$

$$\mathrm{DFT}_{r^t} = L_{r^t,r} \left( \prod_{k=0}^{t-2} (I_{r^{t-1}} \otimes \mathrm{DFT}_r)\, D_{n,k}\, (I_{r^k} \otimes L_{r^{t-k},\,r^{t-k-1}})\,(I_{r^{k+1}} \otimes L_{r^{t-k-1},\,r}) \right) (I_{r^{t-1}} \otimes \mathrm{DFT}_r)\, R_{r^t,r} \qquad (3)$$

$$\mathrm{DFT}_{r^t} = \left( \prod_{k=0}^{t-1} L_{r^t,r}\, (I_{r^{t-1}} \otimes \mathrm{DFT}_r)\, D_{n,k} \right) R_{r^t,r} \qquad (4)$$

In (3) and (4), $n = r^t$. Equation (2) is the well-known recursive Cooley-Tukey FFT and the standard choice for software implementations. The others are iterative and suitable for hardware implementations. Equation (3) is called the iterative FFT, and (4) is the Pease FFT, which has perfect regularity across stages except for the diagonal matrices $D_{n,k}$, which depend on $k$. Both FFTs are given for an arbitrary radix $r$, where the radix indicates the size of the algorithm's basic block. Lastly, $R_{r^t,r}$ is the radix-$r$ reversal permutation, known as the bit reversal when $r = 2$.

Formula to combinational datapath. There exists a natural one-to-one correspondence between an SPL formula and a combinational logic implementation. We demonstrate it for the different formula constructs $M$ supported by the grammar:

- $M_n = A_n B_n$: The input vector $x$ first passes through a combinational module corresponding to $B_n$, then another module corresponding to $A_n$ (Fig. 1(a)).
- $M_{mn} = I_m \otimes A_n$: The resulting matrix is a block diagonal matrix that contains zero everywhere except $A_n$ repeated $m$ times on the diagonal. In combinational logic, the vector $x$ passes through $m$ parallel copies of $A_n$ such that each copy operates on a flit of $n$ consecutive elements from $x$ (Fig. 1(b)).
- $M_{mn} = A_n \otimes I_m$: We can first rewrite this matrix as $L_{mn,n}\,(I_m \otimes A_n)\,L_{mn,m}$ (a known identity [1]) and handle it as the product of three matrices (Fig. 1(d)).
- $M_n = L_{n,m}$: The corresponding reordering of the vector $x$ is achieved by reshuffling the busses that carry the elements of $x$ (Fig. 1(c)).
- $M_n$ is diagonal: Each element of $x$ is multiplied by a corresponding constant (Fig. 1(e)).
- $M = \mathrm{DFT}_2$: This incurs the computations $y_0 = x_0 + x_1$ and $y_1 = x_0 - x_1$ and yields the so-called butterfly structure in Fig. 1(f).

Fig. 1 shows that it would be a straightforward task to generate combinational logic for any formula in the SPL language.
For example, for the formula in (1), this compiler would produce the datapath shown in Fig. 1(g). For general $n$, the FFTs (3) and (4) would yield combinational logic that is $O(\log(n))$ in depth and $O(n \log(n))$ in size. Such combinational implementations are too expensive except for small problem sizes.
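
The correspondence between formula constructs and datapath blocks can be mimicked in software by turning each construct into a function on a vector and each product into function composition, applied right to left. The sketch below does this for the radix-2 Pease FFT (4). The text does not spell out the diagonals $D_{n,k}$, so `pease_twiddles` is an assumption: it uses one consistent choice obtained by conjugating the Cooley-Tukey twiddles with powers of $L_{n,2}$, with stage $k = 0$ taken as the leftmost factor. All function names are hypothetical.

```python
import cmath

def dft_naive(x):
    """Reference O(n^2) DFT with w = exp(-2*pi*i/n)."""
    n = len(x)
    w = cmath.exp(-2j * cmath.pi / n)
    return [sum(x[l] * w ** (k * l) for l in range(n)) for k in range(n)]

def butterflies(x):
    """I_{n/2} (x) DFT_2: independent butterflies on consecutive pairs."""
    y = []
    for p in range(0, len(x), 2):
        y += [x[p] + x[p + 1], x[p] - x[p + 1]]
    return y

def stride2(x):
    """L_{n,2}: pure rewiring, even-indexed elements first, then odd-indexed."""
    return x[0::2] + x[1::2]

def bit_reverse(x):
    """R_{n,2}: output a takes the input whose index is a with its bits reversed."""
    t = len(x).bit_length() - 1
    return [x[int(format(a, "0{}b".format(t))[::-1], 2)] for a in range(len(x))]

def pease_twiddles(n, k):
    """Assumed stage-k diagonal: entry a is w_n ** (((a & 1) * (a >> (k + 1))) << k)."""
    w = cmath.exp(-2j * cmath.pi / n)
    return [w ** (((a & 1) * (a >> (k + 1))) << k) for a in range(n)]

def pease_fft(x):
    """Evaluate (4) for r = 2 on a vector whose length is a power of two (>= 2)."""
    n = len(x)
    t = n.bit_length() - 1
    y = bit_reverse(x)                       # rightmost factor R acts first
    for k in reversed(range(t)):             # then stages k = t-1, ..., 0
        d = pease_twiddles(n, k)
        y = [dv * v for dv, v in zip(d, y)]  # diagonal D_{n,k}
        y = stride2(butterflies(y))          # L_{n,2} (I_{n/2} (x) DFT_2)
    return y

x = [complex(i + 1, -i) for i in range(16)]
err = max(abs(a - b) for a, b in zip(pease_fft(x), dft_naive(x)))
print(err < 1e-9)
```

Every stage applies the same `stride2(butterflies(...))` block and only the diagonal changes with `k`, which is the regularity that the text attributes to the Pease FFT.
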

Fig. 1. Examples of formulas and associated combinational datapaths: (a) $A_n B_n$; (b) $I_2 \otimes A_2$; (c) $L_{4,2}$; (d) $A_2 \otimes I_2$; (e) $D_4$; (f) $\mathrm{DFT}_2$; (g) $\mathrm{DFT}_4$ (from (1)).

Fig. 2. Block diagram of design flow (DSP transform, formula generation, formula, formula annotation, hardware formula, RTL generation, RTL netlist). The dashed block contains the focus of this paper.

Fig. 3. Examples of streaming reuse: (a) No streaming reuse: $I_m \otimes A_n$. (b) Full streaming reuse: $I_m \otimes_{sr} A_n$, taking $m$ cycles. (c) Partial streaming reuse: $I_{mn/w} \otimes_{sr} (I_{w/n} \otimes A_n)$, with $w/n$ blocks taking $mn/w$ cycles.

III. FROM FORMULA TO HARDWARE

Our goal is to automatically generate various sequential implementations of the DFT. Our formula language, as explained in Section II, has no singular correspondence to sequential hardware. In this section, we explain how we extend this language to express sequential hardware elements needed for efficient implementations. Then, we introduce a rewriting system which takes a formula with hardware directives and produces a hardware description formula. Lastly, we discuss the process of compiling a hardware description formula to a synthesizable RTL netlist. This flow is illustrated by the diagram in Figure 2.

A. Hardware Signal Processing Language

The datapaths associated with various DFT algorithms exhibit a high degree of regularity. This regular structure gives an opportunity to reuse portions of the datapath in two ways. In this section, we examine both types of reuse and the language extensions needed to support them.

Streaming reuse. As seen in Section II, the tensor product $I_m \otimes A_n$ leads to a datapath with $m$ data-parallel instances of the block associated with $A_n$ (Fig. 3(a)). We also can interpret the tensor product as an indicator of parallelism in time in a streaming fashion. Rather than having block $A_n$ repeated $m$ times in parallel, we can build one physical instance of it, and reuse it over $m$ consecutive clock cycles (Figure 3(b)).
We call this streaming reuse. In order to distinguish between these two meanings of the tensor product, we can tag the symbol as $\otimes_{sr}$ in order to indicate streaming reuse. Additionally, it is possible to have partial streaming reuse. For example, $(I_4 \otimes A_n)$ can be broken down as $(I_2 \otimes_{sr} (I_2 \otimes A_n))$, meaning that there are two blocks in parallel, and each operates on data over two clock cycles. A generalized version of this situation can be seen in Figure 3(c). We use $w$ to indicate the stream width.

Horizontal reuse. In the previous section, we saw how data parallel blocks could be vertically collapsed into one block. Additionally, a series of identical blocks (such as $\prod A_n$) can be horizontally collapsed into one block, as seen in Figure 4. We call this horizontal reuse. (Notice that $A_n$ could be streamed as well.) In order to distinguish between the two meanings of $\prod$, we tag the product term as $\prod_{hr}$ in order to indicate horizontal reuse. It is important to note that the terms in a horizontal reuse product term cannot change from iteration to iteration, except in the case of diagonal matrices. For example, (3) would not be eligible, but (4) would. This is explained in detail in Section III-C.

Fig. 4. Example of horizontal reuse: (a) No horizontal reuse: $\prod A_n$. (b) Full horizontal reuse: $\prod_{hr} A_n$.

B. Rewriting System: From Transform to Hardware Formula

In this section, we describe a rewriting system that takes a formula plus hardware directives and produces a hardware formula, which is restructured and annotated such that it directly corresponds to a sequential hardware implementation. This process corresponds to the formula annotation segment of Figure 2.

A hardware directive is a tag that indicates a desired feature of the final hardware implementation. In order to indicate streaming reuse, we define a streaming tag:

$$\underbrace{A_n}_{\text{stream}(w)}$$

This tag indicates that the contents of $A_n$ should be restructured such that the resulting hardware formula will be implemented in a block that contains $w$ input and output ports, with data streamed at $w$ elements per cycle. Figure 5 lists the rewriting rules that perform this transformation. Each time the system encounters a stream tagged formula, it attempts to restructure the formula or propagate the tag downward. If a tagged formula does not match any of these rules, the tag becomes part of the hardware formula. In these cases, the compiler (discussed in the following section) must explicitly know how to build a data structure for the tagged formula.

$$\begin{array}{lll}
\text{name} & \text{rule} & \text{condition} \\
\hline
\text{base} & \underbrace{A_k}_{\text{stream}(w)} \to A_k & \text{if } k = w \\
\text{product-select} & \underbrace{\textstyle\prod A_n}_{\text{stream}(w)} \to \underbrace{\textstyle\prod A_n}_{\text{stream}(w)} \ \text{or}\ \underbrace{\textstyle\prod_{hr} A_n}_{\text{stream}(w)} & \text{if streaming, or if hor. reuse} \\
\text{product} & \underbrace{A_n B_n \cdots Z_n}_{\text{stream}(w)} \to \underbrace{A_n}_{\text{stream}(w)} \underbrace{B_n}_{\text{stream}(w)} \cdots \underbrace{Z_n}_{\text{stream}(w)} & \\
\text{product-hr} & \underbrace{\textstyle\prod_{hr} A_n}_{\text{stream}(w)} \to \textstyle\prod_{hr} \underbrace{A_n}_{\text{stream}(w)} & \\
\text{reuse} & \underbrace{I_l \otimes A_k}_{\text{stream}(w)} \to I_{lk/w} \otimes_{sr} (I_{w/k} \otimes A_k) & \text{if } lk > w \text{ and } k \le w \\
\text{reuse2} & \underbrace{I_l \otimes A_k}_{\text{stream}(w)} \to I_l \otimes_{sr} \underbrace{A_k}_{\text{stream}(w)} & \text{if } k > w \\
\text{reverse} & A_k \otimes I_l \to L_{kl,k}\,(I_l \otimes A_k)\,L_{kl,l} &
\end{array}$$

Fig. 5. Rewriting rules for generating hardware formulas.

Each of the rewrite rules given in Figure 5 has a simple explanation:

- base: If the size of a matrix is the same as its stream size, the stream tag is not necessary and can be dropped.
- product-select: This rule selects whether to do horizontal reuse or streaming.
- product and product-hr: If a group of matrices is tagged as streaming, the tag is propagated inward to all of the individual matrices. This rule applies to both versions of the product term.
- reuse: If the size of $A$ is less than or equal to the size of the stream, the inner tensor product unrolls the correct number of $A$ instances such that the inner product is exactly the stream size.
- reuse2: If the size of $A$ is larger than the size of the stream, the tag is propagated inward, and another rule must restructure $A$ to the right stream size.
- reverse: A property of the tensor product allows us to reverse its order with strided access.

After rewriting, the resulting hardware formula may be made of the following blocks: formulas (without tags), streaming-reuse tensor products, horizontal-reuse product terms, streamed diagonals, and streamed permutations (i.e., stride and bit reversal permutations):

$$A_n, \quad I_l \otimes_{sr} A_n, \quad \textstyle\prod_{hr} A_n, \quad D_n, \quad L_{n,m}, \quad R_{n,r}.$$

In the following section, we will discuss how the hardware formula, made of these five types of objects, is built into a Verilog description.

C. Compiler: From Hardware Formula to HDL

The compiler takes in a hardware formula, as defined above, and produces a synthesizable Verilog description of a circuit. In this section, we explain how each of the possible forms of the hardware formula is mapped.
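
As an illustration of the base, reuse and reuse2 rules, the following sketch rewrites a stream-tagged $I_l \otimes A_k$ and then simulates the resulting structure cycle by cycle. It is a simplified model under stated assumptions, not the rewriting system itself: `rewrite_tensor` and `stream_reuse` are hypothetical names, formulas are represented as plain strings, and $l$, $k$ and $w$ are assumed to be powers of two.

```python
def rewrite_tensor(l, k, w):
    """Rewrite I_l (x) A_k carrying a stream(w) tag; returns the hardware formula as text."""
    if (l * k) % w:
        raise ValueError("stream width w must divide the vector length l*k")
    if l * k == w:      # base: formula size equals stream size, so the tag is dropped
        return "I_{} (x) A_{}".format(l, k)
    if k <= w:          # reuse: w/k parallel copies, reused over lk/w cycles
        return "I_{} (x)sr (I_{} (x) A_{})".format(l * k // w, w // k, k)
    # reuse2: A_k is wider than the stream, so the tag is pushed inward
    return "I_{} (x)sr stream({})[A_{}]".format(l, w, k)

def stream_reuse(apply_a, k, x, w):
    """Simulate I_{lk/w} (x)sr (I_{w/k} (x) A_k): one w-wide block reused every cycle."""
    assert w % k == 0 and len(x) % w == 0
    out, cycles = [], 0
    for start in range(0, len(x), w):     # one loop iteration models one clock cycle
        word = x[start:start + w]         # the w elements arriving in this cycle
        for c in range(0, w, k):          # the w/k parallel copies of A_k
            out += apply_a(word[c:c + k])
        cycles += 1
    return out, cycles

butterfly = lambda v: [v[0] + v[1], v[0] - v[1]]    # A_2 = DFT_2
print(rewrite_tensor(4, 2, 4))
print(stream_reuse(butterfly, 2, list(range(8)), 4))
```

With $k = 2$ and $w = 4$ the rewrite reproduces the partial streaming reuse example of Section III-A: two blocks in parallel, each used over two clock cycles. The simulated output is the same vector a fully parallel $I_4 \otimes A_2$ would produce; only the number of cycles and the amount of hardware change.
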

Fig. 6. Structure for permuting streamed vector elements: $w$ ports, an input interconnection network, dual-port RAMs, and an output interconnection network, with control and address generation logic.

Combinational formula. Any portion of the formula without a reuse tag can automatically be mapped into a combinational datapath, as discussed in Section II. When the compiler encounters this type of formula, it constructs a hardware datapath and automatically pipelines the path by inserting staging registers in the appropriate locations.

Specific streaming elements. We implement two elements that are built directly from a stream-tagged matrix. The compiler has specific knowledge of how to generate these blocks:

- Streaming diagonal: A diagonal matrix scales each element in the input vector by the corresponding value from the diagonal of the matrix. In order to convert this into a streaming hardware structure, we first need $w$ multipliers, where $w$ is the stream width. Then, we store the values from the diagonal in $w$ tables, which feed the multipliers with the appropriate data at each cycle.
- Streaming permutations: A streaming permutation implementation must reorder data in space and across different clock cycles. Püschel et al. [5] prescribe an architecture and an algorithm whereby an arbitrary permutation can be constructed for an arbitrary streaming width $w$ (where both the vector length $n$ and stream width $w$ are 2-powers). The construction, sketched in Figure 6, uses $w$ dual-ported memory banks and configurable switching networks at the input and output stages. For the relevant permutations, the controls for the permutation block can be computed cheaply from the flit number using only bitwise operators. For example, Figure 7 shows an implementation of $L_{256,2}$ with streaming width $w = 4$. Because this method works with a general class of permutations, it is able to implement products of permutations (e.g., $L_{8,4}\,(I_2 \otimes L_{4,2})$) as one self-contained permutation module.
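
One reason bitwise operators suffice for stride permutations of two-power size is that, on index bits, a stride permutation is a rotation. The sketch below illustrates only this index arithmetic; it does not model the RAM banks or switching networks of [5]. The names `rotl`, `stride_direct` and `combined_source` are hypothetical, and the permutation is read in the transposition sense (output position $i(n/m)+j$ takes input element $jm+i$).

```python
def rotl(a, bits, s):
    """Rotate the low `bits` bits of index a left by s positions."""
    mask = (1 << bits) - 1
    return ((a << s) | (a >> (bits - s))) & mask

def stride_direct(x, m):
    """L_{n,m} from its definition: output i*(n/m)+j takes input j*m+i."""
    n = len(x)
    return [x[j * m + i] for i in range(m) for j in range(n // m)]

def combined_source(a):
    """Source index for L_{8,4} (I_2 (x) L_{4,2}) handled as one permutation."""
    p = rotl(a, 3, 2)                       # L_{8,4}: rotate all 3 index bits by 2
    return (p & 4) | rotl(p & 3, 2, 1)      # I_2 (x) L_{4,2}: rotate the low 2 bits

# L_{256,2}: the source index is the 8-bit output index rotated left by one bit.
x = list(range(256))
by_bits = [x[rotl(a, 8, 1)] for a in range(256)]
print(by_bits == stride_direct(x, 2))

# The product of two permutations collapses into a single index function.
y = list(range(8))
print([y[combined_source(a)] for a in range(8)])
```

Because the composed map is again a fixed function of the index, a product of permutations needs no more hardware than a single permutation, which matches the remark above about one self-contained permutation module.
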
Finally, the compiler maps formulas tagged for horizontal or streaming reuse to the appropriate structural hardware construct:

- Streaming reuse: A streaming reuse tensor product $I_m \otimes_{sr} A_n$ is implemented in hardware as one $A_n$ block intended for an input vector in a streamed format (as seen in Fig. 3(b)).
- Horizontal reuse: As shown in Fig. 4(b), a horizontal reuse structure is built with an input multiplexer and feedback loop. When the block contains $D_k$, a diagonal matrix that changes with the iteration, the table must grow to accommodate all values. If a data vector iterates $l$ times over this datapath, the diagonal table must grow by a factor of $l$.

Fig. 7. Example of RAM-based permutation: $L_{256,2}$ with 4 ports (streaming vector of $w = 4$ words, four dual-port RAM banks with write addresses wa and read addresses ra).

IV. EXAMPLES: FROM FORMULA TO DATAPATH

We demonstrate the automated synthesis flow for two different DFT formulas.

Streaming reuse. The iterative FFT, given in Equation (3), produces a streaming reuse structure. For size 8 and radix 2, this formula simplifies to:¹

$$\mathrm{DFT}'_{2^3} = L_{8,2}\,(I_4 \otimes \mathrm{DFT}_2)\, D_{8,0}\, L_{8,4}\,(I_2 \otimes L_{4,2})\,(I_4 \otimes \mathrm{DFT}_2)\, D_{8,1}\,(I_2 \otimes L_{4,2})\,(I_4 \otimes \mathrm{DFT}_2).$$

¹ We implement the DFT with the digit reversal permutation $R$ omitted. This is a common interface option in hardware DFT implementations. We indicate this in the formula by marking the transform symbol, written here as $\mathrm{DFT}'$.

The entire formula is then tagged as streaming with a stream size of 2:

$$\mathrm{DFT}'_{2^3} = \underbrace{L_{8,2}\,(I_4 \otimes \mathrm{DFT}_2)\, D_{8,0}\, L_{8,4}\,(I_2 \otimes L_{4,2})\,(I_4 \otimes \mathrm{DFT}_2)\, D_{8,1}\,(I_2 \otimes L_{4,2})\,(I_4 \otimes \mathrm{DFT}_2)}_{\text{stream}(2)}.$$

Then, the rewrite system described in Section III-B changes the formula to a hardware description formula. For this example, the rewrite rules produce the following hardware formula:

$$\mathrm{DFT}'_{2^3} = \underbrace{L_{8,2}}_{\text{stream}(2)} (I_4 \otimes_{sr} \mathrm{DFT}_2) \underbrace{D_{8,0}}_{\text{stream}(2)} \underbrace{L_{8,4}\,(I_2 \otimes L_{4,2})}_{\text{stream}(2)} (I_4 \otimes_{sr} \mathrm{DFT}_2) \underbrace{D_{8,1}}_{\text{stream}(2)} \Big(I_2 \otimes_{sr} \underbrace{L_{4,2}}_{\text{stream}(2)}\Big) (I_4 \otimes_{sr} \mathrm{DFT}_2). \qquad (5)$$

Each term in this equation is directly translated to a hardware datapath according to Section III-C (from the right of the formula to the left), producing the datapath seen in Fig. 8.

Fig. 8. Datapath implementation of streaming $\mathrm{DFT}_8$ with stream size of 2 (butterfly blocks separated by the permutation blocks $L_{4,2}$, $L_{8,4}\,(I_2 \otimes L_{4,2})$, and $L_{8,2}$).

Streaming and horizontal reuse. The Pease FFT algorithm [3] given in Equation (4) produces an architecture with both horizontal and streaming reuse. For size 16 and radix 4, this formula simplifies to:

$$\mathrm{DFT}'_{4^2} = \prod_{k=0}^{1} L_{16,4}\,(I_4 \otimes \mathrm{DFT}_4)\, D_{16,k}.$$

Next, this formula is tagged with a stream size. Additionally, the product can be tagged for horizontal reuse, because the formula inside it only contains the iterator $k$ in the diagonal matrix. The formula is then converted to a hardware formula, as in Section III-B. If the stream size is set to 4, the following hardware formula is obtained:

$$\mathrm{DFT}'_{4^2} = \prod_{k=0}^{1}{}_{hr}\ \underbrace{L_{16,4}}_{\text{stream}(4)} (I_4 \otimes_{sr} \mathrm{DFT}_4) \underbrace{D_{16,k}}_{\text{stream}(4)}. \qquad (6)$$

Each term of this formula is translated directly to an RTL netlist (reading the formula from right to left), and the resulting datapath is shown in Fig. 9. However, this datapath includes two optimizations that are performed in the compiler:

The first optimization reduces the amount of arithmetic hardware that is built. Due to the structure of the diagonal matrix in the Pease algorithm, one out of every $r$ multipliers will always access a value of 1. By labelling these values as trivial constants and giving the system some additional arithmetic simplification rules, the tool is able to determine that these multipliers will always multiply by 1 and thus remove them. This reduces the number of multipliers by 1 out of every $r$.

The second optimization allows the amount of table data to be reduced by a factor of $\log_r(n)$. A horizontal reuse block with a diagonal matrix $D_{n,k}$ that changes with each iteration requires $n$ elements to be stored for each of the $\log_r(n)$ iterations, leading to a storage requirement of $n \log_r(n)$ words. However, the diagonals in the Pease formula possess a special property: the set of all values of $D_{n,l}$ (the diagonal values at iteration $l$) is a subset of the values of $D_{n,0}$ for $l > 0$. This means that with the right access function, all $n \log_r(n)$ data words can be obtained from a table of $n$ words. When a stream tag is applied to the Pease diagonal matrix, our system recognizes this property.
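
Both properties of the Pease diagonals can be illustrated for radix 2 with exact integer arithmetic. The sketch below is an assumption-laden model, not the compiler's implementation: the text does not define $D_{n,k}$, so `pease_exponent` uses one consistent choice in which entry $a$ of stage $k$ is $\omega_n$ raised to `((a & 1) * (a >> (k + 1))) << k`, and `table_address` is one possible access function under that indexing. Both names are hypothetical.

```python
def pease_exponent(a, k):
    """Assumed radix-2 diagonal: entry a of stage k is w_n ** pease_exponent(a, k)."""
    return ((a & 1) * (a >> (k + 1))) << k

def table_address(a, k):
    """Address into the stage-0 table for entry a of stage k: clear index bits 1..k."""
    return a & ~((1 << (k + 1)) - 2)

def check(n):
    """Return (words stored, words needed without sharing) for a two-power n."""
    t = n.bit_length() - 1
    table = [pease_exponent(a, 0) for a in range(n)]    # only stage 0 is stored
    for k in range(t):
        stage = [pease_exponent(a, k) for a in range(n)]
        # every stage can be read out of the stage-0 table with a single AND
        assert stage == [table[table_address(a, k)] for a in range(n)]
        # even-indexed entries have exponent 0, i.e. they multiply by 1
        assert all(stage[a] == 0 for a in range(0, n, 2))
    return len(table), n * t

print(check(16))
```

Under these assumptions the even-indexed entries of every stage are 1, which corresponds to the one-in-$r$ trivial multipliers for $r = 2$, and a table of $n$ words serves all $\log_2(n)$ iterations.
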
Then, it applies the correct access function to the representation and stores only the $n$ data words corresponding to $D_{n,0}$. The new access function is very simple, consisting of bit-shifts and bitwise ANDs of indices. So, the Pease storage requirement is reduced by a factor of $\log_r(n)$.

Fig. 9. Example of Pease $\mathrm{DFT}_{16}$ with $w = 4$.

V. EVALUATION

When coupled with a formula generator like Spiral [4], the formula synthesis flow described in this paper enables a large number of DFT designs to be explored quickly in a turnkey fashion. This section evaluates the supported range of implementations and the different cost/performance tradeoffs they provide. As we offer designs over a wide range of tradeoffs between performance and cost, our evaluations include a comparison to the Xilinx LogiCore FFT implementation [6] to establish that the tradeoffs have a sound basis. Specifically, we select as reference three LogiCore FFT implementations: the radix-4 burst I/O implementation, the radix-2 minimal size implementation, and the pipelined streaming I/O implementation, each with a scaled fixed-point (6-bit) data format, and natural-in bit-reverse-out data ordering.

A. Methodology

We implemented the formula-to-hardware synthesis flow as a new hardware backend to Spiral's formula generator, which produced all the starting DFT formulas studied in this paper. The hardware-specific formula rewriting rules discussed in Section III-B are implemented as a part of Spiral's formula manipulation stage to produce the annotated hardware formulas. Finally, an RTL generator, implemented in Java, emits synthesizable Verilog RTL descriptions from the hardware formulas.

When an evaluation in this section reports synthesized results, the Verilog descriptions are synthesized and placed-and-routed for the Xilinx XC2VP-6 FPGA using Xilinx ISE 8. We report implementation cost in units of slices. All synthesized designs use a 6-bit fixed-point data format. To curtail synthesis load, we consistently use 7 ns as the target clock period.² Our RTL generator will use a block-RAM for storage if the usage will utilize more than 5 percent of that block-RAM; otherwise, distributed-RAM is used instead.³

B. Throughput Performance vs. Cost

We first evaluate the tradeoff between cost (in slices) and throughput performance (number of transforms completed per second). For $\mathrm{DFT}_{64}$ and $\mathrm{DFT}_{256}$, we evaluate implementations of the fully-streaming iterative FFT and the horizontal-reuse Pease FFT. For each algorithm/architecture combination, we explore radices 2, 4, and 8.⁴ We include implementations with streaming width from $w = r$ up to the maximum allowed by the FPGA capacity. Their cost and throughput are reported in Figure 10. Throughput (y-axis) is presented in terms of gap, the time between starts in the steady-state. The x-axis indicates cost in slices. The plots show separate trend lines for each combination of algorithm, radix, and architecture. Each trend line begins (left to right) with streaming width $w = r$, and the width doubles thereafter. In the Pareto-style plot, points closer to the origin represent designs that are smaller and faster. Only points on the Pareto front (points that are not overshadowed by another point that is both faster and smaller) should be used in practice. It is important to note that the Pareto front comprises points arising from different combinations of algorithmic and architectural decisions.

In both $\mathrm{DFT}_{64}$ and $\mathrm{DFT}_{256}$, the fully-streaming implementations based on the iterative FFT algorithm provide the fastest (yet commensurately more expensive) design points. For all radix choices, the results show an increase in throughput as more slices are consumed (by increasing streaming width $w$). Implementations using larger radices generally have better performance/cost ratios relative to comparable implementations based on lower radices. This is because, for comparable choices of streaming width, all implementations consistently synthesize to comparable frequencies regardless of radix.
Hence, all streaming implementations of $\mathrm{DFT}_{64}$ with the same stream width should achieve comparable throughput. On the other hand, for the same stream width, higher radix implementations have the advantage of fewer permutation and twiddle stages. However, the difference between radix 8 and 4 is much less noticeable than between radix 4 and 2.

The throughput evaluation includes horizontal-reuse implementations based on the Pease FFT to provide the very cheap but commensurately lower throughput design points. Gap is still measured in terms of time between starts of new DFT computations, but these horizontal-reuse implementations cannot support continuous streaming of vectors.

² This methodology is acceptable because for moderate streaming width $w \le 8$, the cycle times of our DFT implementations are determined by critical paths in the complex arithmetic pipeline stages (which consistently synthesize to between 7 and 8 ns). For the larger and wider designs, the synthesized frequency inherently becomes less predictable (typically 2 to 8 ns) due to routing and placement effects. Overall, our methodology makes our performance results conservative, as our performance could possibly improve by choosing a different frequency target. When reporting synthesis results for the Xilinx LogiCore Library, we report the highest performing outcome from synthesizing their designs over a range of target frequencies.

³ Block-RAM are the 6-kilobit memory hard macros in the Xilinx Virtex-II Pro FPGAs. Distributed-RAM consumes 6 bits per slice. In general, our generator lets the user set an arbitrary switch-over point between using block-RAM and distributed-RAM.

⁴ The problem size must be a power of $r$, the radix.

Fig. 10. Gap (1/throughput) versus cost for implementations of $\mathrm{DFT}_{64}$ (top) and $\mathrm{DFT}_{256}$ (bottom). Plot title: Spiral generated FFT IP cores vs. Xilinx LogiCore, inverse throughput (gap) versus area. Axes: gap [microseconds] versus area [slices]. Series: Xilinx LogiCore, Spiral radix 2, Spiral radix 4, Spiral radix 8; streamed and not streamed designs are marked separately.
Data points corresponding to the LogiCore FFT implementations are included in Figure 10. They serve as reference points to show that our designs are of good quality and yield a real increase in performance for the extra resources they consume.

C. Latency Performance vs. Cost

Next, we evaluate the tradeoff between cost (in slices) and latency performance (time elapsed for one transform computation). For $\mathrm{DFT}_{64}$, $\mathrm{DFT}_{256}$, and $\mathrm{DFT}_{1024}$, we evaluate implementations of the horizontal-reuse Pease FFT only. (Fully-streaming implementations are always Pareto sub-optimal in this regard because they are optimized for high throughput at the expense of extended latency.) We explore radices $r = \{2, 4, 8\}$ when $n = 64$; radices $r = \{2, 4\}$ for $n = 256$; and radices $r = \{2, 4\}$ for $n = 1024$. We include implementations from the minimum streaming width $w = r$ up to the maximum allowed by the FPGA capacity. The cost and latency are reported in Figure 11. For each design point, the y-axis indicates latency in microseconds, and the x-axis indicates cost in slices. Again in this Pareto-style plot, points closer to the origin are cheaper and faster. Similar to the previous reporting format, the plots show separate trend lines for each combination of algorithm/radix/architecture. Each trend line begins (left to right) with streaming width $w = r$, and the width doubles thereafter.

For all radix choices, the horizontal-reuse implementations show a decrease in latency as more resources are consumed for wider stream width $w$. Again, a large improvement in performance/cost ratio is seen for radix-4 relative to radix-2 implementations⁵, but the difference between radix-8 and radix-4 is less significant.

Employing higher-radix implementations has another advantage that is more subtle. For example, to achieve the same latency, a radix-2 implementation needs approximately twice the streaming width of a radix-4 implementation (to get the same amount of computation per cycle). These performance-comparable radix-2 and radix-4 implementations will also have comparable cost. (The same relationship exists between radix-4 and radix-8.) The subtle but important difference is that a $w = 4$ radix-4 implementation only requires loading and unloading 4 vector elements per cycle at the start and finish of each computation instead of 8 elements per cycle for a comparable performance radix-2 implementation. For the same cost and performance, a higher radix implementation is more desirable due to this lower interface bandwidth.

Again, data points corresponding to the LogiCore DFT implementations are shown in Figure 11 to provide a baseline. Our horizontal-reuse implementations allow more direct comparisons against LogiCore's latency and cost optimized architectures. Among our different latency/cost implementations, the low cost, high latency implementations correspond most closely to LogiCore's tradeoff points.

⁵ The results given for the radix-2 cases agree with our earlier work [2], which dealt specifically with radix-2 horizontal-reuse Pease FFT implementations. Improvements seen in the current results are due to the recently incorporated memory-based permutation blocks.

D. Range of Implementations

Given the multi-variable and multi-objective nature of optimizing FFT implementations, it is impossible to completely explore the full range of designs or to properly compare tradeoffs across all combinations of metrics. In Table I, we highlight some of the most salient design corners attainable using the design choices described in this paper. Columns 1 to 5 specify the corresponding decisions used (problem size $n$, algorithm, radix $r$, architecture, stream width $w$). Columns 6 to 9 report the performance and cost metrics (throughput, latency, slices used, block-RAM used).

VI. RELATED WORK

An extensive base of fundamental work in FFT algorithms and architectures for VLSI and FPGA has laid the foundation for this work. The mathematical framework described in this paper is capable of representing a wide variety of designs, incorporating optimizations at both the algorithmic and architectural levels.

Examples of prior work in fully-streamed (or pipelined) FFT implementations can be seen in [7], [8], and [9]. In some previous pipelined implementations, arithmetic units are not fully utilizable (e.g., [7] and [9]) due to their permutation implementations. Examples of prior work examining horizontal-reuse FFT implementations can be seen in [10] and [11]. Specifically, Pease FFTs with horizontal reuse are discussed in [2] and [12].

On the whole, many prior developments have covered much of the same design space we considered in this paper. However, these implementations were tuned for different objectives and targeted different technologies, preventing a systematic representation of the design space. Our study is somewhat unique in its extensive coverage of varied implementation parameters, using real RTL designs and real FPGA synthesis.
Below, we highlight examples of some important design choices not examined in this study. We did not consider the impact of fixed-point precision [13] or floating-point arithmetic [14]. We considered neither on-the-fly twiddle generation using CORDIC [15] nor distributed arithmetic to optimize the arithmetic pipeline at the bit-level [16]. We concentrated on performance and cost as the primary metrics and did not consider the issues of power or energy [17]. We also did not consider FFT processors designed specifically for executing FFT algorithms [18].

VII. CONCLUSION

This paper presents a DFT transform synthesis flow that captures an important range of implementation options. The synthesis flow starts from precise mathematical formulas of fast DFT algorithms and applies structural rewrite rules to impart appropriate hardware implementation decisions. The resulting annotated hardware formulas straightforwardly map to RTL netlists of efficient implementations. This synthesis flow can be coupled with the existing Spiral formula generator to fully automate DFT design exploration and synthesis.

The formula language and the synthesis procedure presented in this paper are actually sufficient for a wider range of transforms in addition to the DFT. The system, as is, can handle the Walsh-Hadamard transform and multidimensional DFTs. The central limitation to supporting a broader class of transforms is in constructing cost-effective streaming implementations of the required permutations. Recent work [5] has produced very efficient solutions to this problem. Thus, we plan to continue this work on other transforms (e.g., the discrete cosine transform or the DFT on real valued inputs).

ACKNOWLEDGMENT

This work was supported by DARPA under DOI grant NBCH-59 and by NSF awards ACR and ITR/ACI

Fig. Latency versus cost (latency in microseconds versus area in slices) for horizontal-reuse implementations of DFT 6, DFT 256, and DFT 2 (from left to right), comparing Spiral-generated FFT IP cores at several radices against the Xilinx LogiCore.

problem size | algorithm | r | architecture | w | throughput (/µs) | latency (µs) | cost (slices) | BRAMs | comments
6 | Pease () | 2 | horiz. reuse | | | | | | lowest cost
6 | Iterative (3) | 2 | fully-streamed | | | | | | best throughput
6 | Pease () | | horiz. reuse | | | | | | lowest latency per slice
256 | Pease () | 2 | horiz. reuse | | | | | | lowest cost
256 | Iterative (3) | 6 | fully-streamed | | | | | | best throughput
256 | Pease () | | horiz. reuse | | | | | | balanced cost vs latency
2 | Pease () | 2 | horiz. reuse | | | | | | lowest cost
2 | Pease () | 32 | horiz. reuse | | | | | | lowest latency
2 | Pease () | | horiz. reuse | | | | | | balanced cost vs latency

TABLE I. COMPILATION OF SELECT REPRESENTATIVE IMPLEMENTATIONS AND DESIGN CORNERS.

REFERENCES

[1] C. Van Loan. Computational Frameworks for the Fast Fourier Transform. SIAM, 1992.
[2] G. Nordin, P. Milder, J. Hoe, and M. Püschel. Automatic generation of customized discrete Fourier transform IPs. In Proceedings of the 42nd Annual Conference on Design Automation, 2005.
[3] M. C. Pease. An adaptation of the fast Fourier transform for parallel processing. Journal of the ACM, 15(2), April 1968.
[4] M. Püschel, J. M. F. Moura, J. Johnson, D. Padua, M. Veloso, B. W. Singer, J. Xiong, F. Franchetti, A. Gačić, Y. Voronenko, K. Chen, R. W. Johnson, and N. Rizzolo. SPIRAL: Code generation for DSP transforms. Proceedings of the IEEE, special issue on Program Generation, Optimization, and Adaptation, 93(2), 2005.
[5] M. Püschel, P. A. Milder, and J. C. Hoe. Permuting streaming data using RAMs. Journal submission under preparation.
[6] Xilinx, Inc. Xilinx LogiCore: Fast Fourier Transform v3.2.
[7] E. H. Wold and A. M. Despain. Pipeline and parallel-pipeline FFT processors for VLSI implementations. IEEE Transactions on Computers, C-33(5):414-426, May 1984.
[8] S. F. Gorman and J. M. Wills. Partial column FFT pipelines. IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, 42(6):414-423, 1995.
[9] S. He and M. Torkelson. A new approach to pipeline FFT processor. In Proc. International Parallel Processing Symposium, 1996.
[10] D. Cohen. Simplified control of FFT hardware. IEEE Transactions on Acoustics, Speech, and Signal Processing, 24(6), 1976.
[11] G. Szedo, V. Yang, and C. Dick. High-performance FFT processing using reconfigurable logic. In Proc. Asilomar Conference on Signals, Systems and Computers, 2001.
[12] M. Serra, P. Martí, and J. Carrabina. IFFT/FFT core architecture with an identical stage structure for wireless LAN communications. In Proc. IEEE Workshop on Signal Processing Advances in Wireless Communications, 2004.
[13] P. Kabal and B. Sayar. Performance of fixed-point FFT's: Rounding and scaling considerations. In IEEE International Conference on Acoustics, Speech, and Signal Processing, volume 11, pages 221-224, April 1986.
[14] K. S. Hemmert and K. D. Underwood. An analysis of the double-precision floating-point FFT on FPGAs. In Proc. IEEE Symposium on Field-Programmable Custom Computing Machines, 2005.
[15] A. Banerjee, A. Sundar Dhar, and S. Banerjee. FPGA realization of a CORDIC based FFT processor for biomedical signal processing. Microprocessors and Microsystems, 25(3):131-142, May 2001.
[16] M. Shaditalab, G. Bois, and M. Sawan. Self sorting radix 2 FFT on FPGAs using parallel pipelined distributed arithmetic blocks. In Proc. IEEE Symposium on FPGAs for Custom Computing Machines, 1998.
[17] S. Choi, G. Govindu, J. Jang, and V. K. Prasanna. Energy-efficient and parameterized designs for fast Fourier transform on FPGA. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, 2003.
[18] P. Kumhom, J. Johnson, and P. Nagvajara. Design, optimization, and implementation of a universal FFT processor. In Proc. 13th IEEE ASIC/SOC Conference, 2000.


More information

The golden search method: Question 1

The golden search method: Question 1 1. Golde Sectio Search for the Mode of a Fuctio The golde search method: Questio 1 Suppose the last pair of poits at which we have a fuctio evaluatio is x(), y(). The accordig to the method, If f(x())

More information

MOTIF XF Extension Owner s Manual

MOTIF XF Extension Owner s Manual MOTIF XF Extesio Ower s Maual Table of Cotets About MOTIF XF Extesio...2 What Extesio ca do...2 Auto settig of Audio Driver... 2 Auto settigs of Remote Device... 2 Project templates with Iput/ Output Bus

More information

Appendix A. Use of Operators in ARPS

Appendix A. Use of Operators in ARPS A Appedix A. Use of Operators i ARPS The methodology for solvig the equatios of hydrodyamics i either differetial or itegral form usig grid-poit techiques (fiite differece, fiite volume, fiite elemet)

More information

Octahedral Graph Scaling

Octahedral Graph Scaling Octahedral Graph Scalig Peter Russell Jauary 1, 2015 Abstract There is presetly o strog iterpretatio for the otio of -vertex graph scalig. This paper presets a ew defiitio for the term i the cotext of

More information

CIS 121 Data Structures and Algorithms with Java Spring Stacks, Queues, and Heaps Monday, February 18 / Tuesday, February 19

CIS 121 Data Structures and Algorithms with Java Spring Stacks, Queues, and Heaps Monday, February 18 / Tuesday, February 19 CIS Data Structures ad Algorithms with Java Sprig 09 Stacks, Queues, ad Heaps Moday, February 8 / Tuesday, February 9 Stacks ad Queues Recall the stack ad queue ADTs (abstract data types from lecture.

More information

Counting the Number of Minimum Roman Dominating Functions of a Graph

Counting the Number of Minimum Roman Dominating Functions of a Graph Coutig the Number of Miimum Roma Domiatig Fuctios of a Graph SHI ZHENG ad KOH KHEE MENG, Natioal Uiversity of Sigapore We provide two algorithms coutig the umber of miimum Roma domiatig fuctios of a graph

More information

9.1. Sequences and Series. Sequences. What you should learn. Why you should learn it. Definition of Sequence

9.1. Sequences and Series. Sequences. What you should learn. Why you should learn it. Definition of Sequence _9.qxd // : AM Page Chapter 9 Sequeces, Series, ad Probability 9. Sequeces ad Series What you should lear Use sequece otatio to write the terms of sequeces. Use factorial otatio. Use summatio otatio to

More information

Page 1. Why Care About the Memory Hierarchy? Memory. DRAMs over Time. Virtual Memory!

Page 1. Why Care About the Memory Hierarchy? Memory. DRAMs over Time. Virtual Memory! Why Care About the Memory Hierarchy? Memory Virtual Memory -DRAM Memory Gap (latecy) Reasos: Multi process systems (abstractio & memory protectio) Solutio: Tables (holdig per process traslatios) Fast traslatio

More information

Course Site: Copyright 2012, Elsevier Inc. All rights reserved.

Course Site:   Copyright 2012, Elsevier Inc. All rights reserved. Course Site: http://cc.sjtu.edu.c/g2s/site/aca.html 1 Computer Architecture A Quatitative Approach, Fifth Editio Chapter 2 Memory Hierarchy Desig 2 Outlie Memory Hierarchy Cache Desig Basic Cache Optimizatios

More information

Generation of Distributed Arithmetic Designs for Reconfigurable Applications

Generation of Distributed Arithmetic Designs for Reconfigurable Applications Geeratio of Distributed Arithmetic Desigs for Recofigurable Applicatios Christophe Bobda, Ali Ahmadiia, Jürge Teich Uiversity of Erlage-Nuremberg Departmet of computer sciece Am Weichselgarte 3, 91058

More information

( n+1 2 ) , position=(7+1)/2 =4,(median is observation #4) Median=10lb

( n+1 2 ) , position=(7+1)/2 =4,(median is observation #4) Median=10lb Chapter 3 Descriptive Measures Measures of Ceter (Cetral Tedecy) These measures will tell us where is the ceter of our data or where most typical value of a data set lies Mode the value that occurs most

More information

Consider the following population data for the state of California. Year Population

Consider the following population data for the state of California. Year Population Assigmets for Bradie Fall 2016 for Chapter 5 Assigmet sheet for Sectios 5.1, 5.3, 5.5, 5.6, 5.7, 5.8 Read Pages 341-349 Exercises for Sectio 5.1 Lagrage Iterpolatio #1, #4, #7, #13, #14 For #1 use MATLAB

More information

Today s objectives. CSE401: Introduction to Compiler Construction. What is a compiler? Administrative Details. Why study compilers?

Today s objectives. CSE401: Introduction to Compiler Construction. What is a compiler? Administrative Details. Why study compilers? CSE401: Itroductio to Compiler Costructio Larry Ruzzo Sprig 2004 Today s objectives Admiistrative details Defie compilers ad why we study them Defie the high-level structure of compilers Associate specific

More information

Message Integrity and Hash Functions. TELE3119: Week4

Message Integrity and Hash Functions. TELE3119: Week4 Message Itegrity ad Hash Fuctios TELE3119: Week4 Outlie Message Itegrity Hash fuctios ad applicatios Hash Structure Popular Hash fuctios 4-2 Message Itegrity Goal: itegrity (ot secrecy) Allows commuicatig

More information

6.854J / J Advanced Algorithms Fall 2008

6.854J / J Advanced Algorithms Fall 2008 MIT OpeCourseWare http://ocw.mit.edu 6.854J / 18.415J Advaced Algorithms Fall 2008 For iformatio about citig these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. 18.415/6.854 Advaced Algorithms

More information

Parallel Polygon Approximation Algorithm Targeted at Reconfigurable Multi-Ring Hardware

Parallel Polygon Approximation Algorithm Targeted at Reconfigurable Multi-Ring Hardware Parallel Polygo Approximatio Algorithm Targeted at Recofigurable Multi-Rig Hardware M. Arif Wai* ad Hamid R. Arabia** *Califoria State Uiversity Bakersfield, Califoria, USA **Uiversity of Georgia, Georgia,

More information

An Efficient Implementation of the Gradient-based Hough Transform using DSP slices and block RAMs on the FPGA

An Efficient Implementation of the Gradient-based Hough Transform using DSP slices and block RAMs on the FPGA A Efficiet Implemetatio of the Gradiet-based Hough Trasform usig DSP slices ad block RAMs o the FPGA Xi Zhou, Yasuaki Ito, ad Koji Nakao Departmet of Iformatio Egieerig Hiroshima Uiversity Kagamiyama 1-4-1,

More information