FPGA IMPLEMENTATION OF BASE-N LOGARITHM. Salvador E. Tropea
|
|
- Jessie Stephens
- 5 years ago
- Views:
Transcription
1 FPGA IMPLEMENTATION OF BASE-N LOGARITHM Salvador E. Tropea Electróica e Iformática Istituto Nacioal de Tecología Idustrial Bueos Aires, Argetia salvador@iti.gov.ar ABSTRACT I this work, we preset a area optimized FPGA implemetatio of a IP core to compute the base-n logarithm. Nevertheless, we also discuss the area, speed ad precisio trade-offs. We selected a algorithm that could be implemeted o ay FPGA avoidig vedor specific features like block RAMs, embedded multipliers, etc. We report the implemetatio results of a fixed poit versio of the algorithm usig various commo cofiguratios o Xilix ad Actel devices. This implemetatio achieved the required area goals providig a very good speed-area ratio. Keywords: logarithm, FPGA, area optimized, multiplicative ormalizatio 1. INTRODUCTION The computatio of the logarithm is widely used for may applicatios like digital filters, power computatios (db) ad logarithmic umber systems (LNS), to ame a few. This is also a key compoet of LNS, which offer good advatages over traditioal floatig poit systems [1]. This work is orgaized as follows: sectio two deals with about the algorithm selectio ad how it works, sectio three describes its implemetatio, sectio four briefly discusses the error itroduced by the approximatio, sectio five shows the results of the implemetatio ad sectio six provides the coclusios. 2. ALGORITHM SELECTION AND DESCRIPTION The available methods to compute the logarithm of a umber usig digital circuits ca be divided i two mai groups. O the oe had, we have the look-up table based algorithms ad, o the other, iterative methods. The first approach is faster ad straightforward, but oly useful for low precisio. This is due to the size of the look-up table. The secod group is slower, but suitable for high precisio. May studies explore hybrid implemetatios that take advatages from both groups [2] ad [3], [4], [5] cited by Kostopoulos. Our project required a algorithm that could be implemeted o FPGAs from ay vedor, icludig the Actel's RTSX72SU FPGA. This is a radiatio tolerat device ad lacks the BRAMs eeded to implemet big lookup tables. Therefore, we oly evaluated iterative algorithms that eed small look-up tables. Moreover, we coded the algorithm usig stadard VHDL to esure its portability. Taylor's series expasio is amog the most popular methods to maually compute logarithms, but it has a slow covergece ad requires slow operatios like the divisio [6]. Other algorithms like the oe proposed by Kostopoulos [7] use a multiplier iside the iteratio; therefore, they are slow whe o embedded multipliers are available. A very well studied alterative is the multiplicative ormalizatio. It just eeds adders ad shifters ad provides a fast covergece [6], [8], [9], cosequetly it was suitable for our project Multiplicative Normalizatio Based o the followig logarithms property log a b =log a log b (1) Assumig that x is our variable, the multiplicative ormalizatio uses the followig values for a ad b log x Selectig the b i values to get we obtai b i =log x log b i (2) x b i =1 (3) log x = log b i (4) Storig the log(b i) values i a small look-up table the logarithm computatio is reduced to sums. y i 1 = y i log b i (5)
2 The oly requiremet is (3). To reduce the multiplicatios used i (3) to simple shifts we could use b i =1 d i 2 i (6) I this way the product computed i each iteratio of (3) is ad the logarithm iteratio x i 1 =x i x i d i 2 i (7) y i 1 = y i log 1 d i 2 i (8) Now we must select the d i values to achieve (3). The fuctio to determie d i is called decisio fuctio A Simple Decisio Fuctio The selectio of the decisio fuctio affects the covergece speed ad the algorithm precisio. Although high-radix systems cosume big amouts of area, they provide high speed. However, selectig a simple decisio fuctio we ca achieve a small area footprit. We selected 1 ad 0 as the oly possible values for d i ad applied the followig criterio: if usig 1 makes x i+1 bigger tha or equal to 1, the, we simply use 0. As the b i values are smaller ad smaller this method asymptotically approaches to 1. The hardware implemetatio was very simple because to determie whether the resultig value is bigger tha or equal to 1 we just eed to check oe bit i the result. Whe this bit becomes 1 the iteratio is simply skipped because (7) ad (8) are: x i 1 =x i 0 y i 1 = y i log 1 (9) This fuctio uses oly two values ad is simpler tha the oe described by Kore [6]. Ulike the fuctios explaied i [6] ad [9], this fuctio provides a good covergece of the approximatio for the whole rage of values Covergece Radius Replacig (6) i (3) we obtai x= 1 1 d i 2 i (10) The smaller value we ca obtai is whe the decisio fuctio is always 1 ad the bigger whe the fuctio always selects 0. 1 x i (11) 2.4. Rage Extesio As (11) shows the valid rage for the variable is small. This is ot a problem for floatig poit arithmetic where the variable is ormalized. Whe the values are ot ormalized, as i the fixed poit case, we must reduce them ad, the, compesate the reductio at the output of the approximatio. A simple method to achieve this is to fid a umber that multiplied by the variable results i a umber that satisfy (11). If we call z to this value ad use (1) we ca write log x =log z x log z (12) The product computatio ca be simplified by choosig z=2 j (13) where j is selected, as metioed above, ad the oly thig that we must solve is log z =log 2 j (14) These values ca be stored i a look-up table. Whe the size of the look-up table becomes a issue, we ca solve this as follows: (14) is very simple to compute whe the base of the logarithm is 2 replacig (15) i (12) z = 2 j = j (15) x = 2 j x j (16) Whe we eed to compute aother base, we could use the followig property applied to (16) log a x = log b x log b a (17) log a x = 1 a 2 j x j (18) This method replaces the above metioed look-up table by a multiplicatio usig a costat. 3. ALGORITHM IMPLEMENTATION Our project used usiged fixed poit iput values, whereby, we split (18) i N =2 j x (19)
3 A= N j (20) M = 1 a A (21) where we call (19) ormalizatio, (20) approximatio ad (21) base selectio. Note that whe usig floatig poit values the matissa is already ormalized ad we ca start computig (20) usig the expoet as j ad the ext step (21) is the same. The mai differece is that the result of (21) must be ormalized like i (19) achievig a comparable complexity. Moreover, additioal circuitry could be eeded to detect values out of the logarithm domai Normalizatio This step must fid the value of j that satisfies (11). Whe m=1 ad 3 we ca satisfy (11) usig or i biary otatio 0.5 x 1 (22) 0.1 x 1.0 (23) For a usiged fixed poit value with E iteger bits ad D decimal bits we could fid the value of j as follows: we start with j=e, the we shift the value to the left util the most sigificat bit is 1, decreasig j i each iteratio. The result is a ormalized fixed poit value with zero iteger bits ad E+D decimal bits. Example: E=5, D=3, x= x= j=5 x= j=4 x= j=3 The result is j=3 ad the ormalized value is This algorithm cosumes up to E D 1 steps to achieve the ormalizatio. Noetheless, whe the iput values are chose at radom, half of the values are completed i zero steps ad oly oe value i 2 E D 1 eeds E D 1 steps. This process could be implemeted usig a combiatioal circuit, but it could cosume a much bigger area Approximatio High-radix versios of the multiplicative ormalizatio could be used for speed optimizatio, but for area optimizatio the proposed decisio fuctio is more suitable. The m value i (4) does ot eed to be 0, thaks to the ormalizatio step, ad we ca start from 1. The value is determied by the umber of bits used i the approximatio. As the log(b i) values quickly decrease, icreasig their trucatio error, we observed that for N bits the value of should be N 2, where N is selected to achieve the desired precisio. We implemeted the decisio fuctio (7) usig N+1 bits, thus the MSB of x i+1 is used to determie whether the step is skipped. A barrel shifter was used to implemet the product, while a serial shifter did ot show a importat area savig ad slowed dow the computatio. Fig. 1 (a) shows the flow chart, where Table cotais the log(b i) values ad the y variable is the approximatio result. Fig. 2 shows a simplified block diagram. We also icorporated the j additio to this block ad a error compesatio coefficiet explaied i sectio four (25) Base Selectio This step is a multiplicatio by a costat ad provides a base chage. The more commo bases are 2, e (atural logarithms) ad 10. I the first case the costat is 1 ad this step is skipped, the approximated costats for the other two cases are ad This block ca take advatage of embedded multipliers whe available. Whe optimizig for speed, high speed multipliers (i.e. adder tree) ca be used. I our case we used a traditioal shift-ad-add algorithm. We optimized the algorithm loadig the multiplier with the result obtaied after the first step ad all the subsequet 0s. This ca be illustrated with the followig example: let us suppose that we are computig the base 10 logarithm ad that we determied that the costat must be trucated at the 14 th bit to achieve a desired precisio 1/ The 0s at the right ad at the left ca be skipped ad we get We modified the load of the multiplier to load the result obtaied after the use of the four LSBs. As a result, the multiplier oly eeds to compute the remaiig I this example, a 14-bit multiplicatio is reduced to a 7-bit multiplicatio. Fig. 1 (b) shows the flow chart, where OPT is the umber of bits optimized durig the load ad Kap() represets the th bit of the costat Iterblock Commuicatio The above metioed blocks eeds differet amouts of clock cycles to fiish their tasks ad, which is eve worst, the first step eeds a variable umber of clock pulses. For this reaso, we implemeted a simple hadshake mechaism to commuicate the blocks. The protocol uses a request lie to idicate that a ew value is available for the
4 Fig. 2. Simplified block diagram for the approximatio 4.2. Approximatio Error Fig. 1. Flow charts: approximatio (a) ad multiplier (b) cosumer ad a ackowledge lie to idicate the value was cosumed. The reported results iclude the area ad clock pulses eeded for the above metioed hadshake. The resultig core is composed by three idepedet blocks allowig the computatio of up to three values o the fly. 4. ERROR ANALYSIS I this sectio we briefly discuss the error itroduced by this core. Uless otherwise specified all the errors are expressed i couts of the LSB, kow as ulp (uits i the last place). We oly provide a simplified aalysis to estimate the maximum error, the actual error should be less tha or equal to the estimated oe Normalizatio Error Whe the total umber of iput bits is bigger tha the umber of bits used i the approximatio, this block must trucate the value. I this case, the block itroduces a error i the (-1;0] rage. Addig a rouder, the itroduced error is i the (-0.5;0.5] rage. However, it does ot reduce the total error, as we will see later. This error ca be measured by simulatio; we used a C implemetatio for that purpose. Table 1 shows the errors for various sizes. The ormalizatio error is processed by the approximatio covertig the (-1;0] rage to a approximated (-3;0] rage, ad it should be added to the error itroduced i this step, show betwee paretheses i Table Multiplicatio Error This block itroduces two differet errors. Oe is itroduced whe we trucate the costat. This error is multiplied by the maximum iput value, where this is less tha E, whe the iteger part of x has E bits. If we call K to the costat ad Kap to the approximated value Etv max =E Kap K (24) Note that this error is always egative uless we roud the costat istead of trucate it. Additioally (24) is ot valid whe E is 0, i this case we should use -0.5 istead of E. The other error is the result of the trucatio of the output value ad is i the (-1;0] rage. I additio, the errors from the previous blocks are affected by the multiplicatio; cosequetly, they are reduced Example We will show a example of the error estimatio to clarify the above metioed compoets. Let us suppose the followig setup: base 10, iput values 34.7 (usiged),
5 output decimals 10 ad the costat is trucated at the 12 th bit. From Table 1 we kow that the error at the output of the approximatio step is i the [-7;5] rage. The egative boudary of the error ca be obtaied multiplyig this error by Kap, ad the addig Etv together with the trucatio error. The value of Kap is ad usig (24) Table 1. Approximatio Error Bits Negative Error (ulp) Positive Error (ulp) =1233/ 4096 Etv max = ulp 10 The egative boudary of the error is = The positive boudary is ot affected by the trucatio = resultig i a expected rage of [-3;1] Error Balacig 8 3 (6) (7) (10) (11) (14) (25) 6 As the above aalysis shows, the core has a ubalaced error rage. This is because the core trucates various values, ad we could balace this error addig a costat to (20) A= N j Ecomp (25) I the previous example we could add 3 ulp chagig the error at the output of the approximatio from [-7;5] to [-4;8] obtaiig a fial error i the [-2;2] rage. This simple compesatio reduced the maximum absolute error from 3 to 2 for this example. Table 2. Implemetatio Results I Out Est. Error Meas. Error Clocks ;3-4;3 33/56/ ;3-4;3 33/64/ ;3-3;3 28/42/ ;2-2;2 23/63/ ;5-3;4 50/50/28 5. IMPLEMENTATION RESULTS I this work, we selected some represetative usiged fixed poit values (8.16, 16.16, 0.15); the particular case of our project (34.7 usig 10 bits for the approximatio) ad the equivalet to a sigle precisio floatig poit matissa (0.24). Furthermore, we selected the decimal logarithm. I the 0.24 case we assumed a already ormalized iput value. I the Xilix case we used ISE Webpack 7.1 (7.1.03i H.41) tool for sythesis, ad the Virtex 4 LX (xc4vlx ff668) device to measure the speed. We disabled the use of BRAMs ad DSP48 uits, selected area optimizatio ad set the optimizatio effort to high. We did ot use ay time costrai. Note that the cosumed resources (FFs ad LUTs) for the Virtex 4 are approximately the same for smaller devices like Sparta II. Table 2 shows the error estimated usig the method described i sectio four, the measured error ad the umber of clocks eeded to compute the result. We measured the error usig 100,000 radom values. The first two values for the clocks are the miimum ad maximum umber of clocks eeded to compute the result (latecy), the third value is the average umber of clock cycles betwee output results whe the core is costatly fed with radom values (throughput). Table 3 shows area usage ad maximum operatig frequecy reported by the tool after the place ad route. The values for the area are flip-flops, 4 iputs LUTs, umber of LUTs used as a route-thru ad slices. I the Actel case we used Libero 7.2 (Syplify Pro 8.5F, Build 001R ad Actel Desiger ), ad the RT54SX72S-1-208CQFP device. We selected a maximum faout of 20 to optimize the used area. Table 4 shows area usage ad maximum operatig frequecy reported by the tool after the place ad route. The values for the area are flip-flops, combiatioal cells ad the percetage of area used. Note that Actel devices use a simpler combiatioal uit, but they provide two of them for each flip-flop. Fig. 3 shows the measured error distributio for the 0.24 case, as metioed above, it was obtaied usig 100,000 radom values. The bar for -4 ulp ca ot be observed because oly four values preseted this error.
6 Table 3. Implemetatio Results (Xilix) I Out Area (FF/LUT/RLUT/Slices) Max. freq /281/ 8/ MHz /290/ 8/ MHz /251/ 8/ MHz /217/ 5/ MHz /409/13/ MHz Table 4. Implemetatio Results (Actel) I Out Area (Reg./Comb./Area%) Max. freq /500/10,8 % 33 MHz /505/11,1 % 32 MHz /446/9,5 % 32 MHz /372/8,5 % 39 MHz /747/15,3 % 23 MHz 8. REFERENCES Fig. 3 Error distributio for the 0.24/0.26 case. 6. CONCLUSION The proposed algorithm could be implemeted o FPGAs from ay vedor. As a result, it does ot rely o specific resources like BRAMs, embedded multipliers, etc. The used area was small. For example whe the 8.16 cofiguratio was used for a decimal logarithm, it was aroud 13% of a 100,000 gates equivalet device ad less tha 1/500 of a moder Virtex 4 LX 200. Whe we used the selected Virtex 4, this core computed a average of 9.78 millio values per secod for the 8.16 cofiguratio ad over tha 5 millio values, whe the output eeded 23 exact bits. The last case eeded three extra bits ad provided the precisio eeded for a sigle precisio floatig poit umber. The core showed a very good area-speed relatioship, ad for applicatios where the speed requiremets are bigger more tha oe core could be used i parallel. [1] M. Haselma, M. Beauchamp, A. Wood, et al, A Compariso of Floatig Poit ad Logarithmic Number Systems for FPGAs, Proc. of the 13th Aual IEEE Symp. o Field-Prog. Custom Comp. (FCCM'05) , Ja [2] Y. Wa ad C. L. Wey, Efficiet algorithms for biary logarithmic coversio ad additio, IEE Proc.-Comp. Digit. Tech., vol. 146, o. 3, May [3] T. C. Che, Automatic computatio of expoetial, logarithms ratios ad square roots, IBM J. Res. Develop., pp , Jul [4] H.-Y. Lo ad Y. Aoki, Geeratio of a precise biary logarithm with differece groupig programmable logic array, IEEE Tras. Comput., vol. C-34, pp , Aug [5] S.-Y. Shi, Shortcut to logarithms combie table lookup ad computatio, Comput. Desig., pp , May [6] I. Kore, Computer arithmetic algorithms, 2 d editio, ISBN , pp [7] D. K. Kostopoulos, A algorithm for the computatio of biary logarithms, IEEE Tras. o Comp., vol. 40, o. 11, /91, Nov [8] B. Parhami, Computer Arithmetic Algorithms ad Hardware Desigs, ISBN , pp [9] M. Pascale, Microcotrollers & CORDIC methods,, Dr. Dobb's Joural, Jul ACKNOWLEDGMENTS We would like to thak to Gustavo Sutter (UAM, Spai), he cotributed valuable ideas ad refereces.
Chapter 3 Classification of FFT Processor Algorithms
Chapter Classificatio of FFT Processor Algorithms The computatioal complexity of the Discrete Fourier trasform (DFT) is very high. It requires () 2 complex multiplicatios ad () complex additios [5]. As
More informationImproving Template Based Spike Detection
Improvig Template Based Spike Detectio Kirk Smith, Member - IEEE Portlad State Uiversity petra@ee.pdx.edu Abstract Template matchig algorithms like SSE, Covolutio ad Maximum Likelihood are well kow for
More informationLecture 5. Counting Sort / Radix Sort
Lecture 5. Coutig Sort / Radix Sort T. H. Corme, C. E. Leiserso ad R. L. Rivest Itroductio to Algorithms, 3rd Editio, MIT Press, 2009 Sugkyukwa Uiversity Hyuseug Choo choo@skku.edu Copyright 2000-2018
More informationChapter 3. Floating Point Arithmetic
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 3 Floatig Poit Arithmetic Review - Multiplicatio 0 1 1 0 = 6 multiplicad 32-bit ALU shift product right multiplier add
More informationCSC 220: Computer Organization Unit 11 Basic Computer Organization and Design
College of Computer ad Iformatio Scieces Departmet of Computer Sciece CSC 220: Computer Orgaizatio Uit 11 Basic Computer Orgaizatio ad Desig 1 For the rest of the semester, we ll focus o computer architecture:
More informationAppendix D. Controller Implementation
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Appedix D Cotroller Implemetatio Cotroller Implemetatios Combiatioal logic (sigle-cycle); Fiite state machie (multi-cycle, pipelied);
More informationCSC165H1 Worksheet: Tutorial 8 Algorithm analysis (SOLUTIONS)
CSC165H1, Witer 018 Learig Objectives By the ed of this worksheet, you will: Aalyse the ruig time of fuctios cotaiig ested loops. 1. Nested loop variatios. Each of the followig fuctios takes as iput a
More informationAutomatic Generation of Polynomial-Basis Multipliers in GF (2 n ) using Recursive VHDL
Automatic Geeratio of Polyomial-Basis Multipliers i GF (2 ) usig Recursive VHDL J. Nelso, G. Lai, A. Teca Abstract Multiplicatio i GF (2 ) is very commoly used i the fields of cryptography ad error correctig
More informationAnalysis Metrics. Intro to Algorithm Analysis. Slides. 12. Alg Analysis. 12. Alg Analysis
Itro to Algorithm Aalysis Aalysis Metrics Slides. Table of Cotets. Aalysis Metrics 3. Exact Aalysis Rules 4. Simple Summatio 5. Summatio Formulas 6. Order of Magitude 7. Big-O otatio 8. Big-O Theorems
More informationLecture 1: Introduction and Strassen s Algorithm
5-750: Graduate Algorithms Jauary 7, 08 Lecture : Itroductio ad Strasse s Algorithm Lecturer: Gary Miller Scribe: Robert Parker Itroductio Machie models I this class, we will primarily use the Radom Access
More information. Written in factored form it is easy to see that the roots are 2, 2, i,
CMPS A Itroductio to Programmig Programmig Assigmet 4 I this assigmet you will write a java program that determies the real roots of a polyomial that lie withi a specified rage. Recall that the roots (or
More informationLecture 2. RTL Design Methodology. Transition from Pseudocode & Interface to a Corresponding Block Diagram
Lecture 2 RTL Desig Methodology Trasitio from Pseudocode & Iterface to a Correspodig Block Diagram Structure of a Typical Digital Data Iputs Datapath (Executio Uit) Data Outputs System Cotrol Sigals Status
More informationLoad balanced Parallel Prime Number Generator with Sieve of Eratosthenes on Cluster Computers *
Load balaced Parallel Prime umber Geerator with Sieve of Eratosthees o luster omputers * Soowook Hwag*, Kyusik hug**, ad Dogseug Kim* *Departmet of Electrical Egieerig Korea Uiversity Seoul, -, Rep. of
More informationData Structures and Algorithms. Analysis of Algorithms
Data Structures ad Algorithms Aalysis of Algorithms Outlie Ruig time Pseudo-code Big-oh otatio Big-theta otatio Big-omega otatio Asymptotic algorithm aalysis Aalysis of Algorithms Iput Algorithm Output
More informationThe following algorithms have been tested as a method of converting an I.F. from 16 to 512 MHz to 31 real 16 MHz USB channels:
DBE Memo#1 MARK 5 MEMO #18 MASSACHUSETTS INSTITUTE OF TECHNOLOGY HAYSTACK OBSERVATORY WESTFORD, MASSACHUSETTS 1886 November 19, 24 Telephoe: 978-692-4764 Fax: 781-981-59 To: From: Mark 5 Developmet Group
More informationPseudocode ( 1.1) Analysis of Algorithms. Primitive Operations. Pseudocode Details. Running Time ( 1.1) Estimating performance
Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Pseudocode ( 1.1) High-level descriptio of a algorithm More structured
More informationImprovement of the Orthogonal Code Convolution Capabilities Using FPGA Implementation
Improvemet of the Orthogoal Code Covolutio Capabilities Usig FPGA Implemetatio Naima Kaabouch, Member, IEEE, Apara Dhirde, Member, IEEE, Saleh Faruque, Member, IEEE Departmet of Electrical Egieerig, Uiversity
More informationComputer Systems - HS
What have we leared so far? Computer Systems High Level ENGG1203 2d Semester, 2017-18 Applicatios Sigals Systems & Cotrol Systems Computer & Embedded Systems Digital Logic Combiatioal Logic Sequetial Logic
More informationOnes Assignment Method for Solving Traveling Salesman Problem
Joural of mathematics ad computer sciece 0 (0), 58-65 Oes Assigmet Method for Solvig Travelig Salesma Problem Hadi Basirzadeh Departmet of Mathematics, Shahid Chamra Uiversity, Ahvaz, Ira Article history:
More information1.2 Binomial Coefficients and Subsets
1.2. BINOMIAL COEFFICIENTS AND SUBSETS 13 1.2 Biomial Coefficiets ad Subsets 1.2-1 The loop below is part of a program to determie the umber of triagles formed by poits i the plae. for i =1 to for j =
More informationChapter 11. Friends, Overloaded Operators, and Arrays in Classes. Copyright 2014 Pearson Addison-Wesley. All rights reserved.
Chapter 11 Frieds, Overloaded Operators, ad Arrays i Classes Copyright 2014 Pearso Addiso-Wesley. All rights reserved. Overview 11.1 Fried Fuctios 11.2 Overloadig Operators 11.3 Arrays ad Classes 11.4
More informationLecture Notes 6 Introduction to algorithm analysis CSS 501 Data Structures and Object-Oriented Programming
Lecture Notes 6 Itroductio to algorithm aalysis CSS 501 Data Structures ad Object-Orieted Programmig Readig for this lecture: Carrao, Chapter 10 To be covered i this lecture: Itroductio to algorithm aalysis
More informationRunning Time. Analysis of Algorithms. Experimental Studies. Limitations of Experiments
Ruig Time Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Most algorithms trasform iput objects ito output objects. The
More informationAPPLICATION NOTE PACE1750AE BUILT-IN FUNCTIONS
APPLICATION NOTE PACE175AE BUILT-IN UNCTIONS About This Note This applicatio brief is iteded to explai ad demostrate the use of the special fuctios that are built ito the PACE175AE processor. These powerful
More informationk (check node degree) and j (variable node degree)
A Parallel Turbo Decodig Message Passig Architecture for Array LDPC Codes Kira Guam, Pakaj Bhagawat, Weihuag Wag, Gwa Choi, Mark Yeary * Dept. of Electrical Egieerig, Texas A&M Uiversity, College Statio,
More informationRunning Time ( 3.1) Analysis of Algorithms. Experimental Studies. Limitations of Experiments
Ruig Time ( 3.1) Aalysis of Algorithms Iput Algorithm Output A algorithm is a step- by- step procedure for solvig a problem i a fiite amout of time. Most algorithms trasform iput objects ito output objects.
More informationAnalysis of Algorithms
Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Ruig Time Most algorithms trasform iput objects ito output objects. The
More informationArithmetic Sequences
. Arithmetic Sequeces COMMON CORE Learig Stadards HSF-IF.A. HSF-BF.A.1a HSF-BF.A. HSF-LE.A. Essetial Questio How ca you use a arithmetic sequece to describe a patter? A arithmetic sequece is a ordered
More informationA Comparison of Floating Point and Logarithmic Number Systems for FPGAs
Compariso of Floatig Poit ad Logarithmic Number Systems for FPGs Michael Haselma, Michael Beauchamp, aro Wood, Scott Hauck Dept. of Electrical Egieerig Uiversity of Washigto Seattle, W {haselma, mjb7,
More informationAnalysis of Algorithms
Aalysis of Algorithms Ruig Time of a algorithm Ruig Time Upper Bouds Lower Bouds Examples Mathematical facts Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite
More informationPolynomial Functions and Models. Learning Objectives. Polynomials. P (x) = a n x n + a n 1 x n a 1 x + a 0, a n 0
Polyomial Fuctios ad Models 1 Learig Objectives 1. Idetify polyomial fuctios ad their degree 2. Graph polyomial fuctios usig trasformatios 3. Idetify the real zeros of a polyomial fuctio ad their multiplicity
More informationEE260: Digital Design, Spring /16/18. n Example: m 0 (=x 1 x 2 ) is adjacent to m 1 (=x 1 x 2 ) and m 2 (=x 1 x 2 ) but NOT m 3 (=x 1 x 2 )
EE26: Digital Desig, Sprig 28 3/6/8 EE 26: Itroductio to Digital Desig Combiatioal Datapath Yao Zheg Departmet of Electrical Egieerig Uiversity of Hawaiʻi at Māoa Combiatioal Logic Blocks Multiplexer Ecoders/Decoders
More informationChapter 5. Functions for All Subtasks. Copyright 2015 Pearson Education, Ltd.. All rights reserved.
Chapter 5 Fuctios for All Subtasks Copyright 2015 Pearso Educatio, Ltd.. All rights reserved. Overview 5.1 void Fuctios 5.2 Call-By-Referece Parameters 5.3 Usig Procedural Abstractio 5.4 Testig ad Debuggig
More informationChapter 1. Introduction to Computers and C++ Programming. Copyright 2015 Pearson Education, Ltd.. All rights reserved.
Chapter 1 Itroductio to Computers ad C++ Programmig Copyright 2015 Pearso Educatio, Ltd.. All rights reserved. Overview 1.1 Computer Systems 1.2 Programmig ad Problem Solvig 1.3 Itroductio to C++ 1.4 Testig
More informationGPUMP: a Multiple-Precision Integer Library for GPUs
GPUMP: a Multiple-Precisio Iteger Library for GPUs Kaiyog Zhao ad Xiaowe Chu Departmet of Computer Sciece, Hog Kog Baptist Uiversity Hog Kog, P. R. Chia Email: {kyzhao, chxw}@comp.hkbu.edu.hk Abstract
More informationEE 459/500 HDL Based Digital Design with Programmable Logic. Lecture 13 Control and Sequencing: Hardwired and Microprogrammed Control
EE 459/500 HDL Based Digital Desig with Programmable Logic Lecture 13 Cotrol ad Sequecig: Hardwired ad Microprogrammed Cotrol Refereces: Chapter s 4,5 from textbook Chapter 7 of M.M. Mao ad C.R. Kime,
More informationBehavioral Modeling in Verilog
Behavioral Modelig i Verilog COE 202 Digital Logic Desig Dr. Muhamed Mudawar Kig Fahd Uiversity of Petroleum ad Mierals Presetatio Outlie Itroductio to Dataflow ad Behavioral Modelig Verilog Operators
More informationMorgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5
Morga Kaufma Publishers 26 February, 28 COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 5 Set-Associative Cache Architecture Performace Summary Whe CPU performace icreases:
More informationIMP: Superposer Integrated Morphometrics Package Superposition Tool
IMP: Superposer Itegrated Morphometrics Package Superpositio Tool Programmig by: David Lieber ( 03) Caisius College 200 Mai St. Buffalo, NY 4208 Cocept by: H. David Sheets, Dept. of Physics, Caisius College
More informationSolution printed. Do not start the test until instructed to do so! CS 2604 Data Structures Midterm Spring, Instructions:
CS 604 Data Structures Midterm Sprig, 00 VIRG INIA POLYTECHNIC INSTITUTE AND STATE U T PROSI M UNI VERSI TY Istructios: Prit your ame i the space provided below. This examiatio is closed book ad closed
More informationAlgorithm. Counting Sort Analysis of Algorithms
Algorithm Coutig Sort Aalysis of Algorithms Assumptios: records Coutig sort Each record cotais keys ad data All keys are i the rage of 1 to k Space The usorted list is stored i A, the sorted list will
More informationReversible Realization of Quaternary Decoder, Multiplexer, and Demultiplexer Circuits
Egieerig Letters, :, EL Reversible Realizatio of Quaterary Decoder, Multiplexer, ad Demultiplexer Circuits Mozammel H.. Kha, Member, ENG bstract quaterary reversible circuit is more compact tha the correspodig
More informationOptimized Aperiodic Concentric Ring Arrays
24th Aual Review of Progress i Applied Computatioal Electromagetics March 30 - April 4, 2008 - iagara Falls, Caada 2008 ACES Optimized Aperiodic Cocetric Rig Arrays Rady L Haupt The Pesylvaia State Uiversity
More informationDescriptive Statistics Summary Lists
Chapter 209 Descriptive Statistics Summary Lists Itroductio This procedure is used to summarize cotiuous data. Large volumes of such data may be easily summarized i statistical lists of meas, couts, stadard
More informationBACHMANN-LANDAU NOTATIONS. Lecturer: Dr. Jomar F. Rabajante IMSP, UPLB MATH 174: Numerical Analysis I 1 st Sem AY
BACHMANN-LANDAU NOTATIONS Lecturer: Dr. Jomar F. Rabajate IMSP, UPLB MATH 174: Numerical Aalysis I 1 st Sem AY 018-019 RANKING OF FUNCTIONS Name Big-Oh Eamples Costat O(1 10 Logarithmic O(log log, log(
More informationCreating Exact Bezier Representations of CST Shapes. David D. Marshall. California Polytechnic State University, San Luis Obispo, CA , USA
Creatig Exact Bezier Represetatios of CST Shapes David D. Marshall Califoria Polytechic State Uiversity, Sa Luis Obispo, CA 93407-035, USA The paper presets a method of expressig CST shapes pioeered by
More informationA graphical view of big-o notation. c*g(n) f(n) f(n) = O(g(n))
ca see that time required to search/sort grows with size of We How do space/time eeds of program grow with iput size? iput. time: cout umber of operatios as fuctio of iput Executio size operatio Assigmet:
More informationHow do we evaluate algorithms?
F2 Readig referece: chapter 2 + slides Algorithm complexity Big O ad big Ω To calculate ruig time Aalysis of recursive Algorithms Next time: Litterature: slides mostly The first Algorithm desig methods:
More informationElementary Educational Computer
Chapter 5 Elemetary Educatioal Computer. Geeral structure of the Elemetary Educatioal Computer (EEC) The EEC coforms to the 5 uits structure defied by vo Neuma's model (.) All uits are preseted i a simplified
More informationECE4050 Data Structures and Algorithms. Lecture 6: Searching
ECE4050 Data Structures ad Algorithms Lecture 6: Searchig 1 Search Give: Distict keys k 1, k 2,, k ad collectio L of records of the form (k 1, I 1 ), (k 2, I 2 ),, (k, I ) where I j is the iformatio associated
More informationCIS 121 Data Structures and Algorithms with Java Spring Stacks, Queues, and Heaps Monday, February 18 / Tuesday, February 19
CIS Data Structures ad Algorithms with Java Sprig 09 Stacks, Queues, ad Heaps Moday, February 8 / Tuesday, February 9 Stacks ad Queues Recall the stack ad queue ADTs (abstract data types from lecture.
More information( n+1 2 ) , position=(7+1)/2 =4,(median is observation #4) Median=10lb
Chapter 3 Descriptive Measures Measures of Ceter (Cetral Tedecy) These measures will tell us where is the ceter of our data or where most typical value of a data set lies Mode the value that occurs most
More informationChapter 10. Defining Classes. Copyright 2015 Pearson Education, Ltd.. All rights reserved.
Chapter 10 Defiig Classes Copyright 2015 Pearso Educatio, Ltd.. All rights reserved. Overview 10.1 Structures 10.2 Classes 10.3 Abstract Data Types 10.4 Itroductio to Iheritace Copyright 2015 Pearso Educatio,
More informationFast Fourier Transform (FFT) Algorithms
Fast Fourier Trasform FFT Algorithms Relatio to the z-trasform elsewhere, ozero, z x z X x [ ] 2 ~ elsewhere,, ~ e j x X x x π j e z z X X π 2 ~ The DFS X represets evely spaced samples of the z- trasform
More informationProtected points in ordered trees
Applied Mathematics Letters 008 56 50 www.elsevier.com/locate/aml Protected poits i ordered trees Gi-Sag Cheo a, Louis W. Shapiro b, a Departmet of Mathematics, Sugkyukwa Uiversity, Suwo 440-746, Republic
More informationAn Efficient Algorithm for Graph Bisection of Triangularizations
A Efficiet Algorithm for Graph Bisectio of Triagularizatios Gerold Jäger Departmet of Computer Sciece Washigto Uiversity Campus Box 1045 Oe Brookigs Drive St. Louis, Missouri 63130-4899, USA jaegerg@cse.wustl.edu
More informationCIS 121 Data Structures and Algorithms with Java Fall Big-Oh Notation Tuesday, September 5 (Make-up Friday, September 8)
CIS 11 Data Structures ad Algorithms with Java Fall 017 Big-Oh Notatio Tuesday, September 5 (Make-up Friday, September 8) Learig Goals Review Big-Oh ad lear big/small omega/theta otatios Practice solvig
More informationSorting in Linear Time. Data Structures and Algorithms Andrei Bulatov
Sortig i Liear Time Data Structures ad Algorithms Adrei Bulatov Algorithms Sortig i Liear Time 7-2 Compariso Sorts The oly test that all the algorithms we have cosidered so far is compariso The oly iformatio
More informationCS : Programming for Non-Majors, Summer 2007 Programming Project #3: Two Little Calculations Due by 12:00pm (noon) Wednesday June
CS 1313 010: Programmig for No-Majors, Summer 2007 Programmig Project #3: Two Little Calculatios Due by 12:00pm (oo) Wedesday Jue 27 2007 This third assigmet will give you experiece writig programs that
More informationCOMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Part A Datapath Design
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter The Processor Part A path Desig Itroductio CPU performace factors Istructio cout Determied by ISA ad compiler. CPI ad
More informationDigital System Design
July, 22 9:55 vra235_ch Sheet umber Page umber 65 black chapter Digital System Desig a b c d e f g h 8 7 6 5 4 3 2. Bd3 g6+, Ke8 d8 65 July, 22 9:55 vra235_ch Sheet umber 2 Page umber 66 black 66 CHAPTER
More informationAlgorithms for Disk Covering Problems with the Most Points
Algorithms for Disk Coverig Problems with the Most Poits Bi Xiao Departmet of Computig Hog Kog Polytechic Uiversity Hug Hom, Kowloo, Hog Kog csbxiao@comp.polyu.edu.hk Qigfeg Zhuge, Yi He, Zili Shao, Edwi
More informationCubic Polynomial Curves with a Shape Parameter
roceedigs of the th WSEAS Iteratioal Coferece o Robotics Cotrol ad Maufacturig Techology Hagzhou Chia April -8 00 (pp5-70) Cubic olyomial Curves with a Shape arameter MO GUOLIANG ZHAO YANAN Iformatio ad
More informationSD vs. SD + One of the most important uses of sample statistics is to estimate the corresponding population parameters.
SD vs. SD + Oe of the most importat uses of sample statistics is to estimate the correspodig populatio parameters. The mea of a represetative sample is a good estimate of the mea of the populatio that
More information3D Model Retrieval Method Based on Sample Prediction
20 Iteratioal Coferece o Computer Commuicatio ad Maagemet Proc.of CSIT vol.5 (20) (20) IACSIT Press, Sigapore 3D Model Retrieval Method Based o Sample Predictio Qigche Zhag, Ya Tag* School of Computer
More informationPattern Recognition Systems Lab 1 Least Mean Squares
Patter Recogitio Systems Lab 1 Least Mea Squares 1. Objectives This laboratory work itroduces the OpeCV-based framework used throughout the course. I this assigmet a lie is fitted to a set of poits usig
More informationOutline and Reading. Analysis of Algorithms. Running Time. Experimental Studies. Limitations of Experiments. Theoretical Analysis
Outlie ad Readig Aalysis of Algorithms Iput Algorithm Output Ruig time ( 3.) Pseudo-code ( 3.2) Coutig primitive operatios ( 3.3-3.) Asymptotic otatio ( 3.6) Asymptotic aalysis ( 3.7) Case study Aalysis
More informationMobile terminal 3D image reconstruction program development based on Android Lin Qinhua
Iteratioal Coferece o Automatio, Mechaical Cotrol ad Computatioal Egieerig (AMCCE 05) Mobile termial 3D image recostructio program developmet based o Adroid Li Qihua Sichua Iformatio Techology College
More informationChapter 4. Procedural Abstraction and Functions That Return a Value. Copyright 2015 Pearson Education, Ltd.. All rights reserved.
Chapter 4 Procedural Abstractio ad Fuctios That Retur a Value Copyright 2015 Pearso Educatio, Ltd.. All rights reserved. Overview 4.1 Top-Dow Desig 4.2 Predefied Fuctios 4.3 Programmer-Defied Fuctios 4.4
More informationAn Efficient Algorithm for Graph Bisection of Triangularizations
Applied Mathematical Scieces, Vol. 1, 2007, o. 25, 1203-1215 A Efficiet Algorithm for Graph Bisectio of Triagularizatios Gerold Jäger Departmet of Computer Sciece Washigto Uiversity Campus Box 1045, Oe
More informationAdministrative UNSUPERVISED LEARNING. Unsupervised learning. Supervised learning 11/25/13. Final project. No office hours today
Admiistrative Fial project No office hours today UNSUPERVISED LEARNING David Kauchak CS 451 Fall 2013 Supervised learig Usupervised learig label label 1 label 3 model/ predictor label 4 label 5 Supervised
More informationLower Bounds for Sorting
Liear Sortig Topics Covered: Lower Bouds for Sortig Coutig Sort Radix Sort Bucket Sort Lower Bouds for Sortig Compariso vs. o-compariso sortig Decisio tree model Worst case lower boud Compariso Sortig
More informationCopyright 2016 Ramez Elmasri and Shamkant B. Navathe
Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 19 Query Optimizatio Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio Query optimizatio Coducted by a query optimizer i a DBMS Goal:
More informationA New Morphological 3D Shape Decomposition: Grayscale Interframe Interpolation Method
A ew Morphological 3D Shape Decompositio: Grayscale Iterframe Iterpolatio Method D.. Vizireau Politehica Uiversity Bucharest, Romaia ae@comm.pub.ro R. M. Udrea Politehica Uiversity Bucharest, Romaia mihea@comm.pub.ro
More informationModule 8-7: Pascal s Triangle and the Binomial Theorem
Module 8-7: Pascal s Triagle ad the Biomial Theorem Gregory V. Bard April 5, 017 A Note about Notatio Just to recall, all of the followig mea the same thig: ( 7 7C 4 C4 7 7C4 5 4 ad they are (all proouced
More information6.854J / J Advanced Algorithms Fall 2008
MIT OpeCourseWare http://ocw.mit.edu 6.854J / 18.415J Advaced Algorithms Fall 2008 For iformatio about citig these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. 18.415/6.854 Advaced Algorithms
More informationChapter 2. C++ Basics. Copyright 2015 Pearson Education, Ltd.. All rights reserved.
Chapter 2 C++ Basics Copyright 2015 Pearso Educatio, Ltd.. All rights reserved. Overview 2.1 Variables ad Assigmets 2.2 Iput ad Output 2.3 Data Types ad Expressios 2.4 Simple Flow of Cotrol 2.5 Program
More informationUNIVERSITY OF MORATUWA
UNIVERSITY OF MORATUWA FACULTY OF ENGINEERING DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING B.Sc. Egieerig 2014 Itake Semester 2 Examiatio CS2052 COMPUTER ARCHITECTURE Time allowed: 2 Hours Jauary 2016
More informationA Generalized Set Theoretic Approach for Time and Space Complexity Analysis of Algorithms and Functions
Proceedigs of the 10th WSEAS Iteratioal Coferece o APPLIED MATHEMATICS, Dallas, Texas, USA, November 1-3, 2006 316 A Geeralized Set Theoretic Approach for Time ad Space Complexity Aalysis of Algorithms
More informationData diverse software fault tolerance techniques
Data diverse software fault tolerace techiques Complemets desig diversity by compesatig for desig diversity s s limitatios Ivolves obtaiig a related set of poits i the program data space, executig the
More informationExact Minimum Lower Bound Algorithm for Traveling Salesman Problem
Exact Miimum Lower Boud Algorithm for Travelig Salesma Problem Mohamed Eleiche GeoTiba Systems mohamed.eleiche@gmail.com Abstract The miimum-travel-cost algorithm is a dyamic programmig algorithm to compute
More informationParabolic Path to a Best Best-Fit Line:
Studet Activity : Fidig the Least Squares Regressio Lie By Explorig the Relatioship betwee Slope ad Residuals Objective: How does oe determie a best best-fit lie for a set of data? Eyeballig it may be
More informationWhat are we going to learn? CSC Data Structures Analysis of Algorithms. Overview. Algorithm, and Inputs
What are we goig to lear? CSC316-003 Data Structures Aalysis of Algorithms Computer Sciece North Carolia State Uiversity Need to say that some algorithms are better tha others Criteria for evaluatio Structure
More informationFilter design. 1 Design considerations: a framework. 2 Finite impulse response (FIR) filter design
Filter desig Desig cosideratios: a framework C ı p ı p H(f) Aalysis of fiite wordlegth effects: I practice oe should check that the quatisatio used i the implemetatio does ot degrade the performace of
More informationThe isoperimetric problem on the hypercube
The isoperimetric problem o the hypercube Prepared by: Steve Butler November 2, 2005 1 The isoperimetric problem We will cosider the -dimesioal hypercube Q Recall that the hypercube Q is a graph whose
More information15-859E: Advanced Algorithms CMU, Spring 2015 Lecture #2: Randomized MST and MST Verification January 14, 2015
15-859E: Advaced Algorithms CMU, Sprig 2015 Lecture #2: Radomized MST ad MST Verificatio Jauary 14, 2015 Lecturer: Aupam Gupta Scribe: Yu Zhao 1 Prelimiaries I this lecture we are talkig about two cotets:
More informationIntro to Scientific Computing: Solutions
Itro to Scietific Computig: Solutios Dr. David M. Goulet. How may steps does it take to separate 3 objects ito groups of 4? We start with 5 objects ad apply 3 steps of the algorithm to reduce the pile
More informationDesigning a learning system
CS 75 Machie Learig Lecture Desigig a learig system Milos Hauskrecht milos@cs.pitt.edu 539 Seott Square, x-5 people.cs.pitt.edu/~milos/courses/cs75/ Admiistrivia No homework assigmet this week Please try
More informationCMPT 125 Assignment 2 Solutions
CMPT 25 Assigmet 2 Solutios Questio (20 marks total) a) Let s cosider a iteger array of size 0. (0 marks, each part is 2 marks) it a[0]; I. How would you assig a poiter, called pa, to store the address
More informationAnnouncements. Reading. Project #4 is on the web. Homework #1. Midterm #2. Chapter 4 ( ) Note policy about project #3 missing components
Aoucemets Readig Chapter 4 (4.1-4.2) Project #4 is o the web ote policy about project #3 missig compoets Homework #1 Due 11/6/01 Chapter 6: 4, 12, 24, 37 Midterm #2 11/8/01 i class 1 Project #4 otes IPv6Iit,
More informationLesson 1. The datapath
Lesso. Computers Structure ad Orgaizatio Graduate i Computer Scieces / Graduate i Computers Egieerig Computers Structure ad Orgaizatio. Graduate i Computer Scieces / Graduate i Computer Egieerig Automatic
More informationENGI 4421 Probability and Statistics Faculty of Engineering and Applied Science Problem Set 1 Descriptive Statistics
ENGI 44 Probability ad Statistics Faculty of Egieerig ad Applied Sciece Problem Set Descriptive Statistics. If, i the set of values {,, 3, 4, 5, 6, 7 } a error causes the value 5 to be replaced by 50,
More informationLinearising Calibration Methods for a Generic Embedded Sensor Interface (GESI)
1st Iteratioal Coferece o Sesig Techology November 21-23, 2005 Palmersto North, New Zealad Liearisig Calibratio Methods for a Geeric Embedded Sesor Iterface (GESI) Abstract Amra Pašić Work doe i: PEI Techologies,
More informationLecture 3. RTL Design Methodology. Transition from Pseudocode & Interface to a Corresponding Block Diagram
Lecture 3 RTL Desig Methodology Trasitio from Pseudocode & Iterface to a Correspodig Block Diagram Structure of a Typical Digital Data Iputs Datapath (Executio Uit) Data Outputs System Cotrol Sigals Status
More informationCh 9.3 Geometric Sequences and Series Lessons
Ch 9.3 Geometric Sequeces ad Series Lessos SKILLS OBJECTIVES Recogize a geometric sequece. Fid the geeral, th term of a geometric sequece. Evaluate a fiite geometric series. Evaluate a ifiite geometric
More informationThe golden search method: Question 1
1. Golde Sectio Search for the Mode of a Fuctio The golde search method: Questio 1 Suppose the last pair of poits at which we have a fuctio evaluatio is x(), y(). The accordig to the method, If f(x())
More informationLecture 2: Spectra of Graphs
Spectral Graph Theory ad Applicatios WS 20/202 Lecture 2: Spectra of Graphs Lecturer: Thomas Sauerwald & He Su Our goal is to use the properties of the adjacecy/laplacia matrix of graphs to first uderstad
More informationThe Simeck Family of Lightweight Block Ciphers
The Simeck Family of Lightweight Block Ciphers Gagqiag Yag, Bo Zhu, Valeti Suder, Mark D. Aagaard, ad Guag Gog Electrical ad Computer Egieerig, Uiversity of Waterloo Sept 5, 205 Yag, Zhu, Suder, Aagaard,
More informationSouth Slave Divisional Education Council. Math 10C
South Slave Divisioal Educatio Coucil Math 10C Curriculum Package February 2012 12 Strad: Measuremet Geeral Outcome: Develop spatial sese ad proportioal reasoig It is expected that studets will: 1. Solve
More informationMath 10C Long Range Plans
Math 10C Log Rage Plas Uits: Evaluatio: Homework, projects ad assigmets 10% Uit Tests. 70% Fial Examiatio.. 20% Ay Uit Test may be rewritte for a higher mark. If the retest mark is higher, that mark will
More information