Hardware Design and Performance Estimation of The 128-bit Block Cipher CRYPTON


Eunjong Hong, Jai-Hoon Chung, and Chae Hoon Lim
Information and Communications Research Center, Future Systems, Inc.
372-2 Yangjae-Dong, Seocho-Ku, Seoul, Korea 137-130
E-mail: {ejhong, jhoon, …}@future.co.kr

Abstract

CRYPTON is a 128-bit block encryption algorithm proposed as a candidate for the Advanced Encryption Standard (AES), and is expected to be especially efficient in hardware implementation. In this paper, hardware designs of CRYPTON and their performance estimation results are presented. Straightforward hardware designs are improved by exploiting hardware-friendly features of CRYPTON. Hardware architectures are described in VHDL and simulated. Circuits are synthesized using a 0.35 μm gate array library, and timing and gate counts are measured. A data encryption rate of 1.6 Gbit/s could be achieved with a moderate area of 30,000 gates, and up to 2.6 Gbit/s with less than 100,000 gates.

1. Introduction

The explosive growth in computer systems and their interconnections via networks has changed the way we live. From politics to business, our lives depend on the information stored and communicated using these systems. This in turn has led to a heightened awareness of the need for information security. To enforce information security by protecting data and resources from disclosure, a secure and efficient encryption algorithm is needed. Since the widely used encryption algorithms, DES and Triple DES, are no longer considered secure enough or efficient enough for future applications, a new block encryption algorithm with a strength equal to or better than that of Triple DES and significantly improved efficiency is needed.

Recently, very high bandwidth networking technologies such as ATM and Gigabit Ethernet have been rapidly deployed [1]. Network applications such as virtual private networks [2] need high-speed execution of encryption algorithms for high-speed networks. In order to utilize these networks at link speed, an encryption speed as fast as 1 gigabit per second is required.
According to our experiment, the execution speed of Triple DES on a 333 MHz Intel Pentium II is only 19.6 Mbps, and the highest performance of Triple DES in commercially available encryption hardware is about 200 Mbps [3, 4].

CRYPTON [5, 6] is a 128-bit block encryption algorithm proposed as a candidate for the Advanced Encryption Standard (AES) [7]. In the evaluation performed by NIST, its software implementation on a 200 MHz Pentium Pro showed about 40 Mbps, the best encryption and decryption speeds among the AES candidates [8]. Hardware implementations of CRYPTON are expected to be more efficient than software implementations because it was designed from the beginning with hardware implementation in mind. Encryption and decryption use identical circuitry, and no large logic is needed for the S-boxes. Moreover, it does not use addition or multiplication but only exclusive-or operations, which can be executed in parallel. CRYPTON is considered the most hardware-friendly AES candidate in several studies [9-11].

In this paper, hardware designs of CRYPTON Version 1.0 [6], optimized by exploiting the hardware-friendly features of CRYPTON, are presented. To maximize operation parallelism, key generation and data encryption are executed simultaneously. Some round loops can be unrolled for speedup without increasing the quantity of logic. S-boxes used for both key scheduling and data encryption are shared to minimize the area. Hardware architectures are described in VHDL and simulated using Synopsys Compiler and Simulator. Circuits are synthesized using a 0.35 μm gate array library, and timing and gate counts are measured.

This paper is organized as follows. In Section 2, the CRYPTON algorithm is briefly introduced. In Section 3, our design considerations for CRYPTON hardware are described. In Sections 4 and 5, the detailed hardware designs of CRYPTON and the estimation results are presented, respectively. Concluding remarks are made in Section 6.

2. CRYPTON Algorithm

In CRYPTON, each data block of 128 bits is processed in the form of a 4 × 4 byte array. The round transformation of CRYPTON consists of four steps: byte-wise substitution, column-wise bit permutation, column-to-row transposition, and key addition. The encryption process involves 12 repetitions of the same round transformation. The decryption process can be made identical to the encryption process with a different key schedule. Figure 1 shows the high-level structure of CRYPTON.
[Figure 1. The Structure of CRYPTON. Key schedule: the 256-bit user key is turned into the round key $K_0$ and the round keys $K_i$. Data encryption: data input (4 × 32-bit words), initial key addition, 12 repetitions of the round function (byte substitution, bit permutation and byte transposition, key addition), output transformation (byte transposition, bit permutation, byte transposition), data output (4 × 32-bit words).]

The block cipher CRYPTON has the following features:

- 12-round self-reciprocal cipher (with identical processes for encryption and decryption) with block length 128 bits and key length up to 256 bits.
- Strong security against existing attacks.

- Efficiency in both software and hardware: thanks to the high degree of parallelism and the use of only very simple operations (logical ANDs/XORs, rotates, and table lookups), CRYPTON runs very fast on most platforms in software as well as in hardware.
- Wide tradeoffs between speed and memory: it can be implemented using storage of 32, 256, 1K, 4K, 16K or 512K bytes, etc. Its speed on a 200 MHz Pentium Pro ranges from 54 Mbps (in C) to 64 Mbps (in in-line ASM).
- Fast key scheduling: the encryption key schedule runs much faster than one-block encryption, so it is very efficient for use in applications requiring frequent key changes (e.g., in hashing mode).

2.1 Basic Building Blocks

Byte Substitution $\gamma$

CRYPTON uses a nonlinear byte substitution using four 8 × 8 S-boxes, $S_i$ ($0 \le i \le 3$). These S-boxes are derived from one 8 × 8 involution S-box $S$ (i.e., $S = S^{-1}$) and satisfy the inverse relationships $S_2 = S_0^{-1}$ and $S_3 = S_1^{-1}$. The involution S-box $S$ was designed to enable efficient logic implementation (see [6] for details).

The S-box transformation $\gamma$ consists of byte-wise substitutions on a 4 × 4 byte array. Two different transformations are used alternately in successive rounds: $\gamma_o$ in odd rounds and $\gamma_e$ in even rounds. They are depicted in Figure 2. Observe that the four S-boxes are arranged so that the following holds for any 4 × 4 byte array $A$: $\gamma_o(\gamma_e(A)) = \gamma_e(\gamma_o(A)) = A$. This property is used to derive the identical process for encryption and decryption.
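The inverse property of the two S-box arrangements can be checked with a short sketch. It assumes the arrangement read from Figure 2, namely that byte $a_{ij}$ is passed through $S_{(i+j) \bmod 4}$ in odd rounds and through $S_{(i+j+2) \bmod 4}$ in even rounds. The S-boxes here are random placeholder permutations that only satisfy $S_2 = S_0^{-1}$ and $S_3 = S_1^{-1}$; they are not the real CRYPTON S-boxes, and all function names are hypothetical.

```python
import random


def inverse(sbox):
    """Return the inverse of a byte permutation given as a 256-entry list."""
    inv = [0] * 256
    for x, y in enumerate(sbox):
        inv[y] = x
    return inv


def make_placeholder_sboxes(seed=1):
    """Build S0..S3 with S2 = S0^-1 and S3 = S1^-1 (NOT the CRYPTON S-boxes)."""
    rng = random.Random(seed)
    s0 = list(range(256))
    s1 = list(range(256))
    rng.shuffle(s0)
    rng.shuffle(s1)
    return [s0, s1, inverse(s0), inverse(s1)]


def gamma(a, sboxes, offset):
    """Substitute byte a[i][j] with S_{(i + j + offset) mod 4}(a[i][j])."""
    return [[sboxes[(i + j + offset) % 4][a[i][j]] for j in range(4)]
            for i in range(4)]


def gamma_o(a, sboxes):
    """Odd-round arrangement (offset 0, as read from Figure 2)."""
    return gamma(a, sboxes, 0)


def gamma_e(a, sboxes):
    """Even-round arrangement (offset 2, as read from Figure 2)."""
    return gamma(a, sboxes, 2)


SBOXES = make_placeholder_sboxes()
_rng = random.Random(7)
A = [[_rng.randrange(256) for _ in range(4)] for _ in range(4)]

# Each arrangement undoes the other, whatever bijections S0 and S1 are.
assert gamma_o(gamma_e(A, SBOXES), SBOXES) == A
assert gamma_e(gamma_o(A, SBOXES), SBOXES) == A
```

The check passes for any pair of bijections because shifting the S-box index by 2 always swaps an S-box with its inverse.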
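$$
\gamma_o:\quad
\begin{pmatrix} A[0] \\ A[1] \\ A[2] \\ A[3] \end{pmatrix} =
\begin{pmatrix}
a_{03} & a_{02} & a_{01} & a_{00} \\
a_{13} & a_{12} & a_{11} & a_{10} \\
a_{23} & a_{22} & a_{21} & a_{20} \\
a_{33} & a_{32} & a_{31} & a_{30}
\end{pmatrix}
\;\mapsto\;
\begin{pmatrix} B[0] \\ B[1] \\ B[2] \\ B[3] \end{pmatrix} =
\begin{pmatrix}
S_3(a_{03}) & S_2(a_{02}) & S_1(a_{01}) & S_0(a_{00}) \\
S_0(a_{13}) & S_3(a_{12}) & S_2(a_{11}) & S_1(a_{10}) \\
S_1(a_{23}) & S_0(a_{22}) & S_3(a_{21}) & S_2(a_{20}) \\
S_2(a_{33}) & S_1(a_{32}) & S_0(a_{31}) & S_3(a_{30})
\end{pmatrix}
$$

$$
\gamma_e:\quad
\begin{pmatrix} A[0] \\ A[1] \\ A[2] \\ A[3] \end{pmatrix} =
\begin{pmatrix}
a_{03} & a_{02} & a_{01} & a_{00} \\
a_{13} & a_{12} & a_{11} & a_{10} \\
a_{23} & a_{22} & a_{21} & a_{20} \\
a_{33} & a_{32} & a_{31} & a_{30}
\end{pmatrix}
\;\mapsto\;
\begin{pmatrix} B[0] \\ B[1] \\ B[2] \\ B[3] \end{pmatrix} =
\begin{pmatrix}
S_1(a_{03}) & S_0(a_{02}) & S_3(a_{01}) & S_2(a_{00}) \\
S_2(a_{13}) & S_1(a_{12}) & S_0(a_{11}) & S_3(a_{10}) \\
S_3(a_{23}) & S_2(a_{22}) & S_1(a_{21}) & S_0(a_{20}) \\
S_0(a_{33}) & S_3(a_{32}) & S_2(a_{31}) & S_1(a_{30})
\end{pmatrix}
$$

Figure 2. The Byte Substitution $\gamma$.

Bit Permutation $\pi$ and Byte Transposition $\tau$

As linear transformations, CRYPTON uses a sequence of bit permutation and byte transposition. The bit permutation $\pi$ mixes each byte column in the 4 × 4 byte array, and the byte transposition $\tau$ transposes the resulting byte columns into byte rows. Let us first define four masking vectors $(M_3, M_2, M_1, M_0)$ for bit extraction used in $\pi$ as

$$
\begin{aligned}
M_0 &= m_3 \,\|\, m_2 \,\|\, m_1 \,\|\, m_0 = \texttt{0x3fcff3fc}, \\
M_1 &= m_0 \,\|\, m_3 \,\|\, m_2 \,\|\, m_1 = \texttt{0xfc3fcff3}, \\
M_2 &= m_1 \,\|\, m_0 \,\|\, m_3 \,\|\, m_2 = \texttt{0xf3fc3fcf}, \\
M_3 &= m_2 \,\|\, m_1 \,\|\, m_0 \,\|\, m_3 = \texttt{0xcff3fc3f},
\end{aligned}
$$

where $m_0 = \texttt{0xfc}$, $m_1 = \texttt{0xf3}$, $m_2 = \texttt{0xcf}$, $m_3 = \texttt{0x3f}$.

To make the encryption and decryption processes identical, two slightly different versions of the bit permutation are used: $\pi_o$ in odd rounds and $\pi_e$ in even rounds. They are defined as follows, where $\wedge$ denotes bitwise AND and $\oplus$ denotes bitwise exclusive-or.

Bit permutation $\pi_o$ for odd rounds: $B = \pi_o(A)$ is defined by

$$
\begin{aligned}
B[0] &\leftarrow (A[3] \wedge M_3) \oplus (A[2] \wedge M_2) \oplus (A[1] \wedge M_1) \oplus (A[0] \wedge M_0), \\
B[1] &\leftarrow (A[3] \wedge M_0) \oplus (A[2] \wedge M_3) \oplus (A[1] \wedge M_2) \oplus (A[0] \wedge M_1), \\
B[2] &\leftarrow (A[3] \wedge M_1) \oplus (A[2] \wedge M_0) \oplus (A[1] \wedge M_3) \oplus (A[0] \wedge M_2), \\
B[3] &\leftarrow (A[3] \wedge M_2) \oplus (A[2] \wedge M_1) \oplus (A[1] \wedge M_0) \oplus (A[0] \wedge M_3).
\end{aligned}
$$

Bit permutation $\pi_e$ for even rounds: $B = \pi_e(A)$ is defined by

$$
\begin{aligned}
B[0] &\leftarrow (A[3] \wedge M_1) \oplus (A[2] \wedge M_0) \oplus (A[1] \wedge M_3) \oplus (A[0] \wedge M_2), \\
B[1] &\leftarrow (A[3] \wedge M_2) \oplus (A[2] \wedge M_1) \oplus (A[1] \wedge M_0) \oplus (A[0] \wedge M_3), \\
B[2] &\leftarrow (A[3] \wedge M_3) \oplus (A[2] \wedge M_2) \oplus (A[1] \wedge M_1) \oplus (A[0] \wedge M_0), \\
B[3] &\leftarrow (A[3] \wedge M_0) \oplus (A[2] \wedge M_3) \oplus (A[1] \wedge M_2) \oplus (A[0] \wedge M_1).
\end{aligned}
$$

Note that in hardware each bit of the $\pi$ outputs requires only exclusive-ors of three different bits. The byte transposition $\tau$ simply rearranges a 4 × 4 byte array by moving the byte at the $(i, j)$-th position to the $(j, i)$-th position (see Figure 3 for $B = \tau(A)$).

$$
\tau:\quad
\begin{pmatrix} A[0] \\ A[1] \\ A[2] \\ A[3] \end{pmatrix} =
\begin{pmatrix}
a_{03} & a_{02} & a_{01} & a_{00} \\
a_{13} & a_{12} & a_{11} & a_{10} \\
a_{23} & a_{22} & a_{21} & a_{20} \\
a_{33} & a_{32} & a_{31} & a_{30}
\end{pmatrix}
\;\mapsto\;
\begin{pmatrix} B[0] \\ B[1] \\ B[2] \\ B[3] \end{pmatrix} =
\begin{pmatrix}
a_{30} & a_{20} & a_{10} & a_{00} \\
a_{31} & a_{21} & a_{11} & a_{01} \\
a_{32} & a_{22} & a_{12} & a_{02} \\
a_{33} & a_{23} & a_{13} & a_{03}
\end{pmatrix}
$$

Figure 3. The Byte Transposition $\tau$.

Key Addition $\sigma$

Key mixing is performed by simply exclusive-oring round keys with data words. That is, for a round key $K = (K[3], K[2], K[1], K[0])^t$, $B = \sigma_K(A)$ is defined by $B[i] = A[i] \oplus K[i]$ for $i = 0, 1, 2, 3$.

Round Transformation $\rho$

One round of CRYPTON consists of applying in sequence the S-box transformation, bit permutation, byte transposition and key addition. That is, the encryption round functions used for odd and even rounds are defined (for round key $K$) by

$$
\begin{aligned}
\rho_{oK} &= \sigma_K \circ \tau \circ \pi_o \circ \gamma_o \quad \text{for } r = 1, 3, \ldots, \\
\rho_{eK} &= \sigma_K \circ \tau \circ \pi_e \circ \gamma_e \quad \text{for } r = 2, 4, \ldots
\end{aligned}
$$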
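The linear layers can be sketched on four 32-bit words. The sketch assumes that the masked terms of the bit permutation are combined by bitwise AND and exclusive-or, which agrees with the remark that every output bit is an exclusive-or of three input bits. The function names and the single `shift` parameter (0 for odd rounds, 2 for even rounds) are illustrative choices, not notation from the paper.

```python
# Masking vectors M0..M3 as given in the text.
M = [0x3FCFF3FC, 0xFC3FCFF3, 0xF3FC3FCF, 0xCFF3FC3F]


def pi(a, shift):
    """Bit permutation: B[r] = XOR over i of (A[i] AND M[(i + r + shift) mod 4])."""
    return [
        (a[0] & M[(0 + r + shift) % 4])
        ^ (a[1] & M[(1 + r + shift) % 4])
        ^ (a[2] & M[(2 + r + shift) % 4])
        ^ (a[3] & M[(3 + r + shift) % 4])
        for r in range(4)
    ]


def pi_o(a):
    """Odd-round bit permutation."""
    return pi(a, 0)


def pi_e(a):
    """Even-round bit permutation."""
    return pi(a, 2)


def tau(a):
    """Byte transposition: byte j of B[i] is byte i of A[j]."""
    return [
        sum(((a[j] >> (8 * i)) & 0xFF) << (8 * j) for j in range(4))
        for i in range(4)
    ]


def sigma(a, k):
    """Key addition: word-wise exclusive-or with the round key."""
    return [x ^ y for x, y in zip(a, k)]


A = [0x01234567, 0x89ABCDEF, 0x0F1E2D3C, 0x4B5A6978]

# At every bit position exactly three of the four masks have a 1,
# so each output bit is the XOR of three input bits.
assert all(sum((m >> p) & 1 for m in M) == 3 for p in range(32))

# With these masks each bit permutation is its own inverse, as is tau.
assert pi_o(pi_o(A)) == A
assert pi_e(pi_e(A)) == A
assert tau(tau(A)) == A

# The even-round version is the odd-round version with rows rotated by two.
assert pi_e(A) == pi_o(A)[2:] + pi_o(A)[:2]
```

Because the two versions differ only in which mask is paired with which row, a hardware implementation can use the same exclusive-or network for both and differ only in wiring.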
2.2 Encryption and Decryption

The encryption transformation $E_K$ of $r$-round CRYPTON under key $K$ consists of an initial key addition, $r/2$ repetitions of $\rho_o$ and $\rho_e$, and then a final output transformation. Encryption and decryption can be performed by the same code (logic) if different round keys are applied.

2.3 Key Scheduling

$r$-round CRYPTON requires a total of $4(r + 1)$ round keys, each of 32-bit length. These round keys are generated from a user key of $8k$ ($k = 0, 1, \ldots, 32$) bits in two steps: first, nonlinearly transform the user key into 8 expanded keys, and then generate the required number of round keys from these expanded keys using simple operations. This two-step generation of round keys allows efficient on-the-fly round key computation in cases where storage requirements do not allow storing all the round keys. For details of the CRYPTON algorithm, please refer to [6].

3. Design Considerations

3.1 Parallel Execution of Key Generation and Encryption

Some block ciphers pre-compute round keys and store them. The stored keys are then used repeatedly while encrypting. This style needs extra cycles for key setup, and large storage if all 12 round keys of 128-bit length are to be stored. However, CRYPTON can generate keys simultaneously with encryption. The time it takes to generate a round key of CRYPTON is insignificant compared to the time it takes for a round transformation, and this makes it possible to generate round keys as the encryption proceeds. The parallel operation of hardware CRYPTON is illustrated in Figure 4.

[Figure 4. Concurrent Operation of Hardware CRYPTON. Starting from the user key, the key generation steps for $K_1$, $K_2$, ..., $K_{12}$ run alongside the round 1, round 2, ..., round 12 transformations between data input and data output.]

3.2 Loop Unrolling

Because CRYPTON consists of 12 repetitions of the round transformation, iteration is an inevitable choice for small-area designs. Rather than building every round transformation separately, the data flow is made to pass through the same hardware block repeatedly. We can build the whole encryption with only a small, basic building block by exploiting iteration. The block diagram for 12-cycle iteration is shown in Figure 5(a).

[Figure 5. Loop Unrolling. (a) Iterated design built from multiplexers, even and odd round transformation blocks and a register, × 12 cycles. (b) Unrolled design with the odd and even round transformations cascaded between one multiplexer and a register, × 6 cycles.]

Although iteration results in small-area designs, it incurs additional path delay from the multiplexer and register. To reduce the number of passes through the multiplexer and register, we can unroll the loop so that one cycle contains two or more copies of the round transformation logic. In Figure 5(b), two round transformations are performed in a cycle, but the number of components included in the design is one multiplexer less than that of Figure 5(a). The design in Figure 5(b) therefore has better speed and area than the design in Figure 5(a). CRYPTON has a good tradeoff between speed and area when 2, 3, 4, 6, or 12 rounds are concatenated in a loop.

3.3 Common Logic Sharing

CRYPTON has only a few simple operations as its components. Because the components appear in several different parts of the algorithm, we can make those parts share a common logic block rather than building many separate logic blocks. Common logic sharing contributes to area reduction only if the shared component has a larger area than the multiplexers added. Otherwise it only results in increased control burden and longer path delay.

[Figure 6. Common Logic Sharing. (a) The key scheduling module and the encryption module each contain their own byte substitution block. (b) A single byte substitution block is shared between the two modules through a multiplexer.]

CRYPTON has a byte substitution operation in both the key scheduling module and the encryption module. Because byte substitution occupies a significantly larger area than 128 2:1 multiplexers, there is a benefit in adopting common logic sharing as shown in Figure 6.
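The break-even rule for common logic sharing can be written as a small area comparison. This is only a sketch: the gate counts below are hypothetical values chosen to illustrate the rule and are not taken from the synthesis results, and the function names are illustrative.

```python
def area_separate(component_gates, copies):
    """Total gates when every user gets its own copy of the component."""
    return component_gates * copies


def area_shared(component_gates, mux_gates):
    """Total gates when one copy is shared through input multiplexers."""
    return component_gates + mux_gates


def sharing_saves_area(component_gates, copies, mux_gates):
    """True when sharing one copy is smaller than building all copies."""
    return area_shared(component_gates, mux_gates) < area_separate(component_gates, copies)


# Hypothetical gate counts, for illustration only.
MUX_128 = 128 * 3     # assumed cost of 128 two-input multiplexers
BIG_BLOCK = 4000      # assumed cost of a large block such as byte substitution
SMALL_BLOCK = 200     # assumed cost of a small block such as a key XOR

print(sharing_saves_area(BIG_BLOCK, 2, MUX_128))    # large block: sharing helps
print(sharing_saves_area(SMALL_BLOCK, 2, MUX_128))  # small block: sharing hurts
```

The comparison covers area only. The added multiplexer also lengthens the critical path, which is the second cost named in the text.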

4. Hardware Design of CRYPTON

In this section we propose two hardware designs of CRYPTON.

4.1 Two-Round Model

The two-round model performs two successive transformations for even and odd rounds within a clock cycle. The two-round model is efficient in the area-speed tradeoff, as we saw in Section 3.2, and it has smaller area than designs with 3, 4, 6, or 12 rounds concatenated. The two-round model consists of three modules: encryption module, key scheduling module, and control module, as shown in Figure 9. The following scheme describes how the two-round model works.

(1) Load 128-bit input data at the DATA_IN port, and a 256-bit user key at the USER_KEY port.
(2) If decryption is to be performed, apply logic high on the DECR port until the encryption is finished.
(3) Start encryption by applying a logic high pulse for one clock cycle at the START port.
(4) Check the output port DONE to see if the encryption is completed.
(5) If the DONE port outputs logic high, read the encryption result through the output port DATA_OUT.

The 12-round encryption process needs 13 round keys, from the round 0 key to the round 12 key. Generating 13 round keys takes 7 clock cycles because the two-round model computes two keys within a clock cycle. Thus the result is available 7 clock cycles after logic high on the START port is latched at the rising edge of the clock.

[Figure 7. Encryption Module of Two-Round Model. (a) Without S-box sharing: a bank of 128 2:1 multiplexers selects the input M0 in round 0 and the registered value M otherwise, followed by the cascaded odd and even round blocks (byte substitution, bit permutation, byte transposition), a 128-bit register, and the output transformation (byte transposition, bit permutation, byte transposition). (b) With S-box sharing: further banks of 128 2:1 multiplexers select U instead of A and V instead of B during the expansion cycle, and the outputs Ur and Vr are taken from the shared datapath.]
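The 7-cycle latency follows from a ceiling division, sketched below. The pairing of round keys into cycles (keys 0 and 1 in the first cycle, and so on) is an assumption made for illustration; the text only states that two keys are computed per clock cycle. All names are hypothetical.

```python
import math

ROUNDS = 12
ROUND_KEYS = ROUNDS + 1   # round 0 key through round 12 key


def cycles_per_block(keys_per_cycle):
    """Clock cycles needed when keys_per_cycle round keys are produced per cycle."""
    return math.ceil(ROUND_KEYS / keys_per_cycle)


def key_schedule_by_cycle(keys_per_cycle=2):
    """Round-key indices per clock cycle (assumed pairing: 0-1, 2-3, ...)."""
    indices = list(range(ROUND_KEYS))
    return [indices[i:i + keys_per_cycle]
            for i in range(0, ROUND_KEYS, keys_per_cycle)]


def done_cycle(start_cycle, keys_per_cycle=2):
    """Cycle at which the result is available if START is latched at start_cycle."""
    return start_cycle + cycles_per_block(keys_per_cycle)


for cycle, keys in enumerate(key_schedule_by_cycle(), start=1):
    print(f"cycle {cycle}: round keys {keys}")
```

With 13 round keys and two keys per cycle the last cycle carries a single key, which is why 12 rounds take 7 cycles instead of 6.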

Figure 7(a) shows the encryption module of the two-round model. In the encryption module, two blocks for the even round transformation and the odd round transformation are cascaded serially, but the sequence of blocks in Figure 7(a) is somewhat different from that of Figure 1. Because round 0 is unique in that only key addition is performed in it, an extra 128 two-input exclusive-or gates would be required if we did not adopt the sequence shown in Figure 7(a).

The block diagram for the key scheduling module is depicted in Figure 8. The key scheduling module conveys two successive round keys to the encryption module at every clock cycle. One of our design principles is the parallel execution mentioned in Section 3.1, and we do not adopt any round-key pre-computation for speed enhancement.

There is a design concern in the final stage of round-key computation. A round key is the exclusive-or sum of a round constant, masking constants, and expanded keys. There are only 4 masking constants, which can easily be built into combinational logic, and the expanded keys are unknown before the round, so we have no design choices about them. But there are as many as 13 round constants of 32-bit length, and one must decide whether to build wired logic out of pre-computed round constants or to make the design compute the constants from the specified equation at each clock cycle. Computation of constants in a fully parallelized design like Figure 8 needs at least four 32-bit adders and two 32-bit registers. On the other hand, using wired logic for the 13 constants introduces a slightly longer propagation delay in key computation. Because the two-round model aims at small area, implementation with wired logic is chosen.
[Figure 8. Key Scheduling Module of Two-Round Model. From the user key, expanded keys are generated for encryption and modified for decryption (EeL0, EdL0, EeH0, EdH0). Banks of 128 2:1 multiplexers select the initial values in the first rounds and the registered values EL and EH afterwards, and select between the encryption and decryption paths. EL is registered at the rising clock edge of even rounds and EH at the rising clock edge of odd rounds. The even round key is selected from Ka (encryption) and Kb (decryption), and the odd round key from Kc (encryption) and Kd (decryption).]

The most unwieldy part of CRYPTON might be the nonlinear substitution $\gamma$. $\gamma$ consists of byte-wise substitutions on a 4 × 4 byte array, in other words a table lookup for every byte. As Figure 7(a) shows, the encryption module must have two $\gamma$ blocks because $\gamma$ is distinct for odd rounds and even rounds. In addition, the key scheduling module has two $\gamma$ blocks for user key expansion, which is used only once when a new user key is set. Seeing that most of the other operations take comparatively small area, incorporating separate $\gamma$ blocks for key expansion is poorly matched to its rare usage. This imbalance can be corrected by sharing the $\gamma$ blocks between the expansion block of the key scheduling module and the round transformation block of the encryption module. This scheme effectively reduces the total area but needs one more clock cycle, used only for key expansion, when new expanded keys are to be computed.

Figure 7(b) shows the new block diagram for the encryption module with $\gamma$ sharing. In Figure 7(b), outputs Ur and Vr are fed to the inputs of two 128-bit registers and stored in them. The stored values in the registers are used repeatedly for generation of round keys unless a new user key is set. The key scheduling module also changes in its expansion block: it takes U and V [6], the outputs of the expanded key registers, as inputs, and performs the operations that follow $\gamma$.

Combining the encryption module and the key scheduling module, the whole encryption block is built as shown in Figure 9. In Figure 9, Cycle0 is a signal indicating the key expansion cycle, and Cycle1 indicates that the first of the 7 cycles is in progress. Both signals are used as control inputs for multiplexers.

[Figure 9. Hierarchical View of Two-Round Model. Ports: DATA_IN, USER_KEY, CLK, RESET, START, DECR, DATA_OUT, DONE. The user key is separated into U and V; the encryption module outputs Ur and Vr to two 128-bit registers, whose outputs U' and V' feed the key schedule module; the key schedule module supplies the round keys Ke and Ko to the encryption module; the control module generates Cycle0, Cycle1 and Round[2:0].]

4.2 Full-Round Model

To make a fast design, pipelining or loop unrolling can generally be applied. In pipelining, the algorithm is partitioned into several stages, which enables several data blocks to be encrypted simultaneously. This is possible in ECB mode because the result of each block encryption is independent of the others. But modes other than ECB require the previous encryption result to be available to complete the present encryption, which makes pipelining useless.
Since most recent applications of block ciphers use chaining or feedback modes, speed enhancement through pipelining is not considered here. Instead we build a full-round model with 12 rounds fully unrolled. This full-round model computes 12 rounds of transformation without looping, and it is the fastest but the largest design among those exploiting loop unrolling. Loop unrolling by 4 or 6 would give area and speed values intermediate between those of the two-round model and the full-round model. Block diagrams for the full-round model are straightforward and are shown in Figure 10, Figure 11 and Figure 12.
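The data dependency that rules out pipelining in chaining modes can be made concrete with a toy example. The block function below is a truncated SHA-256 used purely as a stand-in for a 128-bit block encryption; it is not CRYPTON and is not even invertible, and all names are hypothetical. CBC is used as one representative chaining mode.

```python
import hashlib

BLOCK = 16  # bytes, i.e. 128 bits


def toy_block(key, block):
    """Stand-in for one 128-bit block encryption (NOT CRYPTON, not invertible)."""
    return hashlib.sha256(key + block).digest()[:BLOCK]


def xor(a, b):
    """Byte-wise exclusive-or of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def ecb(key, blocks):
    """ECB: every block is processed independently, so blocks can overlap in a pipeline."""
    return [toy_block(key, p) for p in blocks]


def cbc(key, iv, blocks):
    """CBC: block i needs the finished ciphertext of block i-1 as input."""
    out, prev = [], iv
    for p in blocks:
        prev = toy_block(key, xor(p, prev))
        out.append(prev)
    return out


key = b"k" * 16
iv = bytes(BLOCK)
msg_a = [b"A" * BLOCK, b"C" * BLOCK]
msg_b = [b"B" * BLOCK, b"C" * BLOCK]   # differs from msg_a only in block 0

# ECB: the second ciphertext block does not depend on the first plaintext block.
assert ecb(key, msg_a)[1] == ecb(key, msg_b)[1]

# CBC: changing block 0 changes the ciphertext of block 1 as well.
assert cbc(key, iv, msg_a)[1] != cbc(key, iv, msg_b)[1]
```

In the chained case the second block cannot enter the datapath until the first has left it, so extra pipeline stages would sit idle for a single data stream.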

[Figure 10. Encryption Module of Full-Round Model. Input (4 × 32-bit words); the round keys K0, K1, K2, ..., K12 are applied along a cascade of round blocks (byte substitution, bit permutation, byte transposition) without registers, followed by the output transformation (byte transposition, bit permutation, byte transposition); output (4 × 32-bit words).]

[Figure 11. Key Scheduling Module of Full-Round Model. After user key expansion, and key expansion for decryption, the round-0 to round-12 keys are computed for encryption (Ke0 to Ke12) and for decryption (Kd0 to Kd12); multiplexers select between Ke and Kd to produce the round keys K0 to K12.]

In the key scheduling module of Figure 11, the round-key output $K_n$ for the $n$-th round will be $Ke_n$ if encryption is performed, and $Kd_n$ if decryption is performed. To achieve high speed, common logic sharing in the expansion block is not adopted here. The full-round model tells us the area cost of the nearly pure algorithm because it does not need any registers or control unit. But the 12 multiplexers in the key scheduling module could not be removed.

[Figure 12. Hierarchical View of Full-Round Model. Inputs INPUT DATA, USER KEY and ENC/DEC; the key scheduling module supplies K0 to K12 to the encryption module, which produces OUTPUT DATA.]

5. Estimation Results

Our estimations are all based on Synopsys Design Compiler and the Hyundai 0.35 μm gate array library. Table 1 shows the area and speed of the two-round model and the full-round model, each with two speed-area tradeoffs. In this paper, all estimations are for typical cases. A gate means a 2-input NAND gate, equivalent to 4 transistors.

Table 1. Speed and Area Estimations of Hardware CRYPTON.

Estimated items                 Two-round model                  Full-round model
                                Area critical   Speed critical   Area critical   Speed critical
Gate count (cell area)
  total gates                   18,322          28,179           46,259          93,929
  encryption module             8,267           17,958           33,598          74,857
  key scheduling module         7,930           8,078            12,661          19,072
Minimum clock period (T_clk)    … ns            … ns             … ns            … ns
Key setup time                  on the fly      on the fly       … ns            7.91 ns
Time to switch keys             … ns            … ns             … ns            6.13 ns
Time to encrypt one block       … ns (7 T_clk)  … ns (7 T_clk)   … ns            … ns
Throughput                      898 Mbps *      1.66 Gbps **     1.61 Gbps **    2.69 Gbps **

* 1 Mbps = 1,024 × 1,024 bps = 1,048,576 bps
** 1 Gbps = 1,024 × 1,024 × 1,024 bps = 1,073,741,824 bps

The two results for the two-round model have identical logic design and differ only in optimization strategy. The one indicated as Area Critical was optimized to reduce area as much as possible. The Speed Critical result, on the other hand, was obtained by minimizing the clock period. As shown in "Time to switch keys", one clock is spent on expanding the user key when the user key is switched to a new value.
Since the design uses one clock source for its whole area, the time to switch keys will be at least the minimum clock period, although the actual propagation delay in key expansion is less than the minimum clock period. Round keys are generated on the fly, and if the user key remains the same, 7 clock cycles are needed for both encryption and decryption.

The same optimization strategies were applied to the full-round model. However, in speed-critical optimization, the path from data input to the final output was optimized rather than minimizing the clock period as in the two-round model. Although the results of the full-round model are entered in the same table format as those of the two-round model, the numbers should not be compared directly because the two designs have different architectures. The following three explanations qualify the meaning of each item for the full-round model.

- Time to switch keys: time from the insertion of a new user key to the generation of the round 0 key (this is because the time taken to compute the next round key is much shorter than the time to compute one round transformation; by the time the $\gamma$ operation of the 1st round has been performed, all round keys from the 1st round to the 12th round are available).
- Key setup time: time needed to compute all 12 round keys.
- Time to encrypt one block: the longest path delay from data input to the final data output (the time taken to encrypt one block when all round keys are set up and ready to be used).

In the two-round model, the total area is larger than the sum of the encryption module and the key scheduling module because the area of the buffer for expanded keys and of the control unit is not included in that sum. The full-round model does not have any extra logic besides the encryption module and the key scheduling module, and thus the sum matches exactly.

The two main issues in logic design are speed and area. Because speed and area are traded off against each other in most cases, we get the best result on one criterion by optimizing for it while ignoring the other; the result is then the worst for the ignored criterion. We can find the approximate upper and lower bounds of speed and area of CRYPTON by optimizing once for area and once for speed. The main tradeoff between space and time takes place in the optimization of the S-box. We could obtain a speed as high as 2.6 Gbps by growing the area of the S-box, but the total number of gates was over 90,000.
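The relation between the table entries can be sketched as follows: throughput is the 128-bit block size divided by the time to encrypt one block, using the binary Mbps and Gbps units of the table footnotes. The gate counts and throughputs below are the Table 1 values; the block time and the throughput-per-gate figure that the script prints are derived from those rounded values and are only indicative. The function names and the per-gate metric are illustrative additions, not part of the original evaluation.

```python
MBPS = 1024 * 1024    # Table 1 footnote: 1 Mbps = 1,048,576 bps
GBPS = 1024 * MBPS    # Table 1 footnote: 1 Gbps = 1,073,741,824 bps
BLOCK_BITS = 128


def throughput_bps(time_per_block_s, block_bits=BLOCK_BITS):
    """Throughput in bits per second for a given time to encrypt one block."""
    return block_bits / time_per_block_s


def time_per_block_ns(throughput, block_bits=BLOCK_BITS):
    """Time to encrypt one block, in ns, implied by a throughput in bps."""
    return block_bits / throughput * 1e9


# (total gates, throughput in bps) as listed in Table 1.
TABLE1 = {
    "two-round, area critical": (18322, 898 * MBPS),
    "two-round, speed critical": (28179, 1.66 * GBPS),
    "full-round, area critical": (46259, 1.61 * GBPS),
    "full-round, speed critical": (93929, 2.69 * GBPS),
}


def bps_per_gate(name):
    """Throughput divided by total gate count for one design point."""
    gates, bps = TABLE1[name]
    return bps / gates


for name, (gates, bps) in TABLE1.items():
    print(f"{name:28s} {time_per_block_ns(bps):7.1f} ns/block "
          f"{bps_per_gate(name) / 1024:6.1f} Kbps/gate")

best = max(TABLE1, key=bps_per_gate)
print("highest throughput per gate:", best)
```

Ranking the four design points by throughput per gate is one way to read the area-speed tradeoff from the table. It does not account for the architectural differences that the text warns about when comparing the two models directly.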
By resorting to the speed-area tradeoff, a two-round model faster than the full-round model was possible, contrary to our initial scheme of using loop unrolling to make a fast design. In our estimation, the two-round model was found to be more moderate and practical in both area and speed than the full-round model. Although optimization in the synthesis tool was very useful for achieving a goal in speed or area, a better result was possible by modifying the design itself.

6. Conclusions

In this paper, we designed and proposed two hardware architectures of CRYPTON exploiting its inherent hardware-friendly features. The architectures were described in VHDL and circuits were synthesized using a 0.35 μm gate array with several speed-area tradeoffs. A speed of 0.9 Gbps with the smallest area of 18,000 gates, and the fastest speed of 2.6 Gbps with less than 100,000 gates, could be achieved. The speed of 2.6 Gbps is faster than the fastest commercially available Triple DES chip by an order of magnitude. This is enough speed to support Gigabit networks. Since CRYPTON has good scalability in gate count, a designer can select a proper speed-area tradeoff from a large set of choices.

References

1. L. Geppert and W. Sweet, "Technology 1999 Analysis & Forecast: Communications," IEEE Spectrum, January 1999.
2. D. Kosiur, Building and Managing Virtual Private Networks, Wiley.
3. Analog Devices, ADSP-2141L SafeNet DSP Preliminary Technical Data Sheet, REV.PrA, February.
4. VLSI Technology, VMS115 IPSec Coprocessor Data Sheet, Rev2.0, January.
5. C.H. Lim, "CRYPTON: A New 128-bit Block Cipher," Proceedings of the First Advanced Encryption Standard Candidate Conference (Ventura, California), National Institute of Standards and Technology (NIST), August.
6. C.H. Lim, "A Revised Version of CRYPTON: CRYPTON Version 1.0," Proceedings of the 1999 Fast Software Encryption Workshop, March.
7. NIST, "Request for Candidate Algorithm Nominations for the Advanced Encryption Standard (AES)," Federal Register, Vol. 62, No. 177, September 12.
8. M. Smid and E. Roback, "Developing the Advanced Encryption Standard," Proceedings of the 1999 RSA Conference, January.
9. E. Biham, "A Note on Comparing the AES Candidates," Proceedings of the Second Advanced Encryption Standard Candidate Conference (Rome, Italy), National Institute of Standards and Technology (NIST), March.
10. C.S.K. Clapp, "Instruction-level Parallelism in AES Candidates," Proceedings of the Second Advanced Encryption Standard Candidate Conference (Rome, Italy), National Institute of Standards and Technology (NIST), March.
11. B. Schneier, et al., "Performance Comparison of the AES Submissions," Proceedings of the Second Advanced Encryption Standard Candidate Conference (Rome, Italy), National Institute of Standards and Technology (NIST), March.

Hardware Design and Performance Estimation of the 128-bit Block Cipher CRYPTON

Hardware Design and Performance Estimation of the 128-bit Block Cipher CRYPTON Hardware Desig ad Performace Estimatio of the 128-bit Block Cipher CRYPTON Eujog Hog, Jai-Hoo Chug, ad Chae Hoo Lim Future Systems, Ic. 372-2 Yagjae-Dog, Seocho-Ku, Seoul, Korea 137-130 E-mail: {ejhog,

More information

Morgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5

Morgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5 Morga Kaufma Publishers 26 February, 28 COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 5 Set-Associative Cache Architecture Performace Summary Whe CPU performace icreases:

More information

Analysis Metrics. Intro to Algorithm Analysis. Slides. 12. Alg Analysis. 12. Alg Analysis

Analysis Metrics. Intro to Algorithm Analysis. Slides. 12. Alg Analysis. 12. Alg Analysis Itro to Algorithm Aalysis Aalysis Metrics Slides. Table of Cotets. Aalysis Metrics 3. Exact Aalysis Rules 4. Simple Summatio 5. Summatio Formulas 6. Order of Magitude 7. Big-O otatio 8. Big-O Theorems

More information

Improvement of the Orthogonal Code Convolution Capabilities Using FPGA Implementation

Improvement of the Orthogonal Code Convolution Capabilities Using FPGA Implementation Improvemet of the Orthogoal Code Covolutio Capabilities Usig FPGA Implemetatio Naima Kaabouch, Member, IEEE, Apara Dhirde, Member, IEEE, Saleh Faruque, Member, IEEE Departmet of Electrical Egieerig, Uiversity

More information

Chapter 3 Classification of FFT Processor Algorithms

Chapter 3 Classification of FFT Processor Algorithms Chapter Classificatio of FFT Processor Algorithms The computatioal complexity of the Discrete Fourier trasform (DFT) is very high. It requires () 2 complex multiplicatios ad () complex additios [5]. As

More information

CMSC Computer Architecture Lecture 12: Virtual Memory. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 12: Virtual Memory. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 12: Virtual Memory Prof. Yajig Li Uiversity of Chicago A System with Physical Memory Oly Examples: most Cray machies early PCs Memory early all embedded systems

More information
