Low-Area Implementations of SHA-3 Candidates

Size: px
Start display at page:

Download "Low-Area Implementations of SHA-3 Candidates"

Transcription

1 Jens-Peter Cryptographic Engineering Research Group (CERG) Department of ECE, Volgenau School of IT&E, George Mason University, Fairfax, VA, USA SHA-3 Project Review Meeting

2 Outline 2 3

3 Motivation Motivation Assumptions Interface Minimization Technique There have been several comparison of Throughput/Area optimized implementations [Gaj],[Matsuo],[Baldwin],[Guo]. Only few low-area implementations of single SHA-3 algorithms on FPGAs. Not all fully autonomous, Varying interface assumptions. Low-area implementations highlight flexibility of algorithm designs.

4 Motivation Motivation Assumptions Interface Minimization Technique There have been several comparison of Throughput/Area optimized implementations [Gaj],[Matsuo],[Baldwin],[Guo]. Only few low-area implementations of single SHA-3 algorithms on FPGAs. Not all fully autonomous, Varying interface assumptions. Low-area implementations highlight flexibility of algorithm designs. Goal First comprehensive comparison of low-area implementations of Round 2 SHA-3 Candidates. All use the same standardized interface. All optimized for the same parameters under the same assumptions.

5 Assumptions Motivation Assumptions Interface Minimization Technique Implementing for minimum area alone can lead to unrealistic run-times. Goal: Achieve the maximum Throughput/Area ratio for a given area budget. Realistic scenario: System on Chip: Certain area only available. Standalone: Smaller Chip, lower cost, but limit to smallest chip available, e.g. 768 slices on smallest Spartan 3 FPGA.

6 Assumptions Motivation Assumptions Interface Minimization Technique Target Implementing for minimum area alone can lead to unrealistic run-times. Goal: Achieve the maximum Throughput/Area ratio for a given area budget. Realistic scenario: System on Chip: Certain area only available. Standalone: Smaller Chip, lower cost, but limit to smallest chip available, e.g. 768 slices on smallest Spartan 3 FPGA. Xilinx Spartan 3e, low cost FPGA family Budget: 5 slices, Block RAM (BRAM)

7 Interface Motivation Assumptions Interface Minimization Technique Based on Interface and I/O Protocol from [Gaj], w=6. msg len ap, seq len ap (after padding ) in 32-bit words. msg len bp, seq len bp (before padding) in bits. n 2 msg len bp = seq len ap i 32 + seq len bp n i= n msg len ap = seq len ap i 32 w i= clk clk rst rst SHA Core w din dout src_ready dst_ready src_read dst_write w bits msg_len_ap msg_len_bp message w bits seq_len_ap seq seq_len_ap seq seq_len_ap n seq_len_bp n seq n a)sha Interface b)sha Protocol

8 Minimization Technique Motivation Assumptions Interface Minimization Technique Datapath Use BRAM to store state, initialization vectors, constants Use BRAM in each clock cycle Avoid temporary storage or use: Free registers, i.e. unused flip-flops after LUTs Shift Registers (x6 bit / Distributed RAM (x6 bit) LUT = 2 Slice Control Logic Small main state machine, up-to 8 states Counter for clock cycles in longest state Stored Program Control within states BRAM addressing must follow regular sequence, can have offset between rounds

9 Block RAM Motivation Assumptions Interface Minimization Technique BRAM Mode=write_first BRAM Mode=read_first data_in data_out data_in data_out Limits We use exclusively read first mode, i.e. old value is read, new value is written. Saves clock cycles, however, leads to address offset. Control logic might become difficult. Maximum 2 input and 2 output ports, 2 addresses (dual port). Maximum single port w/ 64 bits or dual port w/ 32 bits each.

10 BLAKE A B G G G G2 G G G G G2 G G2 G2 G G2 G G2 G2 G G2 G2 G G2 G G2 G G2 G G G G G G2 G2 G2 G2 Smallest implementation would be 2 G-function BRAM contention. Best result: 2 G-functions, pipelined as shown above. Keeps data in-flight longer, eliminates BRAM contention However, generation of addresses difficult Large control logic.

11 CubeHash Port A BRAM DRam B 3 Port B <<< 7 DRam A din Reg <<< DRam C dout Reg Initialization vector pre-processed stored in BRAM. BRAM contention requires swapping data between BRAM and Distributed RAM (DRAM). This requires additional clock cycles. Leads to simpler control logic. Finalization very costly at 9,296 clock cycles.

12 ECHO Port A BRAM 3 Port B reg A S Box S Box S Box S Box din 3 65 reg B MixCol key key dout 6 5 Mix-Columns contains 2 Mix-Column units with total 8x8 bit register. 4 logic based S-Boxes with integrated pipeline stage. Faster Mix-Columns would exceed 5 slices. Key Generation uses DRAM, 32 bit adder. Allows store salt. Small control unit with room for improvement. 3 init

13 Fugue Port A BRAM Port B S Box S Box S Box S Box Reg 32 >>>8 >>>6 >>>24 SMIX SMIX SMIX SMIX dout din 32 Reg logic based S-Boxes followed by pipeline stage. Only fixed rotations. Most time consuming function: SMIX. Challenge to store the S-Mix table.

14 Grøstl 6 5 dout din 3 dram Reg counter Port A BRAM Port B 7 S Box 7 S Box 3 Reg Reg Reg Reg mix mix Serialized P and Q. Only 2 S-Boxes due to Shift Rows can only use 2x8 bits. Uses single set of registers for two Mix Column units. 6 / 32 bit datapath 7 clock cycles per column.

15 JH input 5 5 reg 5 dout DRAM DRAM 7 7 Port A BRAM Port B rc_d grp_dgrp r_d 3 3 Uses BRAM and two 32x8 DRAMs. Grouping and de-grouping are the most expensive operations. 6 clock cycle operation. multiple narrow memory accesses.

16 Luffa 3 dout 5 6 Port A BRAM Port B MixCol reg reg reg 2 reg 3 din Scrumb reg DROM Reg Message injection uses serialized XORs. SubCrumb is implemented as DROM. Constraints: DRAM had to be used to keep control logic simple.

17 Shabal din Reg Port B 32 BRAM Port A SR2CP SR6 M <<<7 <<<7 dout W SR2 A [] UU new A BXOR [3] [9] [6] SR6 B new B Based on paper by [Detrey]. BRAM contains state register C and initialization vectors. Our I/O is more complex than [Detrey]. BRAM makes controller more complex.

18 SHAvite-3 dout Port A BRAM Port B reg reg reg 2 reg 3 S Box S Box S Box S Box dob reg 3 din 6 2 Reg S Box 5 4 ROM based S-Boxes. Regular path of the mds matrix is an advantage. Mix column is realized through a shift register.

19 SHA-2 Full Datapath 7 slices 65 clock cycles

20 SHA-2 Full Datapath 7 slices 65 clock cycles Small Datapath 52 slices 595 clock cycles

21 SHA-2 Full Datapath 7 slices 65 clock cycles Small Datapath 52 slices 595 clock cycles SHA-2 is not well suited for very small implementations.

22 Performance Equations Performance Equations Implementation Implementation Comparison Block Size (bits) b Clock Cycles to hash N blocks clk = Throughput b (l + p) T Algorithm st +( l + p) N + end BLAKE ( ) N /( 52 T ) CubeHash ( ) N /( 928 T ) ECHO ( ) N /(2545 T ) Fugue ( 2+ 6) N /( 63 T ) Grøstl ( 32+2) N /(52 T ) JH ( ) N /(66 T ) Luffa ( 6+ 66) N /( 622 T ) Shabal ( ) N /( 8 T ) SHAvite ( ) N /( 68 T ) SHA ( ) N /( 595 T )

23 Implementation Performance Equations Implementation Implementation Comparison Area (slices) Block RAMs Maximum Delay (ns) T Throughput (Mbps) Throughput/ Area (Mbps/slice) Throughput (Mbps) Throughput/ Area (Mbps/slice) Algorithm Large m Small m BLAKE CubeHash ECHO Fugue Grøstl JH Luffa Shabal SHAvite SHA

24 for Large Messages Performance Equations Implementation Implementation Comparison 8 Shabal 6 Throughtput [Mbits/s] 4 2 SHAvite 3 BLAKE Groestl ECHO Luffa CubeHash JH Fugue SHA Area [Slices]

25 for Short Messages Performance Equations Implementation Implementation Comparison 6 x BLAKE CubeHash ECHO Fugue Groestl JH Luffa Shabal SHAvite 3 CubeHash JH Latency (ns) 3 Fugue Luffa ECHO 2 BLAKE SHAvite Message Size (bytes) Groestl Shabal

26 Performance Equations Implementation Implementation Comparison Control Logic vs. Datapath 6 5 Datapath Controller 4 Area (Slices) 3 2 CubeHash JH Luffa Groestl Fugue SHAvite 3 Shabal ECHO SHA 2 BLAKE

27 Implementation Comparison Performance Equations Implementation Implementation Comparison Reference Area (slices) Block RAMs Maximum Delay (ns) I/O Width Datapath Width Clock Cycles (l + p) Functionality Throughput (Mbps) Throughput/ Area (Mbps/slice) Algorithm Device BLAKE-32 [?] xc3s5 FA BLAKE-32 TW xc3se FA BMW [?] xcv3 FA 9.. CubeHash [TW] xc3se FA 28.. ECHO [TW] xc3se FA ECHO [?] xc5vlx5-2 FA Fugue [TW] xc3se FA Grøstl [?] xc3s5 FA Grøstl [TW] xc3se FA JH [TW] xc3se FA Keccak [?] xc5vlx5 FA 7..6 Luffa [TW] xc3se FA Shabal [?] xc3s2 FA 8.6 Shabal [TW] xc3se FA Shavite-3 [TW] xc3se FA

28 Performance Equations Implementation Implementation Comparison Comparison of Candidate 2 GMU Others.5 V 2 Throughput/Area V 5.5 V 5 V 3 BLAKE 32 BMW Cubehash ECHO Fugue Groestl JH Keccak Luffa Shabal Shavite 3 SHA 2

29 Best Candidate Performance Equations Implementation Implementation Comparison 2 Virtex Spartan.5 Throughput/Area.5 BMW ECHO Keccak SHA 2 Blake CubeHash Fugue Groestl JH Luffa ShabalSHAvite 3

30 Performance Equations Implementation Implementation Comparison Thanks for your attention.

Lightweight Implementations of SHA-3 Candidates on FPGAs

Lightweight Implementations of SHA-3 Candidates on FPGAs Lightweight of SHA-3 Candidates on FPGAs Jens-Peter Kaps Panasayya Yalla Kishore Kumar Surapathi Bilal Habib Susheel Vadlamudi Smriti Gurung John Pham Cryptographic Engineering Research Group (CERG) http://cryptography.gmu.edu

More information

Implementation & Benchmarking of Padding Units & HMAC for SHA-3 candidates in FPGAs & ASICs

Implementation & Benchmarking of Padding Units & HMAC for SHA-3 candidates in FPGAs & ASICs Implementation & Benchmarking of Padding Units & HMAC for SHA-3 candidates in FPGAs & ASICs Ambarish Vyas Cryptographic Engineering Research Group (CERG) http://cryptography.gmu.edu Department of ECE,

More information

Fair and Comprehensive Methodology for Comparing Hardware Performance of Fourteen Round Two SHA-3 Candidates using FPGAs

Fair and Comprehensive Methodology for Comparing Hardware Performance of Fourteen Round Two SHA-3 Candidates using FPGAs Fair and Comprehensive Methodology for Comparing Hardware Performance of Fourteen Round Two SHA-3 Candidates using FPGAs Kris Gaj, Ekawat Homsirikamol, and Marcin Rogawski ECE Department, George Mason

More information

Use of Embedded FPGA Resources in Implementations of Five Round Three SHA-3 Candidates

Use of Embedded FPGA Resources in Implementations of Five Round Three SHA-3 Candidates Use of Embedded FPGA Resources in Implementations of Five Round Three SHA-3 Candidates Malik Umar Sharif, Rabia Shahid, Marcin Rogawski and Kris Gaj George Mason University, USA Agenda SHA-3 High Speed

More information

Use of Embedded FPGA Resources in Implementa:ons of 14 Round 2 SHA- 3 Candidates

Use of Embedded FPGA Resources in Implementa:ons of 14 Round 2 SHA- 3 Candidates Use of Embedded FPGA Resources in Implementa:ons of 14 Round 2 SHA- 3 Candidates Kris Gaj, Rabia Shahid, Malik Umar Sharif, and Marcin Rogawski George Mason University U.S.A. Co-Authors Rabia Shahid Malik

More information

ECE 545 Lecture 8b. Hardware Architectures of Secret-Key Block Ciphers and Hash Functions. George Mason University

ECE 545 Lecture 8b. Hardware Architectures of Secret-Key Block Ciphers and Hash Functions. George Mason University ECE 545 Lecture 8b Hardware Architectures of Secret-Key Block Ciphers and Hash Functions George Mason University Recommended reading K. Gaj and P. Chodowiec, FPGA and ASIC Implementations of AES, Chapter

More information

Vivado HLS Implementation of Round-2 SHA-3 Candidates

Vivado HLS Implementation of Round-2 SHA-3 Candidates Farnoud Farahmand ECE 646 Fall 2015 Vivado HLS Implementation of Round-2 SHA-3 Candidates Introduction NIST announced a public competition on November 2007 to develop a new cryptographic hash algorithm,

More information

Sharing Resources Between AES and the SHA-3 Second Round Candidates Fugue and Grøstl

Sharing Resources Between AES and the SHA-3 Second Round Candidates Fugue and Grøstl Sharing Resources Between AES and the SHA-3 Second Round Candidates Fugue and Grøstl Kimmo Järvinen Department of Information and Computer Science Aalto University, School of Science and Technology Espoo,

More information

GMU SHA Core Interface & Hash Function Performance Metrics

GMU SHA Core Interface & Hash Function Performance Metrics GMU SHA Core Interface & Hash Function Performance Metrics Interface Why Interface Matters? Pin limit Total number of i/o ports Total number of an FPGA i/o pins Support for the maximum throughput Time

More information

A High-Speed Unified Hardware Architecture for AES and the SHA-3 Candidate Grøstl

A High-Speed Unified Hardware Architecture for AES and the SHA-3 Candidate Grøstl A High-Speed Unified Hardware Architecture for AES and the SHA-3 Candidate Grøstl Marcin Rogawski Kris Gaj Cryptographic Engineering Research Group (CERG) http://cryptography.gmu.edu Department of ECE,

More information

Lightweight Implementations of SHA-3 Candidates on FPGAs

Lightweight Implementations of SHA-3 Candidates on FPGAs Lightweight Implementations of SHA-3 Candidates on FPGAs Jens-Peter Kaps, Panasayya Yalla, Kishore Kumar Surapathi, Bilal Habib, Susheel Vadlamudi, Smriti Gurung, and John Pham ECE Department, George Mason

More information

ECE 545. Digital System Design with VHDL

ECE 545. Digital System Design with VHDL ECE 545 Digital System Design with VHDL Course web page: ECE web page Courses Course web pages ECE 545 http://ece.gmu.edu/coursewebpages/ece/ece545/f10/ Kris Gaj Research and teaching interests: Contact:

More information

GMU SHA Core Interface & Hash Function Performance Metrics Interface

GMU SHA Core Interface & Hash Function Performance Metrics Interface GMU SHA Core Interface & Hash Function Performance Metrics Interface 1 Why Interface Matters? Pin limit Total number of i/o ports Total number of an FPGA i/o pins Support for the maximum throughput Time

More information

Environment for Fair and Comprehensive Performance Evalua7on of Cryptographic Hardware and So=ware. ASIC Status Update

Environment for Fair and Comprehensive Performance Evalua7on of Cryptographic Hardware and So=ware. ASIC Status Update Environment for Fair and Comprehensive Performance Evalua7on of Cryptographic Hardware and So=ware ASIC Status Update ECE Department, Virginia Tech Faculty - Patrick Schaumont, Leyla Nazhandali Students

More information

Two Hardware Designs of BLAKE-256 Based on Final Round Tweak

Two Hardware Designs of BLAKE-256 Based on Final Round Tweak Two Hardware Designs of BLAKE-256 Based on Final Round Tweak Muh Syafiq Irsyadi and Shuichi Ichikawa Dept. Knowledge-based Information Engineering Toyohashi University of Technology, Hibarigaoka, Tempaku,

More information

Groestl Tweaks and their Effect on FPGA Results

Groestl Tweaks and their Effect on FPGA Results Groestl Tweaks and their Effect on FPGA Results Marcin Rogawski and Kris Gaj George Mason University {kgaj, mrogawsk}@gmu.edu Abstract. In January 2011, Groestl team published tweaks to their specification

More information

Evaluation of Hardware Performance for the SHA-3 Candidates Using SASEBO-GII

Evaluation of Hardware Performance for the SHA-3 Candidates Using SASEBO-GII Evaluation of Hardware Performance for the SHA-3 Candidates Using SASEBO-GII Kazuyuki Kobayashi 1, Jun Ikegami 1, Shin ichiro Matsuo 2, Kazuo Sakiyama 1 and Kazuo Ohta 1 1 The University of Electro-Communications,

More information

SHA3 Core Specification. Author: Homer Hsing

SHA3 Core Specification. Author: Homer Hsing SHA3 Core Specification Author: Homer Hsing homer.hsing@gmail.com Rev. 0.1 January 29, 2013 This page has been intentionally left blank. www.opencores.org Rev 0.1 ii Rev. Date Author Description 0.1 01/29/2013

More information

Version 2.0, November 11, Stefan Tillich, Martin Feldhofer, Mario Kirschbaum, Thomas Plos, Jörn-Marc Schmidt, and Alexander Szekely

Version 2.0, November 11, Stefan Tillich, Martin Feldhofer, Mario Kirschbaum, Thomas Plos, Jörn-Marc Schmidt, and Alexander Szekely High-Speed Hardware Implementations of BLAKE, Blue Midnight Wish, CubeHash, ECHO, Fugue, Grøstl, Hamsi, JH, Keccak, Luffa, Shabal, SHAvite-3, SIMD, and Skein Version 2.0, November 11, 2009 Stefan Tillich,

More information

Federal standards NIST FIPS 46-1 DES FIPS 46-2 DES. FIPS 81 Modes of. operation. FIPS 46-3 Triple DES FIPS 197 AES. industry.

Federal standards NIST FIPS 46-1 DES FIPS 46-2 DES. FIPS 81 Modes of. operation. FIPS 46-3 Triple DES FIPS 197 AES. industry. ECE 646 Lecture 12 Federal Secret- cryptography Banking International Cryptographic Standards NIST FIPS 46-1 DES FIPS 46-2 DES FIPS 81 Modes of operation FIPS 46-3 Triple DES FIPS 197 AES X3.92 DES ANSI

More information

ECE 646 Lecture 12. Cryptographic Standards. Secret-key cryptography standards

ECE 646 Lecture 12. Cryptographic Standards. Secret-key cryptography standards ECE 646 Lecture 12 Cryptographic Standards Secret-key cryptography Federal Banking International NIST FIPS 46-1 DES FIPS 46-2 DES FIPS 81 Modes of operation FIPS 46-3 Triple DES FIPS 197 AES X3.92 DES

More information

Efficient Hardware Implementations of High Throughput SHA-3 Candidates Keccak, Luffa and Blue Midnight Wish for Single- and Multi-Message Hashing

Efficient Hardware Implementations of High Throughput SHA-3 Candidates Keccak, Luffa and Blue Midnight Wish for Single- and Multi-Message Hashing Efficient Hardware Implementations of High Throughput SHA-3 Candidates Keccak, Luffa and Blue Midnight Wish for Single- and Multi-Message Hashing Abdulkadir Akın, Aydın Aysu, Onur Can Ulusel, and Erkay

More information

SHA-3 interoperability

SHA-3 interoperability SHA-3 interoperability Daniel J. Bernstein Department of Computer Science (MC 152) The University of Illinois at Chicago Chicago, IL 60607 7053 djb@cr.yp.to 1 Introduction: You thought software upgrades

More information

Compact Hardware Implementations of ChaCha, BLAKE, Threefish, and Skein on FPGA

Compact Hardware Implementations of ChaCha, BLAKE, Threefish, and Skein on FPGA Compact Hardware Implementations of ChaCha, BLAKE, Threefish, and Skein on FPGA Nuray At, Jean-Luc Beuchat, Eiji Okamoto, İsmail San, and Teppei Yamazaki Department of Electrical and Electronics Engineering,

More information

ECE 545 Fall 2010 Exam 1

ECE 545 Fall 2010 Exam 1 ECE 545 Fall 2010 Exam 1 Introduction & tasks: The SHA-1 (Secure Hash Algorithm-1) circuit is specified below using its a. pseudocode b. block diagram of one of its units called Message Scheduler c. top-level

More information

Compact implementations of Grøstl, JH and Skein for FPGAs

Compact implementations of Grøstl, JH and Skein for FPGAs Compact implementations of Grøstl, JH and Skein for FPGAs Bernhard Jungk Hochschule RheinMain University of Applied Sciences Wiesbaden Rüsselsheim Geisenheim bernhard.jungk@hs-rm.de Abstract. This work

More information

AES as A Stream Cipher

AES as A Stream Cipher > AES as A Stream Cipher < AES as A Stream Cipher Bin ZHOU, Kris Gaj, Department of ECE, George Mason University Abstract This paper presents implementation of advanced encryption standard (AES) as a stream

More information

On Optimized FPGA Implementations of the SHA-3 Candidate Grøstl

On Optimized FPGA Implementations of the SHA-3 Candidate Grøstl On Optimized FPGA Implementations of the SHA-3 Candidate Grøstl Bernhard Jungk, Steffen Reith, and Jürgen Apfelbeck Fachhochschule Wiesbaden University of Applied Sciences {jungk reith}@informatik.fh-wiesbaden.de

More information

ECE 545 Fall 2013 Final Exam

ECE 545 Fall 2013 Final Exam ECE 545 Fall 2013 Final Exam Problem 1 Develop an ASM chart for the circuit EXAM from your Midterm Exam, described below using its A. pseudocode B. table of input/output ports C. block diagram D. interface

More information

Comparing Hardware Performance of Fourteen Round Two SHA-3 Candidates Using FPGAs

Comparing Hardware Performance of Fourteen Round Two SHA-3 Candidates Using FPGAs Comparing Hardware Performance of Fourteen Round Two SHA-3 Candidates Using FPGAs Ekawat Homsirikamol Marcin Rogawski Kris Gaj George Mason University ehomsiri, mrogawsk, kgaj@gmu.edu Last revised: Decemer

More information

CDA 4253 FPGA System Design Op7miza7on Techniques. Hao Zheng Comp S ci & Eng Univ of South Florida

CDA 4253 FPGA System Design Op7miza7on Techniques. Hao Zheng Comp S ci & Eng Univ of South Florida CDA 4253 FPGA System Design Op7miza7on Techniques Hao Zheng Comp S ci & Eng Univ of South Florida 1 Extracted from Advanced FPGA Design by Steve Kilts 2 Op7miza7on for Performance 3 Performance Defini7ons

More information

ECE 645: Lecture 1. Basic Adders and Counters. Implementation of Adders in FPGAs

ECE 645: Lecture 1. Basic Adders and Counters. Implementation of Adders in FPGAs ECE 645: Lecture Basic Adders and Counters Implementation of Adders in FPGAs Required Reading Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Design Chapter 5, Basic Addition and Counting,

More information

Implementation and Comparative Analysis of AES as a Stream Cipher

Implementation and Comparative Analysis of AES as a Stream Cipher Implementation and Comparative Analysis of AES as a Stream Cipher Bin ZHOU, Yingning Peng Dept. of Electronic Engineering, Tsinghua University, Beijing, China, 100084 e-mail: zhoubin06@mails.tsinghua.edu.cn

More information

Hardware Benchmarking of Cryptographic Algorithms Using High-Level Synthesis Tools: The SHA-3 Contest Case Study

Hardware Benchmarking of Cryptographic Algorithms Using High-Level Synthesis Tools: The SHA-3 Contest Case Study Hardware Benchmarking of Cryptographic Algorithms Using High-Level Synthesis Tools: The SHA-3 Contest Case Study Ekawat Homsirikamol and Kris Gaj Volgenau School of Engineering George Mason University

More information

FPGA Matrix Multiplier

FPGA Matrix Multiplier FPGA Matrix Multiplier In Hwan Baek Henri Samueli School of Engineering and Applied Science University of California Los Angeles Los Angeles, California Email: chris.inhwan.baek@gmail.com David Boeck Henri

More information

C vs. VHDL: Benchmarking CAESAR Candidates Using High- Level Synthesis and Register- Transfer Level Methodologies

C vs. VHDL: Benchmarking CAESAR Candidates Using High- Level Synthesis and Register- Transfer Level Methodologies C vs. VHDL: Benchmarking CAESAR Candidates Using High- Level Synthesis and Register- Transfer Level Methodologies Ekawat Homsirikamol, William Diehl, Ahmed Ferozpuri, Farnoud Farahmand, and Kris Gaj George

More information

A Zynq-based Testbed for the Experimental Benchmarking of Algorithms Competing in Cryptographic Contests

A Zynq-based Testbed for the Experimental Benchmarking of Algorithms Competing in Cryptographic Contests A Zynq-based Testbed for the Experimental Benchmarking of Algorithms Competing in Cryptographic Contests Farnoud Farahmand, Ekawat Homsirikamol, and Kris Gaj George Mason University Fairfax, Virginia 22030

More information

Hardware Implementation of the Code-based Key Encapsulation Mechanism using Dyadic GS Codes (DAGS)

Hardware Implementation of the Code-based Key Encapsulation Mechanism using Dyadic GS Codes (DAGS) Hardware Implementation of the Code-based Key Encapsulation Mechanism using Dyadic GS Codes (DAGS) Viet Dang and Kris Gaj ECE Department George Mason University Fairfax, VA, USA Introduction to DAGS The

More information

Compact Implementation of Threefish and Skein on FPGA

Compact Implementation of Threefish and Skein on FPGA Compact Implementation of Threefish and Skein on FPGA Nuray At, Jean-Luc Beuchat, and İsmail San Department of Electrical and Electronics Engineering, Anadolu University, Eskişehir, Turkey Email: {nat,

More information

Comparison of the Hardware Performance of the AES Candidates Using Reconfigurable Hardware

Comparison of the Hardware Performance of the AES Candidates Using Reconfigurable Hardware Comparison of the Hardware Performance of the AES Candidates Using Reconfigurable Hardware Master s Thesis Pawel Chodowiec MS CpE Candidate, ECE George Mason University Advisor: Dr. Kris Gaj, ECE George

More information

FPGA Implementations of SHA-3 Candidates: CubeHash, Grøstl, LANE, Shabal and Spectral Hash

FPGA Implementations of SHA-3 Candidates: CubeHash, Grøstl, LANE, Shabal and Spectral Hash FPGA Implementations of SHA-3 Candidates: CubeHash, Grøstl, LANE, Shabal and Spectral Hash Brian Baldwin, Andrew Byrne, Mark Hamilton, Neil Hanley, Robert P. McEvoy, Weibo Pan and William P. Marnane Claude

More information

Implementation and Analysis of the PRIMATEs Family of Authenticated Ciphers

Implementation and Analysis of the PRIMATEs Family of Authenticated Ciphers Implementation and Analysis of the PRIMATEs Family of Authenticated Ciphers Ahmed Ferozpuri Abstract Lightweight devices used for encrypted communication require a scheme that can operate in a low resource

More information

GMU Hardware API for Authen4cated Ciphers

GMU Hardware API for Authen4cated Ciphers GMU Hardware API for Authen4cated Ciphers Ekawat Homsirikamol, William Diehl, Ahmed Ferozpuri, Farnoud Farahmand, Malik Umar Sharif, and Kris Gaj George Mason University USA http:/cryptography.gmu.edu

More information

On the parallelization of slice-based Keccak implementations on Xilinx FPGAs

On the parallelization of slice-based Keccak implementations on Xilinx FPGAs On the parallelization of slice-based Keccak implementations on Xilinx FPGAs Jori Winderickx, Joan Daemen and Nele Mentens KU Leuven, ESAT/COSIC & iminds, Leuven, Belgium STMicroelectronics Belgium & Radboud

More information

FPGA Implementation of High Speed AES Algorithm for Improving The System Computing Speed

FPGA Implementation of High Speed AES Algorithm for Improving The System Computing Speed FPGA Implementation of High Speed AES Algorithm for Improving The System Computing Speed Vijaya Kumar. B.1 #1, T. Thammi Reddy.2 #2 #1. Dept of Electronics and Communication, G.P.R.Engineering College,

More information

Low-cost hardware implementations of Salsa20 stream cipher in programmable devices

Low-cost hardware implementations of Salsa20 stream cipher in programmable devices Journal of Polish Safety and Reliability Association Summer Safety and Reliability Seminars, Volume 4, Number 1, 2013 Wrocław University of Technology, Wrocław, Poland Low-cost hardware implementations

More information

Use of Embedded FPGA Resources in Implementations of Five Round Three SHA-3 Candidates

Use of Embedded FPGA Resources in Implementations of Five Round Three SHA-3 Candidates Use of Embedded FPGA Resources in Implementations of Five Round Three SHA-3 Candidates Malik Umar Sharif, Rabia Shahid, Marcin Rogawski, Kris Gaj Abstract In this paper, we present results of the comprehensive

More information

INTRODUCTION TO FPGA ARCHITECTURE

INTRODUCTION TO FPGA ARCHITECTURE 3/3/25 INTRODUCTION TO FPGA ARCHITECTURE DIGITAL LOGIC DESIGN (BASIC TECHNIQUES) a b a y 2input Black Box y b Functional Schematic a b y a b y a b y 2 Truth Table (AND) Truth Table (OR) Truth Table (XOR)

More information

A Low-Area yet Performant FPGA Implementation of Shabal

A Low-Area yet Performant FPGA Implementation of Shabal A Low-Area yet Performant FPGA Implementation of Shabal Jérémie Detrey, Pierrick Gaudry, and Karim Khalfallah 2 CARAMEL project-team, LORIA, INRIA / CNRS / Nancy Université, Campus Scientifique, BP 239,

More information

High-Speed Hardware for NTRUEncrypt-SVES: Lessons Learned Malik Umar Sharif, and Kris Gaj George Mason University USA

High-Speed Hardware for NTRUEncrypt-SVES: Lessons Learned Malik Umar Sharif, and Kris Gaj George Mason University USA High-Speed Hardware for NTRUEncrypt-SVES: Lessons Learned Malik Umar Sharif, and Kris Gaj George Mason University USA Partially supported by NIST under grant no. 60NANB15D058 1 Co-Author Malik Umar Sharif

More information

Pushing the Limits of SHA-3 Hardware Implementations to Fit on RFID

Pushing the Limits of SHA-3 Hardware Implementations to Fit on RFID Motivation Keccak Our Designs Results Comparison Conclusions 1 / 24 Pushing the Limits of SHA-3 Hardware Implementations to Fit on RFID Peter Pessl and Michael Hutter Motivation Keccak Our Designs Results

More information

Basic FPGA Architectures. Actel FPGAs. PLD Technologies: Antifuse. 3 Digital Systems Implementation Programmable Logic Devices

Basic FPGA Architectures. Actel FPGAs. PLD Technologies: Antifuse. 3 Digital Systems Implementation Programmable Logic Devices 3 Digital Systems Implementation Programmable Logic Devices Basic FPGA Architectures Why Programmable Logic Devices (PLDs)? Low cost, low risk way of implementing digital circuits as application specific

More information

Compact Dual Block AES core on FPGA for CCM Protocol

Compact Dual Block AES core on FPGA for CCM Protocol Compact Dual Block AES core on FPGA for CCM Protocol João Carlos C. Resende Ricardo Chaves 1 Compact Dual Block AES core on FPGA for CCM Protocol João CC Resende & Ricardo Chaves Outline Introduction &

More information

A Lightweight Implementation of Keccak Hash Function for Radio-Frequency Identification Applications

A Lightweight Implementation of Keccak Hash Function for Radio-Frequency Identification Applications A Lightweight Implementation of Keccak Hash Function for Radio-Frequency Identification Applications Elif Bilge Kavun and Tolga Yalcin Department of Cryptography Institute of Applied Mathematics, METU

More information

problem maximum score 1 8pts 2 6pts 3 10pts 4 15pts 5 12pts 6 10pts 7 24pts 8 16pts 9 19pts Total 120pts

problem maximum score 1 8pts 2 6pts 3 10pts 4 15pts 5 12pts 6 10pts 7 24pts 8 16pts 9 19pts Total 120pts University of California at Berkeley College of Engineering Department of Electrical Engineering and Computer Sciences EECS150 J. Wawrzynek Spring 2010 3/31/09 Name: ID number: Midterm Exam This is a closed-book,

More information

Hardware Performance Evaluation of SHA-3 Candidate Algorithms

Hardware Performance Evaluation of SHA-3 Candidate Algorithms Journal of Information Security, 2012, 3, 69-76 http://dx.doi.org/10.4236/jis.2012.32008 Published Online April 2012 (http://www.scirp.org/journal/jis) Hardware Performance Evaluation of SHA-3 Candidate

More information

Introduction to Field Programmable Gate Arrays

Introduction to Field Programmable Gate Arrays Introduction to Field Programmable Gate Arrays Lecture 1/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden, 31 May 9 June 2007 Javier Serrano, CERN AB-CO-HT Outline Historical introduction.

More information

EE 459/500 HDL Based Digital Design with Programmable Logic. Lecture 15 Memories

EE 459/500 HDL Based Digital Design with Programmable Logic. Lecture 15 Memories EE 459/500 HDL Based Digital Design with Programmable Logic Lecture 15 Memories 1 Overview Introduction Memories Read Only Memories Random Access Memories FIFOs 2 1 Motivation Most applications need memory!

More information

EECS150 - Digital Design Lecture 16 - Memory

EECS150 - Digital Design Lecture 16 - Memory EECS150 - Digital Design Lecture 16 - Memory October 17, 2002 John Wawrzynek Fall 2002 EECS150 - Lec16-mem1 Page 1 Memory Basics Uses: data & program storage general purpose registers buffering table lookups

More information

Fast implementation and fair comparison of the final candidates for Advanced Encryption Standard using Field Programmable Gate Arrays

Fast implementation and fair comparison of the final candidates for Advanced Encryption Standard using Field Programmable Gate Arrays Kris Gaj and Pawel Chodowiec Electrical and Computer Engineering George Mason University Fast implementation and fair comparison of the final candidates for Advanced Encryption Standard using Field Programmable

More information

Hardware Architectures

Hardware Architectures Hardware Architectures Secret-key Cryptography Public-key Cryptography Cryptanalysis AES & AES candidates estream candidates Hash Functions SHA-3 Montgomery Multipliers ECC cryptosystems Pairing-based

More information

Jaap van Ginkel Security of Systems and Networks

Jaap van Ginkel Security of Systems and Networks Jaap van Ginkel Security of Systems and Networks November 17, 2016 Part 3 Modern Crypto SSN Modern Cryptography Hashes MD5 SHA Secret key cryptography AES Public key cryptography DES Presentations Minimum

More information

Benchmarking of Cryptographic Algorithms in Hardware. Ekawat Homsirikamol & Kris Gaj George Mason University USA

Benchmarking of Cryptographic Algorithms in Hardware. Ekawat Homsirikamol & Kris Gaj George Mason University USA Benchmarking of Cryptographic Algorithms in Hardware Ekawat Homsirikamol & Kris Gaj George Mason University USA 1 Co-Author Ekawat Homsirikamol a.k.a Ice Working on the PhD Thesis entitled A New Approach

More information

Fast implementations of secret-key block ciphers using mixed inner- and outer-round pipelining

Fast implementations of secret-key block ciphers using mixed inner- and outer-round pipelining Pawel Chodowiec, Po Khuon, Kris Gaj Electrical and Computer Engineering George Mason University Fast implementations of secret-key block ciphers using mixed inner- and outer-round pipelining http://ece.gmu.edu/crypto-text.htm

More information

Field Programmable Gate Array (FPGA)

Field Programmable Gate Array (FPGA) Field Programmable Gate Array (FPGA) Lecturer: Krébesz, Tamas 1 FPGA in general Reprogrammable Si chip Invented in 1985 by Ross Freeman (Xilinx inc.) Combines the advantages of ASIC and uc-based systems

More information

Topics. Midterm Finish Chapter 7

Topics. Midterm Finish Chapter 7 Lecture 9 Topics Midterm Finish Chapter 7 Xilinx FPGAs Chapter 7 Spartan 3E Architecture Source: Spartan-3E FPGA Family Datasheet CLB Configurable Logic Blocks Each CLB contains four slices Each slice

More information

Digital Logic & Computer Design CS Professor Dan Moldovan Spring 2010

Digital Logic & Computer Design CS Professor Dan Moldovan Spring 2010 Digital Logic & Computer Design CS 434 Professor Dan Moldovan Spring 2 Copyright 27 Elsevier 5- Chapter 5 :: Digital Building Blocks Digital Design and Computer Architecture David Money Harris and Sarah

More information

ALTERA M9K EMBEDDED MEMORY BLOCKS

ALTERA M9K EMBEDDED MEMORY BLOCKS ALTERA M9K EMBEDDED MEMORY BLOCKS M9K Overview M9K memories are Altera s embedded high-density memory arrays Nearly all modern FPGAs include something similar of varying sizes 8192 bits per block (9216

More information

EECS 151/251A Spring 2019 Digital Design and Integrated Circuits. Instructor: John Wawrzynek. Lecture 18 EE141

EECS 151/251A Spring 2019 Digital Design and Integrated Circuits. Instructor: John Wawrzynek. Lecture 18 EE141 EECS 151/251A Spring 2019 Digital Design and Integrated Circuits Instructor: John Wawrzynek Lecture 18 Memory Blocks Multi-ported RAM Combining Memory blocks FIFOs FPGA memory blocks Memory block synthesis

More information

ECE 699: Lecture 9. Programmable Logic Memories

ECE 699: Lecture 9. Programmable Logic Memories ECE 699: Lecture 9 Programmable Logic Memories Recommended reading XST User Guide for Virtex-6, Spartan-6, and 7 Series Devices Chapter 7, HDL Coding Techniques Sections: RAM HDL Coding Techniques ROM

More information

Outline of Presentation

Outline of Presentation Built-In Self-Test for Programmable I/O Buffers in FPGAs and SoCs Sudheer Vemula and Charles Stroud Electrical and Computer Engineering Auburn University presented at 2006 IEEE Southeastern Symp. On System

More information

Hardware Implementation of Cryptosystem by AES Algorithm Using FPGA

Hardware Implementation of Cryptosystem by AES Algorithm Using FPGA Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology ISSN 2320 088X IMPACT FACTOR: 6.017 IJCSMC,

More information

Topic #6. Processor Design

Topic #6. Processor Design Topic #6 Processor Design Major Goals! To present the single-cycle implementation and to develop the student's understanding of combinational and clocked sequential circuits and the relationship between

More information

AES Core Specification. Author: Homer Hsing

AES Core Specification. Author: Homer Hsing AES Core Specification Author: Homer Hsing homer.hsing@gmail.com Rev. 0.1.1 October 30, 2012 This page has been intentionally left blank. www.opencores.org Rev 0.1.1 ii Revision History Rev. Date Author

More information

HDL Coding Style Xilinx, Inc. All Rights Reserved

HDL Coding Style Xilinx, Inc. All Rights Reserved HDL Coding Style Objective After completing this module, you will be able to: Select a proper coding style to create efficient FPGA designs Specify Xilinx resources that need to be instantiated for various

More information

Programmable Logic. Simple Programmable Logic Devices

Programmable Logic. Simple Programmable Logic Devices Programmable Logic SM098 Computation Structures - Programmable Logic Simple Programmable Logic evices Programmable Array Logic (PAL) AN-OR arrays are common blocks in SPL and CPL architectures Implements

More information

HIGH DATA RATE 8-BIT CRYPTO PROCESSOR

HIGH DATA RATE 8-BIT CRYPTO PROCESSOR HIGH DATA RATE 8-BIT CRYPTO PROCESSOR Sheikh M Farhan, Habibullah Jamal, Mohsin Rahmatullah University of Engineering and Technology, Taxila, Pakistan smfarhan@carepvtltd.com, (+92-51-2874794), 19-Ataturk

More information

Implementation of Pipelined Architecture Based on the DCT and Quantization For JPEG Image Compression

Implementation of Pipelined Architecture Based on the DCT and Quantization For JPEG Image Compression Volume 01, No. 01 www.semargroups.org Jul-Dec 2012, P.P. 60-66 Implementation of Pipelined Architecture Based on the DCT and Quantization For JPEG Image Compression A.PAVANI 1,C.HEMASUNDARA RAO 2,A.BALAJI

More information

CHAPTER 4 BLOOM FILTER

CHAPTER 4 BLOOM FILTER 54 CHAPTER 4 BLOOM FILTER 4.1 INTRODUCTION Bloom filter was formulated by Bloom (1970) and is used widely today for different purposes including web caching, intrusion detection, content based routing,

More information

Compact FPGA Implementations of the Five SHA-3 Finalists

Compact FPGA Implementations of the Five SHA-3 Finalists Compact FPGA Implementations of the Five SHA-3 Finalists Stéphanie Kerckhof 1,François Durvaux 1, Nicolas Veyrat-Charvillon 1, Francesco Regazzoni 1, Guerric Meurice de Dormale 2,andFrançois-Xavier Standaert

More information

CAESAR Hardware API. Ekawat Homsirikamol, William Diehl, Ahmed Ferozpuri, Farnoud Farahmand, Panasayya Yalla, Jens-Peter Kaps, and Kris Gaj

CAESAR Hardware API. Ekawat Homsirikamol, William Diehl, Ahmed Ferozpuri, Farnoud Farahmand, Panasayya Yalla, Jens-Peter Kaps, and Kris Gaj CAESAR Hardware API Ekawat Homsirikamol, William Diehl, Ahmed Ferozpuri, Farnoud Farahmand, Panasayya Yalla, Jens-Peter Kaps, and Kris Gaj Cryptographic Engineering Research Group George Mason University

More information

Core Facts. Documentation Design File Formats. Verification Instantiation Templates Reference Designs & Application Notes Additional Items

Core Facts. Documentation Design File Formats. Verification Instantiation Templates Reference Designs & Application Notes Additional Items (ULFFT) November 3, 2008 Product Specification Dillon Engineering, Inc. 4974 Lincoln Drive Edina, MN USA, 55436 Phone: 952.836.2413 Fax: 952.927.6514 E-mail: info@dilloneng.com URL: www.dilloneng.com Core

More information

Design and Implementation of Multi-Rate Encryption Unit Based on Customized AES

Design and Implementation of Multi-Rate Encryption Unit Based on Customized AES International Journal of Video & Image Processing and Network Security IJVIPNS-IJENS Vol: 11 No: 06 6 Design and Implementation of Multi-Rate Encryption Unit Based on Customized AES Ashraf D. Elbayoumy,

More information

Today. Comments about assignment Max 1/T (skew = 0) Max clock skew? Comments about assignment 3 ASICs and Programmable logic Others courses

Today. Comments about assignment Max 1/T (skew = 0) Max clock skew? Comments about assignment 3 ASICs and Programmable logic Others courses Today Comments about assignment 3-43 Comments about assignment 3 ASICs and Programmable logic Others courses octor Per should show up in the end of the lecture Mealy machines can not be coded in a single

More information

ECE 448 Lecture 13. FPGA Memories. George Mason University

ECE 448 Lecture 13. FPGA Memories. George Mason University ECE 448 Lecture 13 FPGA Memories George Mason University Recommended reading Spartan-6 FPGA Block RAM Resources: User Guide Google search: UG383 Spartan-6 FPGA Configurable Logic Block: User Guide Google

More information

An 80Gbps FPGA Implementation of a Universal Hash Function based Message Authentication Code

An 80Gbps FPGA Implementation of a Universal Hash Function based Message Authentication Code An 8Gbps FPGA Implementation of a Universal Hash Function based Message Authentication Code Abstract We developed an architecture optimization technique called divide-and-concatenate and applied it to

More information

Fundamentals of Linux Platform Security

Fundamentals of Linux Platform Security Fundamentals of Linux Platform Security Security Training Course Dr. Charles J. Antonelli The University of Michigan 2012 Linux Platform Security Module 2 Password Authentication Roadmap Password Authentication

More information

PERFORMANCE ANALYSIS OF HIGH EFFICIENCY LOW DENSITY PARITY-CHECK CODE DECODER FOR LOW POWER APPLICATIONS

PERFORMANCE ANALYSIS OF HIGH EFFICIENCY LOW DENSITY PARITY-CHECK CODE DECODER FOR LOW POWER APPLICATIONS American Journal of Applied Sciences 11 (4): 558-563, 2014 ISSN: 1546-9239 2014 Science Publication doi:10.3844/ajassp.2014.558.563 Published Online 11 (4) 2014 (http://www.thescipub.com/ajas.toc) PERFORMANCE

More information

Pipelining & Verilog. Sequential Divider. Verilog divider.v. Math Functions in Coregen. Lab #3 due tonight, LPSet 8 Thurs 10/11

Pipelining & Verilog. Sequential Divider. Verilog divider.v. Math Functions in Coregen. Lab #3 due tonight, LPSet 8 Thurs 10/11 Lab #3 due tonight, LPSet 8 Thurs 0/ Pipelining & Verilog Latency & Throughput Pipelining to increase throughput Retiming Verilog Math Functions Debugging Hints Sequential Divider Assume the Divid (A)

More information

Hardware Design with VHDL Design Example: BRAM ECE 443

Hardware Design with VHDL Design Example: BRAM ECE 443 BRAM There are two sources of memory available on most FPGA boards. Internal (on-chip memory) External SRAMs and DRAMs. Internal memory is either distributed (from the LUTs) or block (dedicated on-chip

More information

The Xilinx XC6200 chip, the software tools and the board development tools

The Xilinx XC6200 chip, the software tools and the board development tools The Xilinx XC6200 chip, the software tools and the board development tools What is an FPGA? Field Programmable Gate Array Fully programmable alternative to a customized chip Used to implement functions

More information

ECE 485/585 Microprocessor System Design

ECE 485/585 Microprocessor System Design Microprocessor System Design Lecture 4: Memory Hierarchy Memory Taxonomy SRAM Basics Memory Organization DRAM Basics Zeshan Chishti Electrical and Computer Engineering Dept Maseeh College of Engineering

More information

Low area implementation of AES ECB on FPGA

Low area implementation of AES ECB on FPGA Total AddRoundkey_3 MixCollumns AddRoundkey_ ShiftRows SubBytes 1 Low area implementation of AES ECB on FPGA Abstract This project aimed to create a low area implementation of the Rajindael cipher (AES)

More information

Binary Adders. Ripple-Carry Adder

Binary Adders. Ripple-Carry Adder Ripple-Carry Adder Binary Adders x n y n x y x y c n FA c n - c 2 FA c FA c s n MSB position Longest delay (Critical-path delay): d c(n) = n d carry = 2n gate delays d s(n-) = (n-) d carry +d sum = 2n

More information

EECS150 - Digital Design Lecture 16 Memory 1

EECS150 - Digital Design Lecture 16 Memory 1 EECS150 - Digital Design Lecture 16 Memory 1 March 13, 2003 John Wawrzynek Spring 2003 EECS150 - Lec16-mem1 Page 1 Memory Basics Uses: Whenever a large collection of state elements is required. data &

More information

Efficient Hardware Realization of Advanced Encryption Standard Algorithm using Virtex-5 FPGA

Efficient Hardware Realization of Advanced Encryption Standard Algorithm using Virtex-5 FPGA IJCSNS International Journal of Computer Science and Network Security, VOL.9 No.9, September 2009 59 Efficient Hardware Realization of Advanced Encryption Standard Algorithm using Virtex-5 FPGA Muhammad

More information

Digital Signal Processing for Analog Input

Digital Signal Processing for Analog Input Digital Signal Processing for Analog Input Arnav Agharwal Saurabh Gupta April 25, 2009 Final Report 1 Objective The object of the project was to implement a Fast Fourier Transform. We implemented the Radix

More information

NIST SHA-3 ASIC Datasheet

NIST SHA-3 ASIC Datasheet NIST SHA-3 ASIC Datasheet -- NIST SHA-3 Competition Five Finalists on a Chip (Version 1.1) Technology: IBM MOSIS 0.13µm CMR8SF-RVT Standard-Cell Library: ARM s Artisan SAGE-X V2.0 Area: 5mm 2 (Core: 1.656mm

More information

ECE 448 Lecture 5. FPGA Devices

ECE 448 Lecture 5. FPGA Devices ECE 448 Lecture 5 FPGA Devices George Mason University Required reading Spartan-6 FPGA Configurable Logic Block: User Guide CLB Overview Slice Description 2 Recommended reading Highly recommended for the

More information