High-Performance Integer Factoring with Reconfigurable Devices
|
|
- Osborne Griffin
- 5 years ago
- Views:
Transcription
1 FPL 2010, Milan, August 31st September 2nd, 2010 High-Performance Integer Factoring with Reconfigurable Devices Ralf Zimmermann, Tim Güneysu, Christof Paar Horst Görtz Institute for IT-Security Ruhr-University Bochum
2 Outline Introduction Arithmetic using modern s ECM System Layout Results Conclusions & Future Work
3 Outline Introduction Arithmetic using modern s ECM System Layout Results Conclusions & Future Work
4 Introduction to Elliptic Curve Method Why ECM? Factor small numbers Low memory requirements Composite number ~ 200 bits ECM prime factor Nice to know but importance?
5 Introduction to Elliptic Curve Method Why ECM? Factor small numbers Low memory requirements Number Field Sieve RSA modulus 1024 Bits ECM ECM ECM ECM prime factors n = pq Nice to know but useful for co-factorization (NFS)
6 Existing implementations Software: GMP-ECM (P. Zimmermann et al.) Variant: GMP-EECM (Bernstein et al. 2008) GPUs: GPU-ECM (Bernstein et al. 2008) Simka et al. (FPL 2005) Montgomery Curves Virtex-E 2000 w/ embedded micro-controller Proof-of-concept implementation Proof of Concept Gaj et al. (CHES 2006, IEEE Transaction on Computers (Sep 2010)) Implementation on high and low cost s Suggestions to Phase 2 implementation 2nd Phase De Meulenaer et al. (FCCM 2007) Uses DSP hardcores to implement Phase 1 No implementation of Phase 2 (possible?)
7 How to implement ECM? ECM How? Phase 1: scalar multiplication Montgomery Ladder Phase 2: more complex Standard continuation in hardware (rough overview) Precomputations Scalar multiplications jq Storing primes in a memory Main computations point addition accumulation of product d
8 Design Goals and Decisions Usage of Elliptic Curve Method (ECM) in co-factorization Goals Implement both Phases 1 and 2 Usable on COPACOBANA v4 2nd Phase Host PC + COPACOBANA 1. Pre-Compute Initial values 2. Send Curve parameters, moduli 3. Compute final gcd Curves in Montgomery Form Efficient formulas for hardware Computation/storage of y-coordinate can be omitted
9 Outline Introduction Arithmetic using modern s ECM System Layout Results Conclusions & Future Work
10 Modern s Programmable elements Configurable Logic Blocks (CLB) Many, many (even more ) CLBs Connection via interconnect (switching matrix) Modern s contain hardcores Dedicated memory elements Arithmetic hardcores, e.g., to accelerate integer multiplication and addition Embedded PowerPC processors High-speed I/O Transceivers
11 Embedded Memory Elements (Virtex-4) 18kbit storage element (BlockRAM) 400 MHz with connected output register DINA DINB Flexible BRAM configuration Dual-Port storage Single-port storage RAM or ROM ADDRA WEA ADDRB WEB Cascading possible DOUTA DOUTB
12 Digital Signal Processing (Virtex-4) DSP Slice: two adjacent DSP Fast signed 18x18-bit multiplication and signed 48-bit addition/subtraction (400 MHz) Integrated pipeline register Controllable by an OPMODE signal DSP elements can be cascaded
13 Basic Modular Addition Input: Modulus M Operands A M, B M Output: A+B (mod M) A B M R = A+B S = A+B-M If (b = 1) then Return R Else Return S + 0 R borrow (b) S
14 Implemented Mod. Addition/Subtraction Addition of two b*17-bit numbers: b b clock cycles Two designs: one for Virtex-4 DSPs, one for new DSP types Non Virtex-4 Virtex-4 Flipflops Slices Input LUTs DSP 2 Cycles 2*b + 3 Frequency 400 MHz
15 Basic Modular Multiplication Modular Multiplication with Quotient Pipelining (Orup) Input: Modulus dependent par M (M, k, d), operands A, B Output: S A B R -1 (mod M) block selection of S i,0 shift by k bits block of k bits a 0 b 0 m 0 S i,0 a 1 aa 2 b 1 b 2 m 1 M m 2 a 3 a n b 3 b n m 3 m n S i,1 S i,2 i,j S i,3 S i,n i,4 // n is number of rounds for i = 0 to n do q i = S i (mod 2 k ) scalar multiplication 0 S i+1 = S i /2 k + q i M + b i A Return S n+1 accumulation
16 Multiplication: Parallel vs. Sequential Parallel approach: Use b DSPs to multiply two b*17 bit values (b+1) * 6 cycles Sequential approach: Use 3 DSPs to multiply two b*17 bit values 6 + (b+2)*b cycles Example: b=10 7 DSPs less at cost of 60 clock cycles DSP cycles
17 Implemented Modular Multiplication II Multiplication of two 17*b bit numbers: b * (b + 1) b clock cycles Requirements: 5 b; (b+1)-th block = 0 Memory alignment: 5 b 15 Virtex-4 Flipflops 131 Slices 73 4-Input LUTs 83 DSP 3 Cycles b*(b + 2) + 6 Frequency 400 MHz
18 Outline Introduction Arithmetic using modern s ECM System Layout Results Conclusions & Future Work
19 COPACOBANA: Original Architecture host backplane USB DIMM Module 1 controller card DIMM Module 2 Controller DIMM Module 20 l bus data address Backplane with plug-in slots can host up to 20 DIMM-sized modules 6 x low-cost Xilinx Spartan-3 s (XC3S1000) per module Shared 64-bit data and 16-bit address connection on backplane (bi-directional) Controller connects PC with s in a slow Master-Slave scheme (3 MBit/s)
20 ECM System on COPACOBANA Backplane Gigabit Ethernet Control modul Control DIMM modul 1 Virtex-4 DIMM modul 2 Virtex-4 Virtex-4 Virtex-4 Virtex-4 Virtex-4 Virtex-4 Virtex-4 Virtex-4 Virtex-4 Virtex-4 Virtex-4 Virtex-4 Virtex-4 Virtex-4 Virtex-4... DIMM modul 16 Virtex-4 Virtex-4 Virtex-4 Virtex-4 Virtex-4 Virtex-4 Virtex-4 Virtex-4 Data Address Modified COPACOBANA, planned 2006 Working with Virtex-4 SX35 s: 192 DSPs, 192 BRAM Data transfer using Ethernet Host Computer: performs (simple) pre-computations, i.e., curve generation COBACOBANA: performs cost intensive operations, i.e., elliptic curve arithmetic
21 ECM System Architecture
22 Elliptic Curve Processor
23 Workspace Memory Cell = 16 times 17-bit = 272 bit Block = 2 Cells = one Point Addressed by instruction ROM Fast in/out for arithmetic Units Needs 1 BlockRAM resource
24 Instructions 1 BlockRAM core using both ports 54 bits per instruction ( bits) 326 instructions for Phase 1 + Phase 2 Space for more
25 Outline Introduction Arithmetic using modern s ECM System Layout Results Conclusions & Future Work
26 Implementation of Phase Pre-computation using Addition Chain Reduce expensive Pt. Adds Use cheap Pt. Dbl No expensive Pt. Dbl + Add Gaj et al. This work 0 Pt. Dbl Pt. Add (D+A)* Improved/Fixed main computation Scanning algorithm Considering another special case
27 Scalability Related work - target bit-length? Scalable system Fixed layout Exchangeable ROM
28 Timing details Operation Cycles Factor Frequency (MHz) Total Cycles Top -> ECM P1 Pre: 2P P1 Step: 2P + (P+Q) ,312 P2 Pre: P+Q ,030 P2 Pre: 2P ,038 P2: Start Phase P2: Init Round ,480 P2: " " ,784 P2: " " ,226 P2: P+Q ,266 P2: End Phase sync results ECM -> Top Complete Run 1,969,878 Data transfer Phase 1 Phase 2 Data transfer Results for ECM Phase (b = 10, B1 = 960 and B2 = 57000)
29 Results for ECM Phase type Virtex-4 SX 35 Target bit-length 151 Number of ECM units 24 Maximal clock frequency 200 Number of instructions for ECM Phase ,969,878 Time for ECM Phase ms Number of ECM calculations per second 2424 ECM Phase using B1 = 960, B2 = [1376 bit k] 310,272 ECM calculations / second on COPACOBANA 1 1 Optimal COPACOBANA using 16 modules (= 128 )
30 Comparison Apple equals Orange? Different Families Different approaches Incomplete information (speed grade, packaging,..) Synthesis vs Place-and-Route results compare only Virtex 4 implementations Metric: resource utilization? Impossible without designs Let s This normalize does not work! then! Metric: cost-performance? Varing pricing models (EasyPath, volume discounts) Highly non-linear prices (V4LX200 vs V4LX35)
31 Comparison (ECM Phase 1) 198 bits vs 202 bits Arithmetic: Fabric vs. DSPs Improvement by factor 2.24 resp. 37
32 Comparison (ECM Phase 1) 135 bits vs 134 bits Fully pipelined vs non-pipelined DSP Arithmetic: Mult vs Mult+Add Phase 1 vs Phase 1+2: use ~59x parameter size
33 COPACOBANA v4 Results COPACOBANA v4: ECM/s using multiple s COPACOBANA v4: Efficiency using multiple s 80% % 60% % % 30% 20% Expected Achieved 10% 0% Efficiency Bottleneck: COPACOBANA v4 Interface
34 COPACOBANA v4 Bottleneck Bus Architecture Slow single master Bus no interrupts No parallel read/write operations Controller No fast Ethernet No PCI-Express USB with ~40 Mbps Processor Solution: Rivyera Slow CPU Additional host Power consumption, Network latency,
35 Outline Introduction Arithmetic using modern s Implementation ECM System Layout Conclusions & Future Work
36 Conclusions Novel and complete implementation Fastest implementation of phase on s Block parameter b = 10 Place & route results 2nd Phase Generic and scalable system Supports bit integers Exchange instruction ROM Only small changes in fabric/resources Highly parallel architecture for co-factorization Multiple ECM-units per (24 units on Virtex-4 SX 35) Sadly: COPACOBANA cluster (max 128 Virtex-4 SX 35) not suitable
37 Future Work Improve Phase 1 Sliding Window algorithm Enough memory available Modify the instruction ROM Improvement for extensive Phase 1 Evaluate different technologies: Virtex-4 SX $ 192 DSPs Virtex-6 LX130T 1000 $ 480 DSPs (up to 600 MHz) Spartan-6 LX $ 180 DSPs (up to 250 MHz) COPACOBANA v4 RIVYERA Fast data bus Two ring networks Intel i7 CPU
38 Thank you for your attention! Any Questions? Ralf Zimmermann, Tim Güneysu, Christof Paar
Enhancing COPACOBANA for Advanced Applications in Cryptography and Cryptanalysis
Enhancing COPACOBANA for Advanced Applications in Cryptography and Cryptanalysis Tim Güneysu, Christof Paar, Gerd Pfeiffer, Manfred Schimmler Horst Görtz Institute for IT Security, Ruhr University Bochum,
More informationECC on Your Fingertips: A Single Instruction Approach for Lightweight ECC Design in GF(p)
ECC on Your Fingertips: A Single Instruction Approach for Lightweight ECC Design in GF(p) Debapriya Basu Roy, Poulami Das and Debdeep Mukhopadhyay June 19, 2015 Debapriya Basu Roy ECC on Your Fingertips
More informationFPGA Implementation of High Throughput Circuit for Trial Division by Small Primes
FPGA Implementation of High Throughput Circuit for Trial Division by Small Primes Gabriel Southern, Chris Mason, Lalitha Chikkam, Patrick Baier, and Kris Gaj George Mason University {gsouther, cmason4,
More informationThe Next Generation 65-nm FPGA. Steve Douglass, Kees Vissers, Peter Alfke Xilinx August 21, 2006
The Next Generation 65-nm FPGA Steve Douglass, Kees Vissers, Peter Alfke Xilinx August 21, 2006 Hot Chips, 2006 Structure of the talk 65nm technology going towards 32nm Virtex-5 family Improved I/O Benchmarking
More informationCPE/EE 422/522. Introduction to Xilinx Virtex Field-Programmable Gate Arrays Devices. Dr. Rhonda Kay Gaede UAH. Outline
CPE/EE 422/522 Introduction to Xilinx Virtex Field-Programmable Gate Arrays Devices Dr. Rhonda Kay Gaede UAH Outline Introduction Field-Programmable Gate Arrays Virtex Virtex-E, Virtex-II, and Virtex-II
More informationBasic FPGA Architecture Xilinx, Inc. All Rights Reserved
Basic FPGA Architecture 2005 Xilinx, Inc. All Rights Reserved Objectives After completing this module, you will be able to: Identify the basic architectural resources of the Virtex -II FPGA List the differences
More informationField Programmable Gate Array (FPGA) Devices
Field Programmable Gate Array (FPGA) Devices 1 Contents Altera FPGAs and CPLDs CPLDs FPGAs with embedded processors ACEX FPGAs Cyclone I,II FPGAs APEX FPGAs Stratix FPGAs Stratix II,III FPGAs Xilinx FPGAs
More informationFPGA architecture and design technology
CE 435 Embedded Systems Spring 2017 FPGA architecture and design technology Nikos Bellas Computer and Communications Engineering Department University of Thessaly 1 FPGA fabric A generic island-style FPGA
More informationBasic FPGA Architectures. Actel FPGAs. PLD Technologies: Antifuse. 3 Digital Systems Implementation Programmable Logic Devices
3 Digital Systems Implementation Programmable Logic Devices Basic FPGA Architectures Why Programmable Logic Devices (PLDs)? Low cost, low risk way of implementing digital circuits as application specific
More informationDigital Integrated Circuits
Digital Integrated Circuits Lecture 9 Jaeyong Chung Robust Systems Laboratory Incheon National University DIGITAL DESIGN FLOW Chung EPC6055 2 FPGA vs. ASIC FPGA (A programmable Logic Device) Faster time-to-market
More informationINTRODUCTION TO FPGA ARCHITECTURE
3/3/25 INTRODUCTION TO FPGA ARCHITECTURE DIGITAL LOGIC DESIGN (BASIC TECHNIQUES) a b a y 2input Black Box y b Functional Schematic a b y a b y a b y 2 Truth Table (AND) Truth Table (OR) Truth Table (XOR)
More informationFPGA Architecture Overview. Generic FPGA Architecture (1) FPGA Architecture
FPGA Architecture Overview dr chris dick dsp chief architect wireless and signal processing group xilinx inc. Generic FPGA Architecture () Generic FPGA architecture consists of an array of logic tiles
More informationProgrammable Logic Design Grzegorz Budzyń Lecture. 15: Advanced hardware in FPGA structures
Programmable Logic Design Grzegorz Budzyń Lecture 15: Advanced hardware in FPGA structures Plan Introduction PowerPC block RocketIO Introduction Introduction The larger the logical chip, the more additional
More informationThe Nios II Family of Configurable Soft-core Processors
The Nios II Family of Configurable Soft-core Processors James Ball August 16, 2005 2005 Altera Corporation Agenda Nios II Introduction Configuring your CPU FPGA vs. ASIC CPU Design Instruction Set Architecture
More informationUse of Embedded FPGA Resources in Implementations of Five Round Three SHA-3 Candidates
Use of Embedded FPGA Resources in Implementations of Five Round Three SHA-3 Candidates Malik Umar Sharif, Rabia Shahid, Marcin Rogawski, Kris Gaj Abstract In this paper, we present results of the comprehensive
More informationCOPACOBANA: RECONFIGURABLE COMPUTING IN CRYPTANALYSIS. Ben Johnstone
COPACOBANA: RECONFIGURABLE COMPUTING IN CRYPTANALYSIS Ben Johnstone Overview Goals Architecture DES Performance Conclusion What is COPACOBANA? Cost Optimized Parallel Code Breaker History Developed at
More informationAn Optimized Hardware Architecture for the Montgomery Multiplication Algorithm
An Optimized Hardware Architecture for the Montgomery Multiplication Algorithm Miaoqing Huang 1, Kris Gaj 2, Soonhak Kwon 3, Tarek El-Ghazawi 1 1 The George Washington University, Washington, D.C., U.S.A.
More informationAlgorithms and arithmetic for the implementation of cryptographic pairings
Cairn seminar November 29th, 2013 Algorithms and arithmetic for the implementation of cryptographic pairings Nicolas Estibals CAIRN project-team, IRISA Nicolas.Estibals@irisa.fr What is an elliptic curve?
More information! Program logic functions, interconnect using SRAM. ! Advantages: ! Re-programmable; ! dynamically reconfigurable; ! uses standard processes.
Topics! SRAM-based FPGA fabrics:! Xilinx.! Altera. SRAM-based FPGAs! Program logic functions, using SRAM.! Advantages:! Re-programmable;! dynamically reconfigurable;! uses standard processes.! isadvantages:!
More informationToday. Comments about assignment Max 1/T (skew = 0) Max clock skew? Comments about assignment 3 ASICs and Programmable logic Others courses
Today Comments about assignment 3-43 Comments about assignment 3 ASICs and Programmable logic Others courses octor Per should show up in the end of the lecture Mealy machines can not be coded in a single
More informationHigh-performance Elliptic Curve Cryptography by Using the CIOS Method for Modular Multiplication
High-performance Elliptic Curve Cryptography by Using the CIOS Method for Modular Multiplication Amine Mrabet, Nadia El-Mrabet, Ronan Lashermes, Jean-Baptiste Rigaud, Belgacem Bouallegue, Sihem Mesnager
More informationUnderstanding Cryptography by Christof Paar and Jan Pelzl. Chapter 9 Elliptic Curve Cryptography
Understanding Cryptography by Christof Paar and Jan Pelzl www.crypto-textbook.com Chapter 9 Elliptic Curve Cryptography ver. February 2nd, 2015 These slides were prepared by Tim Güneysu, Christof Paar
More informationHIGH PERFORMANCE ELLIPTIC CURVE CRYPTO-PROCESSOR FOR FPGA PLATFORMS
HIGH PERFORMANCE ELLIPTIC CURVE CRYPTO-PROCESSOR FOR FPGA PLATFORMS Debdeep Mukhopadhyay Dept. of Computer Science and Engg. IIT Kharagpur 3/6/2010 NTT Labs, Japan 1 Outline Elliptic Curve Cryptography
More informationParallel FIR Filters. Chapter 5
Chapter 5 Parallel FIR Filters This chapter describes the implementation of high-performance, parallel, full-precision FIR filters using the DSP48 slice in a Virtex-4 device. ecause the Virtex-4 architecture
More informationEITF35: Introduction to Structured VLSI Design
EITF35: Introduction to Structured VLSI Design Introduction to FPGA design Rakesh Gangarajaiah Rakesh.gangarajaiah@eit.lth.se Slides from Chenxin Zhang and Steffan Malkowsky WWW.FPGA What is FPGA? Field
More informationCollision Search for Elliptic Curve Discrete Logarithm over GF(2 m ) with FPGA
Collision Search for Elliptic Curve Discrete Logarithm over GF(2 m ) with FPGA Workshop on Cryptographic Hardware and Embedded Systems (CHES 2007) September 2007 Guerric Meurice de Dormale*, Philippe Bulens,
More informationImplementation of the rho, p-1 & the Elliptic Curve Methods of Factoring in Reconfigurable Hardware
Implementation of the rho, p-1 & the Elliptic Curve Methods of Factoring in Reconfigurable Hardware Kris Gaj Patrick Baier Soonhak Kwon Hoang Le Ramakrishna Bachimanchi Khaleeluddin Mohammed Paul Kohlbrenner
More informationUse of Embedded FPGA Resources in Implementa:ons of 14 Round 2 SHA- 3 Candidates
Use of Embedded FPGA Resources in Implementa:ons of 14 Round 2 SHA- 3 Candidates Kris Gaj, Rabia Shahid, Malik Umar Sharif, and Marcin Rogawski George Mason University U.S.A. Co-Authors Rabia Shahid Malik
More informationHigh Capacity and High Performance 20nm FPGAs. Steve Young, Dinesh Gaitonde August Copyright 2014 Xilinx
High Capacity and High Performance 20nm FPGAs Steve Young, Dinesh Gaitonde August 2014 Not a Complete Product Overview Page 2 Outline Page 3 Petabytes per month Increasing Bandwidth Global IP Traffic Growth
More informationH100 Series FPGA Application Accelerators
2 H100 Series FPGA Application Accelerators Products in the H100 Series PCI-X Mainstream IBM EBlade H101-PCIXM» HPC solution for optimal price/performance» PCI-X form factor» Single Xilinx Virtex 4 FPGA
More informationTSEA44 - Design for FPGAs
2015-11-24 Now for something else... Adapting designs to FPGAs Why? Clock frequency Area Power Target FPGA architecture: Xilinx FPGAs with 4 input LUTs (such as Virtex-II) Determining the maximum frequency
More informationOutline. Field Programmable Gate Arrays. Programming Technologies Architectures. Programming Interfaces. Historical perspective
Outline Field Programmable Gate Arrays Historical perspective Programming Technologies Architectures PALs, PLDs,, and CPLDs FPGAs Programmable logic Interconnect network I/O buffers Specialized cores Programming
More informationReconfigurable Hardware Implementation of Mesh Routing in the Number Field Sieve Factorization
Reconfigurable Hardware Implementation of Mesh Routing in the Number Field Sieve Factorization Sashisu Bajracharya, Deapesh Misra, Kris Gaj George Mason University Tarek El-Ghazawi The George Washington
More informationMaster s Thesis Presentation Hoang Le Director: Dr. Kris Gaj
Master s Thesis Presentation Hoang Le Director: Dr. Kris Gaj Outline RSA ECM Reconfigurable Computing Platforms, Languages and Programming Environments Partitioning t ECM Code between HDLs and HLLs Implementation
More informationUser Manual for FC100
Sundance Multiprocessor Technology Limited User Manual Form : QCF42 Date : 6 July 2006 Unit / Module Description: IEEE-754 Floating-point FPGA IP Core Unit / Module Number: FC100 Document Issue Number:
More informationFPGA for Complex System Implementation. National Chiao Tung University Chun-Jen Tsai 04/14/2011
FPGA for Complex System Implementation National Chiao Tung University Chun-Jen Tsai 04/14/2011 About FPGA FPGA was invented by Ross Freeman in 1989 SRAM-based FPGA properties Standard parts Allowing multi-level
More informationVirtex-II Architecture. Virtex II technical, Design Solutions. Active Interconnect Technology (continued)
Virtex-II Architecture SONET / SDH Virtex II technical, Design Solutions PCI-X PCI DCM Distri RAM 18Kb BRAM Multiplier LVDS FIFO Shift Registers BLVDS SDRAM QDR SRAM Backplane Rev 4 March 4th. 2002 J-L
More informationThe DSP Primer 8. FPGA Technology. DSPprimer Home. DSPprimer Notes. August 2005, University of Strathclyde, Scotland, UK
The DSP Primer 8 FPGA Technology Return DSPprimer Home Return DSPprimer Notes August 2005, University of Strathclyde, Scotland, UK For Academic Use Only THIS SLIDE IS BLANK August 2005, For Academic Use
More informationINTRODUCTION TO FIELD PROGRAMMABLE GATE ARRAYS (FPGAS)
INTRODUCTION TO FIELD PROGRAMMABLE GATE ARRAYS (FPGAS) Bill Jason P. Tomas Dept. of Electrical and Computer Engineering University of Nevada Las Vegas FIELD PROGRAMMABLE ARRAYS Dominant digital design
More informationSimplify System Complexity
1 2 Simplify System Complexity With the new high-performance CompactRIO controller Arun Veeramani Senior Program Manager National Instruments NI CompactRIO The Worlds Only Software Designed Controller
More informationSystem Verification of Hardware Optimization Based on Edge Detection
Circuits and Systems, 2013, 4, 293-298 http://dx.doi.org/10.4236/cs.2013.43040 Published Online July 2013 (http://www.scirp.org/journal/cs) System Verification of Hardware Optimization Based on Edge Detection
More informationHigh-Performance Modular Multiplication on the Cell Broadband Engine
High-Performance Modular Multiplication on the Cell Broadband Engine Joppe W. Bos Laboratory for Cryptologic Algorithms EPFL, Lausanne, Switzerland joppe.bos@epfl.ch 1 / 21 Outline Motivation and previous
More informationAn Optimized Hardware Architecture for the Montgomery Multiplication Algorithm
An Optimized Hardware Architecture for the Montgomery Multiplication Algorithm Miaoqing Huang 1, Kris Gaj 2, Soonhak Kwon 3, and Tarek El-Ghazawi 1 1 The George Washington University, Washington, DC 20052,
More informationAchieving Breakthrough Performance with Virtex-4, the World s Fastest FPGA
Achieving Breakthrough Performance with Virtex-4, the World s Fastest FPGA Xilinx 90nm Design Seminar Series: Part I Xilinx - #1 in 90 nm We Asked our Customers: What are your challenges? Shorter design
More informationA High-Speed FPGA Implementation of an RSD- Based ECC Processor
A High-Speed FPGA Implementation of an RSD- Based ECC Processor Abstract: In this paper, an exportable application-specific instruction-set elliptic curve cryptography processor based on redundant signed
More informationDSP Resources. Main features: 1 adder-subtractor, 1 multiplier, 1 add/sub/logic ALU, 1 comparator, several pipeline stages
DSP Resources Specialized FPGA columns for complex arithmetic functionality DSP48 Tile: two DSP48 slices, interconnect Each DSP48 is a self-contained arithmeticlogical unit with add/sub/multiply/logic
More informationCore Facts. Documentation Design File Formats. Verification Instantiation Templates Reference Designs & Application Notes Additional Items
(FFT_PIPE) Product Specification Dillon Engineering, Inc. 4974 Lincoln Drive Edina, MN USA, 55436 Phone: 952.836.2413 Fax: 952.927.6514 E mail: info@dilloneng.com URL: www.dilloneng.com Core Facts Documentation
More informationSoCWire: a SpaceWire inspired fault tolerant Network on Chip approach for reconfigurable System-on-Chip in Space applications
SoCWire: a SpaceWire inspired fault tolerant Network on Chip approach for reconfigurable System-on-Chip in Space applications Björn Osterloh Institute of Computer and Network Engineering TU Braunschweig,
More informationPricing of Derivatives by Fast, Hardware-Based Monte-Carlo Simulation
Pricing of Derivatives by Fast, Hardware-Based Monte-Carlo Simulation Prof. Dr. Joachim K. Anlauf Universität Bonn Institut für Informatik II Technische Informatik Römerstr. 164 53117 Bonn E-Mail: anlauf@informatik.uni-bonn.de
More informationIntroduction to Field Programmable Gate Arrays
Introduction to Field Programmable Gate Arrays Lecture 1/3 CERN Accelerator School on Digital Signal Processing Sigtuna, Sweden, 31 May 9 June 2007 Javier Serrano, CERN AB-CO-HT Outline Historical introduction.
More informationBackground on Bloom Filter
CSE 535 : Lecture 5 String Matching with Bloom Filters Washington University Fall 23 http://www.arl.wustl.edu/arl/projects/fpx/cse535/ Copyright 23, Sarang Dharmapurikar [Guest Lecture] CSE 535 : Fall
More informationDINI Group. FPGA-based Cluster computing with Spartan-6. Mike Dini Sept 2010
DINI Group FPGA-based Cluster computing with Spartan-6 Mike Dini mdini@dinigroup.com www.dinigroup.com Sept 2010 1 The DINI Group We make big FPGA boards Xilinx, Altera 2 The DINI Group 15 employees in
More informationECE 645: Lecture 1. Basic Adders and Counters. Implementation of Adders in FPGAs
ECE 645: Lecture Basic Adders and Counters Implementation of Adders in FPGAs Required Reading Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Design Chapter 5, Basic Addition and Counting,
More informationZynq-7000 All Programmable SoC Product Overview
Zynq-7000 All Programmable SoC Product Overview The SW, HW and IO Programmable Platform August 2012 Copyright 2012 2009 Xilinx Introducing the Zynq -7000 All Programmable SoC Breakthrough Processing Platform
More informationHardware Architectures
Hardware Architectures Secret-key Cryptography Public-key Cryptography Cryptanalysis AES & AES candidates estream candidates Hash Functions SHA-3 Montgomery Multipliers ECC cryptosystems Pairing-based
More informationSignal Processing Algorithms into Fixed Point FPGA Hardware Dennis Silage ECE Temple University
Signal Processing Algorithms into Fixed Point FPGA Hardware Dennis Silage silage@temple.edu ECE Temple University www.temple.edu/scdl Signal Processing Algorithms into Fixed Point FPGA Hardware Motivation
More informationEE 8217 *Reconfigurable Computing Systems Engineering* Sample of Final Examination
1 Student name: Date: June 26, 2008 General requirements for the exam: 1. This is CLOSED BOOK examination; 2. No questions allowed within the examination period; 3. If something is not clear in question
More informationL2: FPGA HARDWARE : ADVANCED DIGITAL DESIGN PROJECT FALL 2015 BRANDON LUCIA
L2: FPGA HARDWARE 18-545: ADVANCED DIGITAL DESIGN PROJECT FALL 2015 BRANDON LUCIA 18-545: FALL 2014 2 Admin stuff Project Proposals happen on Monday Be prepared to give an in-class presentation Lab 1 is
More informationEE178 Lecture Module 2. Eric Crabill SJSU / Xilinx Fall 2007
EE178 Lecture Module 2 Eric Crabill SJSU / Xilinx Fall 2007 Lecture #4 Agenda Survey of implementation technologies. Implementation Technologies Small scale and medium scale integration. Up to about 200
More informationPOST-SIEVING ON GPUs
POST-SIEVING ON GPUs Andrea Miele 1, Joppe W Bos 2, Thorsten Kleinjung 1, Arjen K Lenstra 1 1 LACAL, EPFL, Lausanne, Switzerland 2 NXP Semiconductors, Leuven, Belgium 1/18 NUMBER FIELD SIEVE (NFS) Asymptotically
More informationSystem-on Solution from Altera and Xilinx
System-on on-a-programmable-chip Solution from Altera and Xilinx Xun Yang VLSI CAD Lab, Computer Science Department, UCLA FPGAs with Embedded Microprocessors Combination of embedded processors and programmable
More informationHardware/Software Co-Design of Elliptic Curve Cryptography on an 8051 Microcontroller
Hardware/Software Co-Design of Elliptic Curve Cryptography on an 8051 Microcontroller Manuel Koschuch, Joachim Lechner, Andreas Weitzer, Johann Großschädl, Alexander Szekely, Stefan Tillich, and Johannes
More informationThe Virtex FPGA and Introduction to design techniques
The Virtex FPGA and Introduction to design techniques SM098 Computation Structures Lecture 6 Simple Programmable Logic evices Programmable Array Logic (PAL) AN-OR arrays are common blocks in SPL and CPL
More informationNew Integer-FFT Multiplication Architectures and Implementations for Accelerating Fully Homomorphic Encryption
New Integer-FFT Multiplication Architectures and Implementations for Accelerating Fully Homomorphic Encryption Xiaolin Cao, Ciara Moore CSIT, ECIT, Queen s University Belfast, Belfast, Northern Ireland,
More informationSurvey of Codebreaking Machines. Swathi Guruduth Vivekanand Kamanuri Harshad Patil
Survey of Codebreaking Machines Swathi Guruduth Vivekanand Kamanuri Harshad Patil Contents Introduction Motivation Goal Machines considered Comparison based on technology used Brief description of machines
More informationSimplify System Complexity
Simplify System Complexity With the new high-performance CompactRIO controller Fanie Coetzer Field Sales Engineer Northern South Africa 2 3 New control system CompactPCI MMI/Sequencing/Logging FieldPoint
More informationParallelized Radix-4 Scalable Montgomery Multipliers
Parallelized Radix-4 Scalable Montgomery Multipliers Nathaniel Pinckney and David Money Harris 1 1 Harvey Mudd College, 301 Platt. Blvd., Claremont, CA, USA e-mail: npinckney@hmc.edu ABSTRACT This paper
More informationRUN-TIME RECONFIGURABLE IMPLEMENTATION OF DSP ALGORITHMS USING DISTRIBUTED ARITHMETIC. Zoltan Baruch
RUN-TIME RECONFIGURABLE IMPLEMENTATION OF DSP ALGORITHMS USING DISTRIBUTED ARITHMETIC Zoltan Baruch Computer Science Department, Technical University of Cluj-Napoca, 26-28, Bariţiu St., 3400 Cluj-Napoca,
More informationHiding Higher-Order Leakages in Hardware
Hiding Higher-Order Leakages in Hardware 21. May 2015 Ruhr-Universität Bochum Acknowledgement Pascal Sasdrich Tobias Schneider Alexander Wild 2 Story? Threshold Implementation should be explained? 1 st
More informationCopyright 2016 Xilinx
Zynq Architecture Zynq Vivado 2015.4 Version This material exempt per Department of Commerce license exception TSU Objectives After completing this module, you will be able to: Identify the basic building
More informationIntroduction to FPGA Design with Vivado High-Level Synthesis. UG998 (v1.0) July 2, 2013
Introduction to FPGA Design with Vivado High-Level Synthesis Notice of Disclaimer The information disclosed to you hereunder (the Materials ) is provided solely for the selection and use of Xilinx products.
More informationDSP using Labview FPGA. T.J.Moir AUT University School of Engineering Auckland New-Zealand
DSP using Labview FPGA T.J.Moir AUT University School of Engineering Auckland New-Zealand Limitations of a basic processor Despite all of the advancements we ve made in the world of processors, they still
More informationEECS150 - Digital Design Lecture 16 - Memory
EECS150 - Digital Design Lecture 16 - Memory October 17, 2002 John Wawrzynek Fall 2002 EECS150 - Lec16-mem1 Page 1 Memory Basics Uses: data & program storage general purpose registers buffering table lookups
More informationFIELD PROGRAMMABLE GATE ARRAYS (FPGAS)
FIELD PROGRAMMABLE GATE ARRAYS (FPGAS) 1 Roth Text: Chapter 3 (section 3.4) Chapter 6 Nelson Text: Chapter 11 Programmable logic taxonomy Lab Device 2 Field Programmable Gate Arrays Typical Complexity
More informationField Programmable Gate Array (FPGA)
Field Programmable Gate Array (FPGA) Lecturer: Krébesz, Tamas 1 FPGA in general Reprogrammable Si chip Invented in 1985 by Ross Freeman (Xilinx inc.) Combines the advantages of ASIC and uc-based systems
More informationQsys and IP Core Integration
Qsys and IP Core Integration Stephen A. Edwards (after David Lariviere) Columbia University Spring 2016 IP Cores Altera s IP Core Integration Tools Connecting IP Cores IP Cores Cyclone V SoC: A Mix of
More informationUsing an RSA Accelerator for Modular Inversion
Using an RSA Accelerator for Modular Inversion by Martin Seysen CHES 2005 Coprocessors on Smart Cards Coprocessors on smart cards have been designed to speed up RSA Examples: Infineon SLE66 ACE Hitachi/Renesas
More informationProgrammable Logic. Simple Programmable Logic Devices
Programmable Logic SM098 Computation Structures - Programmable Logic Simple Programmable Logic evices Programmable Array Logic (PAL) AN-OR arrays are common blocks in SPL and CPL architectures Implements
More informationZynq AP SoC Family
Programmable Logic (PL) Processing System (PS) Zynq -7000 AP SoC Family Cost-Optimized Devices Mid-Range Devices Device Name Z-7007S Z-7012S Z-7014S Z-7010 Z-7015 Z-7020 Z-7030 Z-7035 Z-7045 Z-7100 Part
More informationA Implementing Curve25519 for Side-Channel-Protected Elliptic Curve Cryptography
A Implementing Curve25519 for Side-Channel-Protected Elliptic Curve Cryptography PASCAL SASDRICH, Horst Görtz Institute for IT-Security, Ruhr-Universität Bochum, Germany TIM GÜNEYSU, Horst Görtz Institute
More informationTiny Tate Bilinear Pairing Core Specification. Author: Homer Hsing
Tiny Tate Bilinear Pairing Core Specification Author: Homer Hsing homer.hsing@gmail.com Rev. 0.1 May 3, 2012 This page has been intentionally left blank. www.opencores.org Rev 0.1 ii Rev. Date Author Description
More informationVendor Agnostic, High Performance, Double Precision Floating Point Division for FPGAs
Vendor Agnostic, High Performance, Double Precision Floating Point Division for FPGAs Presented by Xin Fang Advisor: Professor Miriam Leeser ECE Department Northeastern University 1 Outline Background
More informationMy 2 hours today: 1. Efficient arithmetic in finite fields minute break 3. Elliptic curves. My 2 hours tomorrow:
My 2 hours today: 1. Efficient arithmetic in finite fields 2. 10-minute break 3. Elliptic curves My 2 hours tomorrow: 4. Efficient arithmetic on elliptic curves 5. 10-minute break 6. Choosing curves Efficient
More informationVHDL-MODELING OF A GAS LASER S GAS DISCHARGE CIRCUIT Nataliya Golian, Vera Golian, Olga Kalynychenko
136 VHDL-MODELING OF A GAS LASER S GAS DISCHARGE CIRCUIT Nataliya Golian, Vera Golian, Olga Kalynychenko Abstract: Usage of modeling for construction of laser installations today is actual in connection
More informationNew Software-Designed Instruments
1 New Software-Designed Instruments Nicholas Haripersad Field Applications Engineer National Instruments South Africa Agenda What Is a Software-Designed Instrument? Why Software-Designed Instrumentation?
More informationIntroduction to Modern FPGAs
Introduction to Modern FPGAs Arturo Díaz Pérez Centro de Investigación y de Estudios Avanzados del IPN Departamento de Ingeniería Eléctrica Sección de Computación adiaz@cs.cinvestav.mx Outline Technology
More informationLecture 7: Introduction to Co-synthesis Algorithms
Design & Co-design of Embedded Systems Lecture 7: Introduction to Co-synthesis Algorithms Sharif University of Technology Computer Engineering Dept. Winter-Spring 2008 Mehdi Modarressi Topics for today
More informationVendor Agnostic, High Performance, Double Precision Floating Point Division for FPGAs
Vendor Agnostic, High Performance, Double Precision Floating Point Division for FPGAs Xin Fang and Miriam Leeser Dept of Electrical and Computer Eng Northeastern University Boston, Massachusetts 02115
More informationScalable Montgomery Multiplication Algorithm
1 Scalable Montgomery Multiplication Algorithm Brock J. Prince Department of Electrical & Computer Engineering, Oregon State University, Corvallis, Oregon 97331 E-mail: princebr@engr.orst.edu May 29, 2002
More informationCOPROCESSOR APPROACH TO ACCELERATING MULTIMEDIA APPLICATION [CLAUDIO BRUNELLI, JARI NURMI ] Processor Design
COPROCESSOR APPROACH TO ACCELERATING MULTIMEDIA APPLICATION [CLAUDIO BRUNELLI, JARI NURMI ] Processor Design Lecture Objectives Background Need for Accelerator Accelerators and different type of parallelizm
More informationInternational Training Workshop on FPGA Design for Scientific Instrumentation and Computing November 2013.
2499-1 International Training Workshop on FPGA Design for Scientific Instrumentation and Computing 11-22 November 2013 FPGA Introduction Cristian SISTERNA National University of San Juan San Juan Argentina
More informationOutline of Presentation Field Programmable Gate Arrays (FPGAs(
FPGA Architectures and Operation for Tolerating SEUs Chuck Stroud Electrical and Computer Engineering Auburn University Outline of Presentation Field Programmable Gate Arrays (FPGAs( FPGAs) How Programmable
More informationTABLE OF CONTENTS 1. INTRODUCTION 1.1. PREFACE KEY FEATURES PERFORMANCE LIST BLOCK DIAGRAM...
Table of Contents TABLE OF CONTENTS 1. INTRODUCTION 1.1. PREFACE... 1-1 1.2. KEY FEATURES... 1-1 1.3. PERFORMANCE LIST... 1-3 1.4. BLOCK DIAGRAM... 1-4 1.5. INTRODUCE THE PCI - BUS... 1-5 1.6. FEATURES...
More informationAttacking Code-Based Cryptosystems with Information Set Decoding using Special-Purpose Hardware
Attacking Code-Based Cryptosystems with Information Set Decoding using Special-Purpose Hardware Stefan Heyse, Ralf Zimmermann, and Christof Paar Horst Görtz Institute for IT-Security (HGI) Ruhr-University
More informationA Scalable Architecture for Montgomery Multiplication
A Scalable Architecture for Montgomery Multiplication Alexandre F. Tenca and Çetin K. Koç Electrical & Computer Engineering Oregon State University, Corvallis, Oregon 97331 {tenca,koc}@ece.orst.edu Abstract.
More informationProgrammable Logic. Any other approaches?
Programmable Logic So far, have only talked about PALs (see 22V10 figure next page). What is the next step in the evolution of PLDs? More gates! How do we get more gates? We could put several PALs on one
More informationE-Passport: Cracking Basic Access Control Keys with COPACOBANA
E-Passport: Cracking Basic Access Control Keys with COPACOBANA Yifei Liu, Timo Kasper, Kerstin Lemke-Rust and Christof Paar Communication Security Group Ruhr University Bochum, Germany http://www.crypto.rub.de
More informationVHX - Xilinx - FPGA Programming in VHDL
Training Xilinx - FPGA Programming in VHDL: This course explains how to design with VHDL on Xilinx FPGAs using ISE Design Suite - Programming: Logique Programmable VHX - Xilinx - FPGA Programming in VHDL
More informationHigh Speed Systolic Montgomery Modular Multipliers for RSA Cryptosystems
High Speed Systolic Montgomery Modular Multipliers for RSA Cryptosystems RAVI KUMAR SATZODA, CHIP-HONG CHANG and CHING-CHUEN JONG Centre for High Performance Embedded Systems Nanyang Technological University
More informationVirtex-4 Family Overview
Virtex-4 User Guide 0 Virtex-4 Family Overview DS112 (v1.1) September 10, 2004 0 0 General Description The Virtex-4 Family is the newest generation FPGA from Xilinx. The innovative Advanced Silicon Modular
More information