Are Your Passwords Safe: Energy-Efficient Bcrypt Cracking with Low-Cost Parallel Hardware
|
|
- Albert Smith
- 6 years ago
- Views:
Transcription
1 Are Your Passwords Safe: Energy-Efficient Bcrypt Cracking with Low-Cost Parallel Hardware Katja Malvoni (kmalvoni at openwall.com) Solar Designer (solar at openwall.com) Josip Knezovic (josip.knezovic at fer.hr) KatjaAre Malvoni, Your Passwords Solar Designer, Safe: Energy-Efficient Josip Knezovic Bcrypt Cracking with Low-Cost Parallel Hardware August 19, / 21
2 Motivation Bcrypt is: Slow Sequential Designed to be resistant to brute force attacks and to remain secure despite hardware improvements You could almost think why even bother optimizing KatjaAre Malvoni, Your Passwords Solar Designer, Safe: Energy-Efficient Josip Knezovic Bcrypt Cracking with Low-Cost Parallel Hardware August 19, / 21
3 But KatjaAre Malvoni, Solar Designer, Josip Knezovic Bcrypt Cracking with Low-Cost Parallel Hardware Your Passwords Safe: Energy-Efficient August 19, / 21
4 Outline 1 Bcrypt 2 Implementations Parallella/Epiphany ZedBoard and ZC706 3 Experimental Results 4 Future work 5 Takeaways KatjaAre Malvoni, Your Passwords Solar Designer, Safe: Energy-Efficient Josip Knezovic Bcrypt Cracking with Low-Cost Parallel Hardware August 19, / 21
5 Bcrypt Bcrypt Based on Blowfish block cipher Expensive key setup User defined cost setting Pseudorandom memory accesses Blowfish. Photo source: KatjaAre Malvoni, Your Passwords Solar Designer, Safe: Energy-Efficient Josip Knezovic Bcrypt Cracking with Low-Cost Parallel Hardware August 19, / 21
6 Implementations Parallella/Epiphany Architecture Epiphany 16/64 32-bit RISC cores operating at up to 1 GHz/800 MHz Energy-efficient - 2 W maximum chip power consumption 32 KB of local memory per core FPU can be switched to integer mode KatjaAre Malvoni, Your Passwords Solar Designer, Safe: Energy-Efficient Josip Knezovic Bcrypt Cracking with Low-Cost Parallel Hardware August 19, / 21
7 Implementations Parallella/Epiphany Implementation Epiphany John the Ripper prepares data on ARM cores Bcrypt hashes computed on Epiphany Optimized in assembly Each Epiphany core computes two bcrypt hashes Computation overlapped to exploit dual-issue architecture Integer ALU FPU in integer mode KatjaAre Malvoni, Your Passwords Solar Designer, Safe: Energy-Efficient Josip Knezovic Bcrypt Cracking with Low-Cost Parallel Hardware August 19, / 21
8 Implementations ZedBoard and ZC706 Architecture Zynq 7020 and Zynq 7045 Heterogeneous device Dual ARM Cortex-A9 MPCore Advanced low power 28nm programmable logic Zynq times bigger than Zynq 7020 AXI buses used for CPU-FPGA communication KatjaAre Malvoni, Your Passwords Solar Designer, Safe: Energy-Efficient Josip Knezovic Bcrypt Cracking with Low-Cost Parallel Hardware August 19, / 21
9 Implementations ZedBoard and ZC706 Implementation Zynq 7020 and Zynq 7045 John the Ripper prepares data on ARM cores Bcrypt instances compute hash(es) KatjaAre Malvoni, Your Passwords Solar Designer, Safe: Energy-Efficient Josip Knezovic Bcrypt Cracking with Low-Cost Parallel Hardware August 19, / 21
10 Implementations ZedBoard and ZC706 Implementation Number of concurrent instances Number of concurrent instances limited by available BRAM Overlapping multiple bcrypt instances in one module 56, 70 or 112 instances on Zynq 7020 with 140 BRAMs or many more on Zynq 7045 with 545 BRAMs Large communication overhead for low bcrypt cost setting Hardware defects of boards limit optimizations KatjaAre Malvoni, Your Passwords Solar Designer, Safe: Energy-Efficient Josip Knezovic Bcrypt Cracking with Low-Cost Parallel Hardware August 19, / 21
11 Experimental Results Epiphany vs x86 6,000 Performance (c/s) Energy-efficiency (c/s/w) 5,000 4,812 4,876 4,000 3,000 2,000 2,400 1,000 1,207 1, Epiphany 16 T7200 Epiphany 64 i7 2600K KatjaAre Malvoni, Your Passwords Solar Designer, Safe: Energy-Efficient Josip Knezovic Bcrypt Cracking with Low-Cost Parallel Hardware August 19, / 21
12 Experimental Results Performance and efficiency comparison 25,000 Performance (c/s) Energy-efficiency (c/s/w) 20,000 20,583 15,000 10,000 5, ,044 4,571 4,812 3,522 4,116 4,556 2,285 2,400 1, ,246 6,596 5, Epiphany 16 Zynq 7020 Epiphany 64 Zynq 7020 Zynq 7045 HD 7970 FX 8120 Xeon Phi 5110P i7 4770K (emulated with 7045) KatjaAre Malvoni, Your Passwords Solar Designer, Safe: Energy-Efficient Josip Knezovic Bcrypt Cracking with Low-Cost Parallel Hardware August 19, / 21
13 Experimental Results Cost comparison System price (c/s/$) Chip or device price (c/s/$) $99 $75 $199 - $395 $119 $2495 $ $549 $505 $205 - $2649 $650 $350 Epiphany 16 Epiphany 64 Zynq 7020 Zynq 7045 HD 7970 FX 8120 Xeon Phi 5110P i7 4770K KatjaAre Malvoni, Your Passwords Solar Designer, Safe: Energy-Efficient Josip Knezovic Bcrypt Cracking with Low-Cost Parallel Hardware August 19, / 21
14 Experimental Results Derived performance from cost 12 35,000 Performance for cost 12 (c/s) Theoretical performance for cost 5 (c/s) Measured performance for cost 5 (c/s) 30,000 28,462 25,000 20,000 20,538 15,000 10,000 5, ,112 6,313 6,246 6,753 6,596 5,347 4,571 4,556 5,408 4,490 1,207 1, Epiphany 16 Zynq 7020 Zynq 7045 HD 7970 FX 8120 Xeon Phi 5110P i7 4770K KatjaAre Malvoni, Your Passwords Solar Designer, Safe: Energy-Efficient Josip Knezovic Bcrypt Cracking with Low-Cost Parallel Hardware August 19, / 21
15 Experimental Results Theoretical Peak Performance Analysis Theory c/s = N ports f (2 cost ) N reads 16 N ports - number of available read ports to local memory or L1 cache N reads - number of reads per Blowfish round 2 cost number of Blowfish block encryptions in bcrypt hash computation f (in Hz) - clock rate EksBlowfish Ekspensive key schedule Blowfish bcrypt(cost, salt, pwd) Bcrypt (1) 1: state InitState() 2: state ExpandKey(state, salt, key) 3: repeat(2 cost ) 4: state ExpandKey(state, 0, salt) 5: state ExpandKey(state, 0, key) 6: ctext OrpheanBeholderScryDoubt 7: repeat(64) 8: ctext EncryptECB(state, ctext) 9: return Concatenate(cost, salt, ctext) Katja Malvoni and Solar Designer Energy-efficient bcrypt cracking atjaare Malvoni, Your Passwords Solar Designer, Safe: Energy-Efficient Josip Knezovic Bcrypt Cracking with Low-Cost Parallel Hardware August 19, / 21
16 Experimental Results Theoretical Peak Performance Analysis Comparison 35,000 30,000 Theoretical estimate for cost 5 (c/s) Measured performance for cost 5 (c/s) 29,179 25,000 20,000 17,989 20,538 15,000 10,000 5, ,494 11,093 7,496 4,497 4,571 4,812 4,876 6,595 1,207 Epiphany 16 Zynq 7020 Epiphany 64 Zynq 7045 i7 2600K i7 4770K KatjaAre Malvoni, Your Passwords Solar Designer, Safe: Energy-Efficient Josip Knezovic Bcrypt Cracking with Low-Cost Parallel Hardware August 19, / 21
17 Experimental Results Related work F.Wiemer, R. Zimmermann. Speed and Area-Optimized Password Search of bcrypt on FPGAs bcrypt running on ZedBoard at 80 MHz 40 parallel instances 5208 c/s at cost 5, 41.6 c/s at cost 12 Yuri Gonzaga, Google Summer of Code 2011 KatjaAre Malvoni, Your Passwords Solar Designer, Safe: Energy-Efficient Josip Knezovic Bcrypt Cracking with Low-Cost Parallel Hardware August 19, / 21
18 Future work Future work Parallella/Epiphany Using both Epiphany and Zynq 7020 at once Possible to integrate up to 64 chips on a single board Scalability of current implementation is promising 64 * 64 = 4096 cores with theoretical performance of c/s FPGA Zynq 7020 and 7045 optimizations Improve clock rate Reduce communication overhead Targeting bigger FPGAs Targeting multi-fpga boards KatjaAre Malvoni, Your Passwords Solar Designer, Safe: Energy-Efficient Josip Knezovic Bcrypt Cracking with Low-Cost Parallel Hardware August 19, / 21
19 Takeaways Takeaways Many-core low power RISC platforms and FPGAs are capable of exploiting bcrypt peculiarities to achieve comparable performance and higher energy-efficiency Higher energy-efficiency enables higher density More chips per board, more boards per system It doesn t take ASICs to improve bcrypt cracking energy-efficiency by a factor of 45+ Although ASICs would do better yet KatjaAre Malvoni, Your Passwords Solar Designer, Safe: Energy-Efficient Josip Knezovic Bcrypt Cracking with Low-Cost Parallel Hardware August 19, / 21
20 Thanks Thanks Sayantan Datta Steve Thomas Parallella project Google Summer of Code Xilinx Faculty of Electrical Engineering and Computing, University of Zagreb KatjaAre Malvoni, Your Passwords Solar Designer, Safe: Energy-Efficient Josip Knezovic Bcrypt Cracking with Low-Cost Parallel Hardware August 19, / 21
21 Questions Questions? KatjaAre Malvoni, Your Passwords Solar Designer, Safe: Energy-Efficient Josip Knezovic Bcrypt Cracking with Low-Cost Parallel Hardware August 19, / 21
Survey of Codebreaking Machines. Swathi Guruduth Vivekanand Kamanuri Harshad Patil
Survey of Codebreaking Machines Swathi Guruduth Vivekanand Kamanuri Harshad Patil Contents Introduction Motivation Goal Machines considered Comparison based on technology used Brief description of machines
More informationA Closer Look at the Epiphany IV 28nm 64 core Coprocessor. Andreas Olofsson PEGPUM 2013
A Closer Look at the Epiphany IV 28nm 64 core Coprocessor Andreas Olofsson PEGPUM 2013 1 Adapteva Achieves 3 World Firsts 1. First processor company to reach 50 GFLOPS/W 3. First semiconductor company
More informationThe Nios II Family of Configurable Soft-core Processors
The Nios II Family of Configurable Soft-core Processors James Ball August 16, 2005 2005 Altera Corporation Agenda Nios II Introduction Configuring your CPU FPGA vs. ASIC CPU Design Instruction Set Architecture
More informationscrypt: A new key derivation function
Doing our best to thwart TLAs armed with ASICs Colin Percival Tarsnap cperciva@tarsnap.com May 9, 2009 Making bcrypt obsolete Colin Percival Tarsnap cperciva@tarsnap.com May 9, 2009 Are you sure your SSH
More informationChapter 18 - Multicore Computers
Chapter 18 - Multicore Computers Luis Tarrataca luis.tarrataca@gmail.com CEFET-RJ Luis Tarrataca Chapter 18 - Multicore Computers 1 / 28 Table of Contents I 1 2 Where to focus your study Luis Tarrataca
More informationECE 471 Embedded Systems Lecture 2
ECE 471 Embedded Systems Lecture 2 Vince Weaver http://web.eece.maine.edu/~vweaver vincent.weaver@maine.edu 7 September 2018 Announcements Reminder: The class notes are posted to the website. HW#1 will
More informationZynq-7000 All Programmable SoC Product Overview
Zynq-7000 All Programmable SoC Product Overview The SW, HW and IO Programmable Platform August 2012 Copyright 2012 2009 Xilinx Introducing the Zynq -7000 All Programmable SoC Breakthrough Processing Platform
More informationESE532: System-on-a-Chip Architecture. Today. Programmable SoC. Message. Process. Reminder
ESE532: System-on-a-Chip Architecture Day 5: September 18, 2017 Dataflow Process Model Today Dataflow Process Model Motivation Issues Abstraction Basic Approach Dataflow variants Motivations/demands for
More informationOutline 1 Motivation 2 Theory of a non-blocking benchmark 3 The benchmark and results 4 Future work
Using Non-blocking Operations in HPC to Reduce Execution Times David Buettner, Julian Kunkel, Thomas Ludwig Euro PVM/MPI September 8th, 2009 Outline 1 Motivation 2 Theory of a non-blocking benchmark 3
More informationHEAD HardwarE Accelerated Deduplication
HEAD HardwarE Accelerated Deduplication Final Report CS710 Computing Acceleration with FPGA December 9, 2016 Insu Jang Seikwon Kim Seonyoung Lee Executive Summary A-Z development of deduplication SW version
More informationCryptographic Hash Functions. Secure Software Systems
1 Cryptographic Hash Functions 2 Cryptographic Hash Functions Input: Message of arbitrary size Output: Digest (hashed output) of fixed size Loreum ipsum Hash Function 23sdfw83x8mjyacd6 (message of arbitrary
More informationCS 590: High Performance Computing. Parallel Computer Architectures. Lab 1 Starts Today. Already posted on Canvas (under Assignment) Let s look at it
Lab 1 Starts Today Already posted on Canvas (under Assignment) Let s look at it CS 590: High Performance Computing Parallel Computer Architectures Fengguang Song Department of Computer Science IUPUI 1
More informationVersal: AI Engine & Programming Environment
Engineering Director, Xilinx Silicon Architecture Group Versal: Engine & Programming Environment Presented By Ambrose Finnerty Xilinx DSP Technical Marketing Manager October 16, 2018 MEMORY MEMORY MEMORY
More informationFPGA Acceleration of the LFRic Weather and Climate Model in the EuroExa Project Using Vivado HLS
FPGA Acceleration of the LFRic Weather and Climate Model in the EuroExa Project Using Vivado HLS Mike Ashworth, Graham Riley, Andrew Attwood and John Mawer Advanced Processor Technologies Group School
More informationCOPACOBANA: RECONFIGURABLE COMPUTING IN CRYPTANALYSIS. Ben Johnstone
COPACOBANA: RECONFIGURABLE COMPUTING IN CRYPTANALYSIS Ben Johnstone Overview Goals Architecture DES Performance Conclusion What is COPACOBANA? Cost Optimized Parallel Code Breaker History Developed at
More informationDNNBuilder: an Automated Tool for Building High-Performance DNN Hardware Accelerators for FPGAs
IBM Research AI Systems Day DNNBuilder: an Automated Tool for Building High-Performance DNN Hardware Accelerators for FPGAs Xiaofan Zhang 1, Junsong Wang 2, Chao Zhu 2, Yonghua Lin 2, Jinjun Xiong 3, Wen-mei
More informationSide-Channel Countermeasures for Hardware: is There a Light at the End of the Tunnel?
Side-Channel Countermeasures for Hardware: is There a Light at the End of the Tunnel? 11. Sep 2013 Ruhr University Bochum Outline Power Analysis Attack Masking Problems in hardware Possible approaches
More informationLecture 41: Introduction to Reconfigurable Computing
inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture 41: Introduction to Reconfigurable Computing Michael Le, Sp07 Head TA April 30, 2007 Slides Courtesy of Hayden So, Sp06 CS61c Head TA Following
More informationZynq Ultrascale+ Architecture
Zynq Ultrascale+ Architecture Stephanie Soldavini and Andrew Ramsey CMPE-550 Dec 2017 Soldavini, Ramsey (CMPE-550) Zynq Ultrascale+ Architecture Dec 2017 1 / 17 Agenda Heterogeneous Computing Zynq Ultrascale+
More informationA Prototype Storage Subsystem based on PCM
PSS A Prototype Storage Subsystem based on IBM Research Zurich Ioannis Koltsidas, Roman Pletka, Peter Mueller, Thomas Weigold, Evangelos Eleftheriou University of Patras Maria Varsamou, Athina Ntalla,
More informationTen (or so) Small Computers
Ten (or so) Small Computers by Jon "maddog" Hall Executive Director Linux International and President, Project Cauã 1 of 50 Who Am I? Half Electrical Engineer, Half Business, Half Computer Software In
More informationIntroduction to FPGA Design with Vivado High-Level Synthesis. UG998 (v1.0) July 2, 2013
Introduction to FPGA Design with Vivado High-Level Synthesis Notice of Disclaimer The information disclosed to you hereunder (the Materials ) is provided solely for the selection and use of Xilinx products.
More informationEffective Password Hashing
Effective Password Hashing November 18th, 2015 Colin Keigher colin@keigher.ca ~ @afreak ~ https://afreak.ca ~ https://canary.pw Who am I? I am a Senior Security Analyst at a large Canadian company Actively
More informationThere s STILL plenty of room at the bottom! Andreas Olofsson
There s STILL plenty of room at the bottom! Andreas Olofsson 1 Richard Feynman s Lecture (1959) There's Plenty of Room at the Bottom An Invitation to Enter a New Field of Physics Why cannot we write the
More informationNear Memory Key/Value Lookup Acceleration MemSys 2017
Near Key/Value Lookup Acceleration MemSys 2017 October 3, 2017 Scott Lloyd, Maya Gokhale Center for Applied Scientific Computing This work was performed under the auspices of the U.S. Department of Energy
More informationParallelizing Cryptography. Gordon Werner Samantha Kenyon
Parallelizing Cryptography Gordon Werner Samantha Kenyon Outline Security requirements Cryptographic Primitives Block Cipher Parallelization of current Standards AES RSA Elliptic Curve Cryptographic Attacks
More informationDesigning for Performance. Patrick Happ Raul Feitosa
Designing for Performance Patrick Happ Raul Feitosa Objective In this section we examine the most common approach to assessing processor and computer system performance W. Stallings Designing for Performance
More informationCS 31: Intro to Systems Threading & Parallel Applications. Kevin Webb Swarthmore College November 27, 2018
CS 31: Intro to Systems Threading & Parallel Applications Kevin Webb Swarthmore College November 27, 2018 Reading Quiz Making Programs Run Faster We all like how fast computers are In the old days (1980
More informationAn Evaluation of an Energy Efficient Many-Core SoC with Parallelized Face Detection
An Evaluation of an Energy Efficient Many-Core SoC with Parallelized Face Detection Hiroyuki Usui, Jun Tanabe, Toru Sano, Hui Xu, and Takashi Miyamori Toshiba Corporation, Kawasaki, Japan Copyright 2013,
More informationEECS 151/251A Fall 2017 Digital Design and Integrated Circuits. Instructor: John Wawrzynek and Nicholas Weaver. Lecture 14 EE141
EECS 151/251A Fall 2017 Digital Design and Integrated Circuits Instructor: John Wawrzynek and Nicholas Weaver Lecture 14 EE141 Outline Parallelism EE141 2 Parallelism Parallelism is the act of doing more
More informationEmbedded Systems. 7. System Components
Embedded Systems 7. System Components Lothar Thiele 7-1 Contents of Course 1. Embedded Systems Introduction 2. Software Introduction 7. System Components 10. Models 3. Real-Time Models 4. Periodic/Aperiodic
More informationSoC Platforms and CPU Cores
SoC Platforms and CPU Cores COE838: Systems on Chip Design http://www.ee.ryerson.ca/~courses/coe838/ Dr. Gul N. Khan http://www.ee.ryerson.ca/~gnkhan Electrical and Computer Engineering Ryerson University
More informationLecture 4: RISC Computers
Lecture 4: RISC Computers Introduction Program execution features RISC characteristics RISC vs. CICS Zebo Peng, IDA, LiTH 1 Introduction Reduced Instruction Set Computer (RISC) represents an important
More informationLecture 4: RISC Computers
Lecture 4: RISC Computers Introduction Program execution features RISC characteristics RISC vs. CICS Zebo Peng, IDA, LiTH 1 Introduction Reduced Instruction Set Computer (RISC) is an important innovation
More informationMulticore Hardware and Parallelism
Multicore Hardware and Parallelism Minsoo Ryu Department of Computer Science and Engineering 2 1 Advent of Multicore Hardware 2 Multicore Processors 3 Amdahl s Law 4 Parallelism in Hardware 5 Q & A 2 3
More informationArchitecture without explicit locks for logic simulation on SIMD machines
Architecture without explicit locks for logic on machines M. Chimeh Department of Computer Science University of Glasgow UKMAC, 2016 Contents 1 2 3 4 5 6 The Using models to replicate the behaviour of
More informationStrober: Fast and Accurate Sample-Based Energy Simulation Framework for Arbitrary RTL
Strober: Fast and Accurate Sample-Based Energy Simulation Framework for Arbitrary RTL Donggyu Kim, Adam Izraelevitz, Christopher Celio, Hokeun Kim, Brian Zimmer, Yunsup Lee, Jonathan Bachrach, Krste Asanović
More informationChosen-IV Correlation Power Analysis on KCipher-2 and a Countermeasure
Fourth International Workshop on Constructive Side-Channel Analysis and Secure Design (COSADE 2013) Chosen-IV Correlation Power Analysis on KCipher-2 and a Countermeasure Takafumi Hibiki*, Naofumi Homma*,
More informationFPGA Acceleration of the LFRic Weather and Climate Model in the EuroExa Project Using Vivado HLS
FPGA Acceleration of the LFRic Weather and Climate Model in the EuroExa Project Using Vivado HLS Mike Ashworth, Graham Riley, Andrew Attwood and John Mawer Advanced Processor Technologies Group School
More informationPERFORMANCE MEASUREMENT
Administrivia CMSC 411 Computer Systems Architecture Lecture 3 Performance Measurement and Reliability Homework problems for Unit 1 posted today due next Thursday, 2/12 Start reading Appendix C Basic Pipelining
More informationDesign Choices for FPGA-based SoCs When Adding a SATA Storage }
U4 U7 U7 Q D U5 Q D Design Choices for FPGA-based SoCs When Adding a SATA Storage } Lorenz Kolb & Endric Schubert, Missing Link Electronics Rudolf Usselmann, ASICS World Services Motivation for SATA Storage
More informationMulticore computer: Combines two or more processors (cores) on a single die. Also called a chip-multiprocessor.
CS 320 Ch. 18 Multicore Computers Multicore computer: Combines two or more processors (cores) on a single die. Also called a chip-multiprocessor. Definitions: Hyper-threading Intel's proprietary simultaneous
More informationEmbedded Systems. 8. Hardware Components. Lothar Thiele. Computer Engineering and Networks Laboratory
Embedded Systems 8. Hardware Components Lothar Thiele Computer Engineering and Networks Laboratory Do you Remember? 8 2 8 3 High Level Physical View 8 4 High Level Physical View 8 5 Implementation Alternatives
More informationRenesas Synergy MCUs Build a Foundation for Groundbreaking Integrated Embedded Platform Development
Renesas Synergy MCUs Build a Foundation for Groundbreaking Integrated Embedded Platform Development New Family of Microcontrollers Combine Scalability and Power Efficiency with Extensive Peripheral Capabilities
More informationExploring OpenCL Memory Throughput on the Zynq
Exploring OpenCL Memory Throughput on the Zynq Technical Report no. 2016:04, ISSN 1652-926X Chalmers University of Technology Bo Joel Svensson bo.joel.svensson@gmail.com Abstract The Zynq platform combines
More informationCS 101, Mock Computer Architecture
CS 101, Mock Computer Architecture Computer organization and architecture refers to the actual hardware used to construct the computer, and the way that the hardware operates both physically and logically
More informationThe Mont-Blanc approach towards Exascale
http://www.montblanc-project.eu The Mont-Blanc approach towards Exascale Alex Ramirez Barcelona Supercomputing Center Disclaimer: Not only I speak for myself... All references to unavailable products are
More informationFPGAhammer: Remote Voltage Fault Attacks on Shared FPGAs, suitable for DFA on AES
, suitable for DFA on AES Jonas Krautter, Dennis R.E. Gnad, Mehdi B. Tahoori 10.09.2018 INSTITUTE OF COMPUTER ENGINEERING CHAIR OF DEPENDABLE NANO COMPUTING KIT Die Forschungsuniversität in der Helmholtz-Gemeinschaft
More informationBuilding and Using the ATLAS Transactional Memory System
Building and Using the ATLAS Transactional Memory System Njuguna Njoroge, Sewook Wee, Jared Casper, Justin Burdick, Yuriy Teslyar, Christos Kozyrakis, Kunle Olukotun Computer Systems Laboratory Stanford
More informationProbabilistic password generators
Probabilistic password generators (and fancy curves) Simon Marechal (bartavelle at openwall.com) http://www.openwall.com @Openwall December 212 Warning! I am no mathematician Conclusions might be erroneous
More informationA Software Development Toolset for Multi-Core Processors. Yuichi Nakamura System IP Core Research Labs. NEC Corp.
A Software Development Toolset for Multi-Core Processors Yuichi Nakamura System IP Core Research Labs. NEC Corp. Motivations Embedded Systems: Performance enhancement by multi-core systems CPU0 CPU1 Multi-core
More information金融商品取引アルゴリズムのハードウェアアクセラレ ーションに関する研究 [ 課題研究報告書 ] Description Supervisor: 田中清史, 情報科学研究科, 修士
JAIST Reposi https://dspace.j Title 金融商品取引アルゴリズムのハードウェアアクセラレ ーションに関する研究 [ 課題研究報告書 ] Author(s) 小林, 弘幸 Citation Issue Date 2018-03 Type Thesis or Dissertation Text version author URL http://hdl.handle.net/10119/15214
More informationSysAlloc: A Hardware Manager for Dynamic Memory Allocation in Heterogeneous Systems
SysAlloc: A Hardware Manager for Dynamic Memory Allocation in Heterogeneous Systems Zeping Xue, David Thomas zeping.xue10@imperial.ac.uk FPL 2015, London 4 Sept 2015 1 Single Processor System Allocator
More informationSTORING DATA: DISK AND FILES
STORING DATA: DISK AND FILES CS 564- Spring 2018 ACKs: Dan Suciu, Jignesh Patel, AnHai Doan WHAT IS THIS LECTURE ABOUT? How does a DBMS store data? disk, SSD, main memory The Buffer manager controls how
More informationFrequency Domain Acceleration of Convolutional Neural Networks on CPU-FPGA Shared Memory System
Frequency Domain Acceleration of Convolutional Neural Networks on CPU-FPGA Shared Memory System Chi Zhang, Viktor K Prasanna University of Southern California {zhan527, prasanna}@usc.edu fpga.usc.edu ACM
More informationEECS150 - Digital Design Lecture 09 - Parallelism
EECS150 - Digital Design Lecture 09 - Parallelism Feb 19, 2013 John Wawrzynek Spring 2013 EECS150 - Lec09-parallel Page 1 Parallelism Parallelism is the act of doing more than one thing at a time. Optimization
More informationIntroduction to Microprocessor
Introduction to Microprocessor Slide 1 Microprocessor A microprocessor is a multipurpose, programmable, clock-driven, register-based electronic device That reads binary instructions from a storage device
More informationPorting BLIS to new architectures Early experiences
1st BLIS Retreat. Austin (Texas) Early experiences Universidad Complutense de Madrid (Spain) September 5, 2013 BLIS design principles BLIS = Programmability + Performance + Portability Share experiences
More informationCPU Project in Western Digital: From Embedded Cores for Flash Controllers to Vision of Datacenter Processors with Open Interfaces
CPU Project in Western Digital: From Embedded Cores for Flash Controllers to Vision of Datacenter Processors with Open Interfaces Zvonimir Z. Bandic, Sr. Director Robert Golla, Sr. Fellow Dejan Vucinic,
More informationYarn password hashing function
Yarn password hashing function Evgeny Kapun abacabadabacaba@gmail.com 1 Introduction I propose a password hashing function Yarn. This is a memory-hard function, however, a prime objective of this function
More informationLeveraging OpenSPARC. ESA Round Table 2006 on Next Generation Microprocessors for Space Applications EDD
Leveraging OpenSPARC ESA Round Table 2006 on Next Generation Microprocessors for Space Applications G.Furano, L.Messina TEC- OpenSPARC T1 The T1 is a new-from-the-ground-up SPARC microprocessor implementation
More informationEECS151/251A Spring 2018 Digital Design and Integrated Circuits. Instructors: John Wawrzynek and Nick Weaver. Lecture 19: Caches EE141
EECS151/251A Spring 2018 Digital Design and Integrated Circuits Instructors: John Wawrzynek and Nick Weaver Lecture 19: Caches Cache Introduction 40% of this ARM CPU is devoted to SRAM cache. But the role
More informationOptimization of Lattice QCD with CG and multi-shift CG on Intel Xeon Phi Coprocessor
Optimization of Lattice QCD with CG and multi-shift CG on Intel Xeon Phi Coprocessor Intel K. K. E-mail: hirokazu.kobayashi@intel.com Yoshifumi Nakamura RIKEN AICS E-mail: nakamura@riken.jp Shinji Takeda
More informationLinux Storage System Bottleneck Exploration
Linux Storage System Bottleneck Exploration Bean Huo / Zoltan Szubbocsev Beanhuo@micron.com / zszubbocsev@micron.com 215 Micron Technology, Inc. All rights reserved. Information, products, and/or specifications
More informationHiding Higher-Order Leakages in Hardware
Hiding Higher-Order Leakages in Hardware 21. May 2015 Ruhr-Universität Bochum Acknowledgement Pascal Sasdrich Tobias Schneider Alexander Wild 2 Story? Threshold Implementation should be explained? 1 st
More informationFundamentals of Quantitative Design and Analysis
Fundamentals of Quantitative Design and Analysis Dr. Jiang Li Adapted from the slides provided by the authors Computer Technology Performance improvements: Improvements in semiconductor technology Feature
More informationProtecting Embedded Systems from Zero-Day Attacks
Protecting Embedded Systems from Zero-Day Attacks Professor Stephen Taylor Thayer School of Engineering at Dartmouth stnh.email@icloud.com (603) 727-8945 MicroArx.com Apiotics.com 1 Research Support Current
More informationDRAF: A Low-Power DRAM-based Reconfigurable Acceleration Fabric
DRAF: A Low-Power DRAM-based Reconfigurable Acceleration Fabric Mingyu Gao, Christina Delimitrou, Dimin Niu, Krishna Malladi, Hongzhong Zheng, Bob Brennan, Christos Kozyrakis ISCA June 22, 2016 FPGA-Based
More informationDRAF: A Low-Power DRAM-based Reconfigurable Acceleration Fabric
DRAF: A Low-Power DRAM-based Reconfigurable Acceleration Fabric Mingyu Gao, Christina Delimitrou, Dimin Niu, Krishna Malladi, Hongzhong Zheng, Bob Brennan, Christos Kozyrakis ISCA June 22, 2016 FPGA-Based
More informationjohn-devkit: specialized compiler for hash cracking
john-devkit: specialized compiler for hash cracking Aleksey Cherepanov lyosha@openwall.com May 26, 2015 General john-devkit is an experiment not yet embraced by John the Ripper developer community is a
More informationAn FPGA Architecture Supporting Dynamically-Controlled Power Gating
An FPGA Architecture Supporting Dynamically-Controlled Power Gating Altera Corporation March 16 th, 2012 Assem Bsoul and Steve Wilton {absoul, stevew}@ece.ubc.ca System-on-Chip Research Group Department
More informationManycore and GPU Channelisers. Seth Hall High Performance Computing Lab, AUT
Manycore and GPU Channelisers Seth Hall High Performance Computing Lab, AUT GPU Accelerated Computing GPU-accelerated computing is the use of a graphics processing unit (GPU) together with a CPU to accelerate
More informationARM Cortex-A9 ARM v7-a. A programmer s perspective Part1
ARM Cortex-A9 ARM v7-a A programmer s perspective Part1 ARM: Advanced RISC Machine First appeared in 1985 as Acorn RISC Machine from Acorn Computers in Manchester England Limited success outcompeted by
More informationScaling the Peak: Maximizing floating point performance on the Epiphany NoC
Scaling the Peak: Maximizing floating point performance on the Epiphany NoC Anish Varghese, Gaurav Mitra, Robert Edwards and Alistair Rendell Research School of Computer Science The Australian National
More informationScalable and Dynamically Updatable Lookup Engine for Decision-trees on FPGA
Scalable and Dynamically Updatable Lookup Engine for Decision-trees on FPGA Yun R. Qu, Viktor K. Prasanna Ming Hsieh Dept. of Electrical Engineering University of Southern California Los Angeles, CA 90089
More informationMainstream Computer System Components CPU Core 2 GHz GHz 4-way Superscaler (RISC or RISC-core (x86): Dynamic scheduling, Hardware speculation
Mainstream Computer System Components CPU Core 2 GHz - 3.0 GHz 4-way Superscaler (RISC or RISC-core (x86): Dynamic scheduling, Hardware speculation One core or multi-core (2-4) per chip Multiple FP, integer
More informationDeveloping a Prototyping Board for Emerging Memory
Developing a Prototyping Board for Emerging Memory 2013. 10. 25 Sungjoo Yoo Embedded System Architecture Lab. POSTECH Introduction scaling problem [ITRS, 2012] Year 2012 2013 2014 2015 2016 2017 2018 2019
More informationAn Alternative to GPU Acceleration For Mobile Platforms
Inventing the Future of Computing An Alternative to GPU Acceleration For Mobile Platforms Andreas Olofsson andreas@adapteva.com 50 th DAC June 5th, Austin, TX Adapteva Achieves 3 World Firsts 1. First
More informationSoftware Defined Radio (SDR) for Parallel SDR intro in a simple way Satellite Reception in Mobile/Deployable Ground Segments
Software Defined Radio (SDR) for Parallel SDR intro in a simple way Satellite Reception in Mobile/Deployable Ground Segments Small Satellite Conference, Logan, Utah 11/08/2015 1 2 Dr Mark Bowyer & 1 Dr
More informationMainstream Computer System Components
Mainstream Computer System Components Double Date Rate (DDR) SDRAM One channel = 8 bytes = 64 bits wide Current DDR3 SDRAM Example: PC3-12800 (DDR3-1600) 200 MHz (internal base chip clock) 8-way interleaved
More informationThe Many Dimensions of SDR Hardware
The Many Dimensions of SDR Hardware Plotting a Course for the Hardware Behind the Software Sept 2017 John Orlando Epiq Solutions LO RFIC Epiq Solutions in a Nutshell Schaumburg, IL EST 2009 N. Virginia
More informationComputer System Architecture
CSC 203 1.5 Computer System Architecture Budditha Hettige Department of Statistics and Computer Science University of Sri Jayewardenepura Microprocessors 2011 Budditha Hettige 2 Processor Instructions
More informationKartik Lakhotia, Rajgopal Kannan, Viktor Prasanna USENIX ATC 18
Accelerating PageRank using Partition-Centric Processing Kartik Lakhotia, Rajgopal Kannan, Viktor Prasanna USENIX ATC 18 Outline Introduction Partition-centric Processing Methodology Analytical Evaluation
More informationData Encryption Standard
ECE 646 Lecture 7 Data Encryption Standard Required Reading W. Stallings, "Cryptography and Network-Security," 5th Edition, Chapter 3: Block Ciphers and the Data Encryption Standard Chapter 6.1: Multiple
More informationVector Architectures Vs. Superscalar and VLIW for Embedded Media Benchmarks
Vector Architectures Vs. Superscalar and VLIW for Embedded Media Benchmarks Christos Kozyrakis Stanford University David Patterson U.C. Berkeley http://csl.stanford.edu/~christos Motivation Ideal processor
More informationSecurity Applications
1. Introduction Security Applications Abhyudaya Chodisetti Paul Wang Lee Garrett Smith Cryptography applications generally involve a large amount of processing. Thus, there is the possibility that these
More informationKnowledge Organiser. Computing. Year 10 Term 1 Hardware
Organiser Computing Year 10 Term 1 Hardware Enquiry Question How does a computer do everything it does? Big questions that will help you answer this enquiry question: 1. What is the purpose of the CPU?
More informationVersal: The New Xilinx Adaptive Compute Acceleration Platform (ACAP) in 7nm
Engineering Director, Xilinx Silicon Architecture Group Versal: The New Xilinx Adaptive Compute Acceleration Platform (ACAP) in 7nm Presented By Kees Vissers Fellow February 25, FPGA 2019 Technology scaling
More informationLecture 8: RISC & Parallel Computers. Parallel computers
Lecture 8: RISC & Parallel Computers RISC vs CISC computers Parallel computers Final remarks Zebo Peng, IDA, LiTH 1 Introduction Reduced Instruction Set Computer (RISC) is an important innovation in computer
More informationFACTFILE: GCE DIGITAL TECHNOLOGY
FACTFILE: GCE DIGITAL TECHNOLOGY AS2: FUNDAMENTALS OF DIGITAL TECHNOLOGY Hardware and Software Architecture 1 Learning Outcomes Students should be able to: describe the internal components of a computer
More informationCopyright 2012, Elsevier Inc. All rights reserved.
Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology is more
More informationAuthenticated Storage Using Small Trusted Hardware Hsin-Jung Yang, Victor Costan, Nickolai Zeldovich, and Srini Devadas
Authenticated Storage Using Small Trusted Hardware Hsin-Jung Yang, Victor Costan, Nickolai Zeldovich, and Srini Devadas Massachusetts Institute of Technology November 8th, CCSW 2013 Cloud Storage Model
More informationSri Vidya College of Engineering and Technology. EC6703 Embedded and Real Time Systems Unit IV Page 1.
Sri Vidya College of Engineering and Technology ERTS Course Material EC6703 Embedded and Real Time Systems Page 1 Sri Vidya College of Engineering and Technology ERTS Course Material EC6703 Embedded and
More informationIMPROVING ENERGY EFFICIENCY THROUGH PARALLELIZATION AND VECTORIZATION ON INTEL R CORE TM
IMPROVING ENERGY EFFICIENCY THROUGH PARALLELIZATION AND VECTORIZATION ON INTEL R CORE TM I5 AND I7 PROCESSORS Juan M. Cebrián 1 Lasse Natvig 1 Jan Christian Meyer 2 1 Depart. of Computer and Information
More informationCopyright 2016 Xilinx
Zynq Architecture Zynq Vivado 2015.4 Version This material exempt per Department of Commerce license exception TSU Objectives After completing this module, you will be able to: Identify the basic building
More informationParallella: A $99 Open Hardware Parallel Computing Platform
Inventing the Future of Computing Parallella: A $99 Open Hardware Parallel Computing Platform Andreas Olofsson andreas@adapteva.com IPDPS May 22th, Cambridge, MA Adapteva Achieves 3 World Firsts 1. First
More informationChapter 1: Introduction to Parallel Computing
Parallel and Distributed Computing Chapter 1: Introduction to Parallel Computing Jun Zhang Laboratory for High Performance Computing & Computer Simulation Department of Computer Science University of Kentucky
More informationFPGA system development What you need to think about. Frédéric Leens, CEO
FPGA system development What you need to think about Frédéric Leens, CEO About Byte Paradigm 2005 : Founded by 3 ASIC-SoC-FPGA engineers as a Design Center for high-end FPGA and board design. 2007 : GP
More informationComputer System Components
Computer System Components CPU Core 1 GHz - 3.2 GHz 4-way Superscaler RISC or RISC-core (x86): Deep Instruction Pipelines Dynamic scheduling Multiple FP, integer FUs Dynamic branch prediction Hardware
More informationArgon2 for password hashing and cryptocurrencies
Argon2 for password hashing and cryptocurrencies Alex Biryukov, Daniel Dinu, Dmitry Khovratovich University of Luxembourg 2nd April 2016 Motivation Password-based authentication Keyless password authentication:
More information