Efficient Event Processing through Reconfigurable Hardware for Algorithmic Trading. University of Toronto
1 Efficient Event Processing through Reconfigurable Hardware for Algorithmic Trading
Martin Labrecque, Harsh Singh, Warren Shum, Hans-Arno Jacobsen
University of Toronto

2 Algorithmic Trading
3 Examples of Financial Strategies & Market Events
Market Feed (event)
  [stock = ABX, TSE ask = 40.04, NYSE ask = 40.05]
Investment Strategies (subscription)
  1. Classical arbitrage strategy
     [stock = ABX, TSE ask NYSE ask, ACTION: BUY & SELL]
  2. Classical short-sell strategy
     [stock = ABX, TSE ask 40.04, ACTION: SELL]
     [stock = ABX, TSE ask 38.04, ACTION: BUY]

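A minimal sketch of how a market event could be checked against strategy subscriptions. The comparison operators are missing from the strategy predicates above, so the operators used here (`!=`, `>=`, `<=`) are assumptions, as are the names `Strategy` and `match_event`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Event = Dict[str, object]


@dataclass
class Strategy:
    name: str                           # hypothetical label
    stock: str                          # equality predicate on the stock symbol
    predicate: Callable[[Event], bool]  # remaining condition on the event
    action: str


def match_event(event: Event, strategies: List[Strategy]) -> List[str]:
    """Return "name: action" for every strategy whose predicates all hold."""
    return [
        f"{s.name}: {s.action}"
        for s in strategies
        if s.stock == event["stock"] and s.predicate(event)
    ]


strategies = [
    # Assumed operator: act when the two exchanges quote different asks.
    Strategy("arbitrage", "ABX",
             lambda e: e["TSE ask"] != e["NYSE ask"], "BUY & SELL"),
    # Assumed operators: sell at or above 40.04, buy back at or below 38.04.
    Strategy("short-sell (open)", "ABX",
             lambda e: e["TSE ask"] >= 40.04, "SELL"),
    Strategy("short-sell (close)", "ABX",
             lambda e: e["TSE ask"] <= 38.04, "BUY"),
]

event = {"stock": "ABX", "TSE ask": 40.04, "NYSE ask": 40.05}
actions = match_event(event, strategies)
print(actions)
```

Under these assumed operators, the sample event satisfies the arbitrage strategy and the opening leg of the short-sell strategy, but not the closing leg.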
4 Algorithmic Trading Vision
Algorithmic Trading Key Observations
  1. Every 1-millisecond reduction in response time is estimated to generate over $100 million a year
  2. Millions of market events per second are expected, and the rate is increasing
Our Solution
  We propose a novel FPGA-based event processing platform that significantly speeds up algorithmic trading computations, namely market event parsing and market event matching against strategies

5 Why FPGAs
  1. Hardware reconfigurability: the ability to be reconfigured on demand into a highly parallel custom hardware circuit
  2. Hardware parallelism: eliminating inter-processor signaling and message-passing overhead at the program and OS level
  3. High-throughput packet processing: using multiple high-bandwidth (gigabit) I/O pins to eliminate the OS-layer latency overhead of moving data between input and output ports

6 Propagation Data Structure Overview (SIGMOD 01)
[Figure: each strategy S_i is placed in a cluster selected by Hash(AP(S_i))]
  1. Strategies are distributed in disjoint clusters to enable highly parallelizable event matching through custom hardware units
  2. In each cluster, strategies are stored as contiguous blocks of memory to enable fast sequential access and improve memory locality

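The clustering idea can be sketched in software as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes AP(S_i) is an equality test on the stock symbol, uses a toy hash, and picks an arbitrary cluster count. All function names are hypothetical.

```python
from collections import defaultdict

NUM_CLUSTERS = 4  # assumed; the slide does not give a cluster count


def access_predicate(strategy):
    # Assumption: AP(S_i) is the equality test on the stock symbol.
    return ("stock", strategy["stock"])


def cluster_id(key):
    # Deterministic toy hash (Python's built-in str hash is salted per process).
    return sum(ord(c) for c in "".join(map(str, key))) % NUM_CLUSTERS


def build_clusters(strategies):
    # Each list stands in for one contiguous block of strategy memory.
    clusters = defaultdict(list)
    for s in strategies:
        clusters[cluster_id(access_predicate(s))].append(s)
    return clusters


def candidates(event, clusters):
    # Only the single cluster selected by the event's key is scanned.
    return clusters.get(cluster_id(("stock", event["stock"])), [])


strategies = [
    {"id": i, "stock": sym}
    for i, sym in enumerate(["ABX", "ABX", "XYZ", "QRS", "TUV"])
]
clusters = build_clusters(strategies)
abx_candidates = candidates({"stock": "ABX"}, clusters)
print([s["id"] for s in abx_candidates])
```

Because the clusters are disjoint, each one could be handed to a separate matching unit, and scanning one cluster is a sequential pass over adjacent entries.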
7 Soft-Processor Approach
  1. Simplest solution, with virtually no deployment effort
  2. The identical C program is compiled to execute on an FPGA soft-processor

8 Hybrid Approach
  1. Four matching units (custom processors) run in parallel
  2. Strategies are stored both in off-chip DDR2 and in on-chip BRAM
  3. DDR2 memory accesses are batched to reduce handshaking latency
  4. BRAM is accessed during the DDR2 handshaking phase
  5. Scales to the order of hundreds of thousands of strategies

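As a rough software analogy for the four parallel matching units, the sketch below splits the strategies into four partitions and scans each in its own worker. The round-robin split, the `min_ask` predicate, and all function names are assumptions for illustration; Python threads only mimic the structure and do not reproduce the hardware parallelism or the DDR2/BRAM overlap.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_UNITS = 4  # the slide states four matching units


def partition(strategies, num_units=NUM_UNITS):
    # Round-robin split; the real placement policy is not given on the slide.
    return [strategies[i::num_units] for i in range(num_units)]


def unit_match(event, part):
    # One matching unit scans only its own partition.
    return [
        s["id"]
        for s in part
        if s["stock"] == event["stock"] and event["ask"] >= s["min_ask"]
    ]


def parallel_match(event, strategies):
    parts = partition(strategies)
    with ThreadPoolExecutor(max_workers=NUM_UNITS) as pool:
        results = list(pool.map(lambda p: unit_match(event, p), parts))
    # Merge the per-unit results into one sorted list of strategy ids.
    return sorted(i for r in results for i in r)


strategies = [
    {"id": i, "stock": "ABX", "min_ask": 38.0 + i} for i in range(10)
]
event = {"stock": "ABX", "ask": 40.04}
matched = parallel_match(event, strategies)
print(matched)
```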
9 Hardware-only Approach
  1. Each strategy is encoded as a matching unit (a custom processor)
  2. High matching rate because no memory access is needed
  3. High degree of parallelization: all strategies are executed in parallel
  4. Resource-intensive: hardware usage grows with the number of strategies
  5. Scales to the order of thousands of strategies (latest FPGA chip)

10 Verilog Snippet

    case (CurrentState)
      IDLE: begin
        if (Go)
          NextState = SELECT_CLUSTER_ID;
        else
          NextState = IDLE;
      end
      SELECT_CLUSTER_ID: begin
        // Select a valid cluster index
        if (curcluster > LAST_CLUSTER)
          // Finished reading last cluster
          // When all clusters are invalid
          NextState = WAIT;
        else
          NextState = START_ADDRESS;
      end
      START_ADDRESS: begin
        // Select cluster address
        if (curcluster > LAST_CLUSTER)
          // Finished reading last cluster
          // When all clusters are invalid
          NextState = WAIT;
        else if (can_take_more_requests)
          NextState = MEM_BURST_WAIT;
        else
          // If 'can_take_more_requests' is low, wait
          NextState = START_ADDRESS;
      end
      NEXT_ADDRESS: begin
        // Check data from cluster terminator
        if (clusterendfound)
          NextState = SELECT_CLUSTER_ID;
        else if (can_take_more_requests)
          // Increment curaddr by 16 bytes
          NextState = MEM_BURST_WAIT;
        else
          // If 'can_take_more_requests' is low, wait
          NextState = NEXT_ADDRESS;
      end
      default: NextState = IDLE;
    endcase

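The next-state logic above can be modeled in software to check transitions by hand. This is a sketch that mirrors only the four states shown in the snippet; `WAIT` and `MEM_BURST_WAIT` have no branch in the excerpt, so here they fall through to the default, which may differ from the full design. The function name and keyword arguments are hypothetical.

```python
def next_state(state, *, go=False, curcluster=0, last_cluster=0,
               can_take_more_requests=False, clusterendfound=False):
    """Software model of the case statement's next-state decision."""
    if state == "IDLE":
        return "SELECT_CLUSTER_ID" if go else "IDLE"
    if state == "SELECT_CLUSTER_ID":
        # Past the last cluster: nothing left to read.
        return "WAIT" if curcluster > last_cluster else "START_ADDRESS"
    if state == "START_ADDRESS":
        if curcluster > last_cluster:
            return "WAIT"
        # Stay in place until the memory side can accept another request.
        return "MEM_BURST_WAIT" if can_take_more_requests else "START_ADDRESS"
    if state == "NEXT_ADDRESS":
        if clusterendfound:
            return "SELECT_CLUSTER_ID"
        return "MEM_BURST_WAIT" if can_take_more_requests else "NEXT_ADDRESS"
    return "IDLE"  # mirrors the Verilog default branch


# Walk a few transitions for a design with clusters 0..3.
trace = [
    next_state("IDLE", go=True),
    next_state("SELECT_CLUSTER_ID", curcluster=0, last_cluster=3),
    next_state("START_ADDRESS", curcluster=0, last_cluster=3,
               can_take_more_requests=True),
    next_state("NEXT_ADDRESS", clusterendfound=True),
    next_state("SELECT_CLUSTER_ID", curcluster=4, last_cluster=3),
]
print(trace)
```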
11 Verilog Compilation
  1. Synthesis: checks syntax and analyzes the design to ensure that it is optimized for the architecture; outputs a design netlist file
  2. Design Translation: merges the input netlists and design constraints; outputs an NGD file describing the logical design reduced to gate primitives
  3. Mapping: maps the NGD logic onto the FPGA; outputs a native circuit description (NCD) that represents the design mapped to the FPGA
  4. Place & Route: takes an NCD file and places and routes the design; outputs an NCD file for bitstream generation

12 Evaluation Testbed
  1. Throughput is the maximum sustainable input packet rate at which no packet is dropped, determined through a bisection search
  2. Latency is the interval from the time a market event packet leaves the Event Monitor output queue to the time the action is received

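The throughput measurement can be sketched as a bisection over the input rate. This is a minimal illustration, assuming packet loss is monotonic in the rate; `drops_packets` stands in for a real test run, and the 14,000 events/sec limit of the simulated system is an arbitrary made-up value.

```python
def max_sustainable_rate(drops_packets, lo, hi, tolerance=1.0):
    """Bisection search for the highest rate with no dropped packets.

    Assumes `lo` is a rate that is sustained, `hi` is a rate that drops
    packets, and that dropping is monotonic in between.
    """
    while hi - lo > tolerance:
        mid = (lo + hi) / 2
        if drops_packets(mid):
            hi = mid  # too fast: the answer lies below mid
        else:
            lo = mid  # sustained: the answer lies at or above mid
    return lo


# Hypothetical system under test that starts dropping above 14,000 events/sec.
SIMULATED_LIMIT = 14_000


def simulated_run(rate):
    return rate > SIMULATED_LIMIT


rate = max_sustainable_rate(simulated_run, lo=0.0, hi=1_000_000.0)
print(round(rate))
```

Each probe halves the search interval, so the number of test runs grows only logarithmically with the ratio of the initial range to the tolerance.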
13 Experimental Results
End-to-end System Latency (µs)
  Columns: Workload, PC, Soft-Processor, Hybrid, Hardware-only
  Hardware-only is N/A for the 1K, 10K, and 100K workloads
System Throughput (market events/sec)
  Columns: Workload, PC, Soft-Processor, Hybrid, Hardware-only
  Hardware-only reaches 1,024,590 market events/sec in the first workload row and is N/A for the 1K, 10K, and 100K workloads

14 Event Sender
15 Stock Event Sender
16 Workload Replayer
17 Event Packet Analyzer
18 Lessons Learned
  1. Algorithmic trading trends
     Account for 70% of all trading in equities
     Cost millions per subsecond response delay
  2. Expressive predicate language
     Support classical arbitrage strategy
     Support buy-and-hold strategy
  3. Reconfigurable hardware (FPGA)
     Accelerate using custom logic circuit
     Utilize hardware parallelism
  4. Line-rate algorithmic trading
     Eliminate OS layer latency
     Leverage on-board packet processing

19 Thank You