Using FPGAs as Microservices
|
|
- Clifton Cross
- 5 years ago
- Views:
Transcription
1 Using FPGAs as Microservices David Ojika, Ann Gordon-Ross, Herman Lam, Bhavesh Patel, Gaurav Kaul, Jayson Strayer (University of Florida, DELL EMC, Intel Corporation) The 9 th Workshop on Big Data Benchmarks, Performance Optimization and Emerging Hardware ASPLOS 2018 March 25, 2018
2 Contents Motivation Challenges of FPGA integration Proposed solutions Case study with Spark Key design elements Conclusions & future work 2
3 Motivation FPGAs are increasingly being used in low-latency / high throughput computing (e.g., big data, machine learning) Cloud service providers (CSPs) continue to introduce FPGAs in their datacenters (e.g., Amazon, Microsoft) to meet computing demands Cloud Emerging cloud computing trends (microservices, serverless) show promise in hyperscale datacenters, increased developer productivity Developer 3
4 FPGA + Microservices? 4
5 Overview of FPGAs FPGA: Field-programable gate array A reconfigurable chip composed of CLBs Basic structure consists of: Look-up table (LUT), Flip-Flop (FF), Routing wires, I/O pads CLB: configurable logic block Contemporary FPGAs architectures incorporate more advanced resources: Multiply-accumulate (MAC) Off-chip memory controllers High-speed serial transceivers Embedded, distributed memories Phase-locked loops (PLLs) Up to 2 million logic cells 5
6 Overview of FPGAs FPGA: Field-programable gate array A reconfigurable chip composed of CLBs Basic structure consists of: Look-up table (LUT), Flip-Flop (FF), Routing wires, I/O pads Contemporary FPGAs architectures incorporate more advanced resources: Multiply-accumulate (MAC) Off-chip memory controllers High-speed serial transceivers Embedded, distributed memories Phase-locked loops (PLLs) Up to 2 million logic cells Key benefits: Flexibility, Power, Performance CLB: configurable logic block 6
7 What are Microservices Microservices: a collection of loosely-coupled software services - Unlike Monolithic, provide lightweight (shared) services - Scale consistently with changing workloads Monolith / Layered Microservice / Disaggregated Our approach: use Microservices (as collection of loosely-coupled FPGA accelerator functions) to support FPGAs as software services 7
8 What are Microservices Microservices: a collection of loosely-coupled software services - Unlike Monolithic, provide lightweight (shared) services - Scale consistently with changing workloads Key benefits: scalability, fault-tolerance, dynamic configuration, auto-deployment Monolith / Layered Microservice / Disaggregated Our approach: use Microservices (as collection of loosely-coupled FPGA accelerator functions) to support FPGAs as software services 8
9 Challenges of FPGA integration 9
10 1. Intricate software interface JVM-to-FPGAinterfacing is cumbersome How to keep FPGA accelerator fully-utilized Expose all FPGA functions? 10
11 2. Non-trivial FPGA sharing and data movement FPGA and CPU thread co-existence is complicated Data transfer overheads can hurt performance FPGA mgt. & data movement? 11
12 3. FPGA reconfiguration Reconfiguration can take milliseconds to a few seconds Certain applications may be intolerable to downtime FPGA App CPU Hiding reconfiguration time CPU fallback? 12
13 4. Challenging programming model Requirement on hardware-specific knowledge Long synthesis (compile) time Programmer / Developer HLS HDL binary hours of development / design iterations Programmer vs Developer? FPGA 13
14 Proposed Solution FaaM: a runtime framework for deploying FPGA accelerators as microservices Leverage optimized hardware libraries Share accelerator across threads, applications, users Platform-agnostic (OpenCL, etc.) Seamless integration with Java, support wide range of apps/frameworks Takes care of accelerator management (init./setup, buffer mgt., etc.) 14
15 FPGA Microservices Overview Accelerator Manager /w HW-accelerated function FPGA Library FaaM 15
16 Experimenting with Spark 1. Map convert raw input into key/value pairs. Output to memory (or disk) 2. Shuffle/Sort All reduces retrieve all records from all mappers over network 3. Reduce For each distinct key, do something with all the corresponding values. Output to disk Data Shuffle/Sort in Spark data movement can hurt performance 16
17 Experiment Setup Platform: Xeon+FPGA with Arria10 FPGA Hardware-accelerated function: DEFLATE compression algorithm (level 9) Workload: Sort Evaluation technique: Focus on compression of Spark Output RDDs Spark Worker (Xeon+FPGA Server) CPU FPGA With FaaM (FPGA Microservice): No change to application code Only update Spark config. files 17
18 Performance Results FPGA as drop-in replacement via FaaM 4X memory saving 3.2X speedup Functional Evaluation Performance Evaluation 18
19 System-level optimization #1: enable RDD caching + 2X improvement *CPU used only as baseline. RDD: Resilient Distributed Disk 19
20 System-level optimization #2: increase CPU count Increased FPGA accelerator utilization, resulting in improved application performance 20
21 Key design elements 21
22 1. FPGA accelerator abstraction Driver Arria 10 FPGA AFU: Accelerator Functional Unit 22
23 1. FPGA accelerator abstraction New Library Driver Arria 10 FPGA AFU: Accelerator Functional Unit 23
24 2. Task scheduler Disk Task scheduler Library FPGA-to-Host Memory Communication JVM and native memory data transfers can incur significant overhead Leverage non-blocking I/O; access data in streaming fashion No garbage collection Buffer re-use among groups of SW threads Hde data transfer latency by overlapping comm. with comp. Pinned buffer (NIO) 24
25 3. JVM runtime system Runtime system Task scheduler CPU-FPGA Thread Co-existence Keep FPGA busy all time with tasks Stateless Accelerator management, (buffer size, int./clean-up, etc.) Library 25
26 Buffer R/W performance Sweet spot resulting in almost no performance loss (next slide) for the runtime system. 26
27 Compression: Raw Performance equal almost equal Functional Evaluation Performance Evaluation 27
28 Conclusions & Future Work FPGAs compliment CPUs in certain compute-intensive task 3.2X speedup, 4X memory footprint reduction in Spark (single-node deployment!) Potentially increased benefit in larger-scale scenarios FPGA accelerators can leverage microservices architecture Key to efficiency is CPU-memory-FPGA interaction; accelerator management Optimizations to both low-level and high-level stack help improve performance Future work: Performance evaluation with cloud and machine learning workloads 28
Using FPGAs as Microservices: Technology, Challenges and Case Study
Using FPGAs as Microservices: Technology, Challenges and Case Study David Ojika¹ ², Ann Gordon-Ross¹, Herman Lam¹, Bhavesh Patel², Gaurav Kaul³, Jayson Strayer³ ¹University of Florida ²DELL EMC ³Intel
More informationEnabling Flexible Network FPGA Clusters in a Heterogeneous Cloud Data Center
Enabling Flexible Network FPGA Clusters in a Heterogeneous Cloud Data Center Naif Tarafdar, Thomas Lin, Eric Fukuda, Hadi Bannazadeh, Alberto Leon-Garcia, Paul Chow University of Toronto 1 Cloudy with
More informationField Programmable Gate Array (FPGA)
Field Programmable Gate Array (FPGA) Lecturer: Krébesz, Tamas 1 FPGA in general Reprogrammable Si chip Invented in 1985 by Ross Freeman (Xilinx inc.) Combines the advantages of ASIC and uc-based systems
More informationPactron FPGA Accelerated Computing Solutions
Pactron FPGA Accelerated Computing Solutions Intel Xeon + Altera FPGA 2015 Pactron HJPC Corporation 1 Motivation for Accelerators Enhanced Performance: Accelerators compliment CPU cores to meet market
More informationUsing the SDACK Architecture to Build a Big Data Product. Yu-hsin Yeh (Evans Ye) Apache Big Data NA 2016 Vancouver
Using the SDACK Architecture to Build a Big Data Product Yu-hsin Yeh (Evans Ye) Apache Big Data NA 2016 Vancouver Outline A Threat Analytic Big Data product The SDACK Architecture Akka Streams and data
More informationDid I Just Do That on a Bunch of FPGAs?
Did I Just Do That on a Bunch of FPGAs? Paul Chow High-Performance Reconfigurable Computing Group Department of Electrical and Computer Engineering University of Toronto About the Talk Title It s the measure
More informationLegUp: Accelerating Memcached on Cloud FPGAs
0 LegUp: Accelerating Memcached on Cloud FPGAs Xilinx Developer Forum December 10, 2018 Andrew Canis & Ruolong Lian LegUp Computing Inc. 1 COMPUTE IS BECOMING SPECIALIZED 1 GPU Nvidia graphics cards are
More informationDATA SCIENCE USING SPARK: AN INTRODUCTION
DATA SCIENCE USING SPARK: AN INTRODUCTION TOPICS COVERED Introduction to Spark Getting Started with Spark Programming in Spark Data Science with Spark What next? 2 DATA SCIENCE PROCESS Exploratory Data
More informationData Processing at the Speed of 100 Gbps using Apache Crail. Patrick Stuedi IBM Research
Data Processing at the Speed of 100 Gbps using Apache Crail Patrick Stuedi IBM Research The CRAIL Project: Overview Data Processing Framework (e.g., Spark, TensorFlow, λ Compute) Spark-IO Albis Pocket
More informationDRAF: A Low-Power DRAM-based Reconfigurable Acceleration Fabric
DRAF: A Low-Power DRAM-based Reconfigurable Acceleration Fabric Mingyu Gao, Christina Delimitrou, Dimin Niu, Krishna Malladi, Hongzhong Zheng, Bob Brennan, Christos Kozyrakis ISCA June 22, 2016 FPGA-Based
More informationEnabling FPGAs in Hyperscale Data Centers
J. Weerasinghe; IEEE CBDCom 215, Beijing; 13 th August 215 Enabling s in Hyperscale Data Centers J. Weerasinghe 1, F. Abel 1, C. Hagleitner 1, A. Herkersdorf 2 1 IBM Research Zurich Laboratory 2 Technical
More informationCloud-Native Applications. Copyright 2017 Pivotal Software, Inc. All rights Reserved. Version 1.0
Cloud-Native Applications Copyright 2017 Pivotal Software, Inc. All rights Reserved. Version 1.0 Cloud-Native Characteristics Lean Form a hypothesis, build just enough to validate or disprove it. Learn
More informationToward a Memory-centric Architecture
Toward a Memory-centric Architecture Martin Fink EVP & Chief Technology Officer Western Digital Corporation August 8, 2017 1 SAFE HARBOR DISCLAIMERS Forward-Looking Statements This presentation contains
More informationApache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context
1 Apache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context Generality: diverse workloads, operators, job sizes
More informationFPGA: What? Why? Marco D. Santambrogio
FPGA: What? Why? Marco D. Santambrogio marco.santambrogio@polimi.it 2 Reconfigurable Hardware Reconfigurable computing is intended to fill the gap between hardware and software, achieving potentially much
More informationSpark Over RDMA: Accelerate Big Data SC Asia 2018 Ido Shamay Mellanox Technologies
Spark Over RDMA: Accelerate Big Data SC Asia 2018 Ido Shamay 1 Apache Spark - Intro Spark within the Big Data ecosystem Data Sources Data Acquisition / ETL Data Storage Data Analysis / ML Serving 3 Apache
More informationScaling Up Performance Benchmarking
Scaling Up Performance Benchmarking -with SPECjbb2015 Anil Kumar Runtime Performance Architect @Intel, OSG Java Chair Monica Beckwith Runtime Performance Architect @Arm, Java Champion FaaS Serverless Frameworks
More informationOutline EEL 5764 Graduate Computer Architecture. Chapter 3 Limits to ILP and Simultaneous Multithreading. Overcoming Limits - What do we need??
Outline EEL 7 Graduate Computer Architecture Chapter 3 Limits to ILP and Simultaneous Multithreading! Limits to ILP! Thread Level Parallelism! Multithreading! Simultaneous Multithreading Ann Gordon-Ross
More informationPerformance Matters Scaling Integration Processes to Meet the Needs of Your Business. James Ahlborn, Chief Software Architect, Dell Boomi
Performance Matters Scaling Integration Processes to Meet the Needs of Your Business James Ahlborn, Chief Software Architect, Dell Boomi 1 Atoms Agenda Atoms vs. Molecules Atom Clouds Atom Workers Performance
More informationProcesses, Threads and Processors
1 Processes, Threads and Processors Processes and Threads From Processes to Threads Don Porter Portions courtesy Emmett Witchel Hardware can execute N instruction streams at once Ø Uniprocessor, N==1 Ø
More informationDRAF: A Low-Power DRAM-based Reconfigurable Acceleration Fabric
DRAF: A Low-Power DRAM-based Reconfigurable Acceleration Fabric Mingyu Gao, Christina Delimitrou, Dimin Niu, Krishna Malladi, Hongzhong Zheng, Bob Brennan, Christos Kozyrakis ISCA June 22, 2016 FPGA-Based
More informationIntegrate MATLAB Analytics into Enterprise Applications
Integrate Analytics into Enterprise Applications Aurélie Urbain MathWorks Consulting Services 2015 The MathWorks, Inc. 1 Data Analytics Workflow Data Acquisition Data Analytics Analytics Integration Business
More informationProviding Multi-tenant Services with FPGAs: Case Study on a Key-Value Store
Zsolt István *, Gustavo Alonso, Ankit Singla Systems Group, Computer Science Dept., ETH Zürich * Now at IMDEA Software Institute, Madrid Providing Multi-tenant Services with FPGAs: Case Study on a Key-Value
More informationAN 831: Intel FPGA SDK for OpenCL
AN 831: Intel FPGA SDK for OpenCL Host Pipelined Multithread Subscribe Send Feedback Latest document on the web: PDF HTML Contents Contents 1 Intel FPGA SDK for OpenCL Host Pipelined Multithread...3 1.1
More informationECE 8823: GPU Architectures. Objectives
ECE 8823: GPU Architectures Introduction 1 Objectives Distinguishing features of GPUs vs. CPUs Major drivers in the evolution of general purpose GPUs (GPGPUs) 2 1 Chapter 1 Chapter 2: 2.2, 2.3 Reading
More informationFABRICATION TECHNOLOGIES
FABRICATION TECHNOLOGIES DSP Processor Design Approaches Full custom Standard cell** higher performance lower energy (power) lower per-part cost Gate array* FPGA* Programmable DSP Programmable general
More informationPRIME: A Novel Processing-in-memory Architecture for Neural Network Computation in ReRAM-based Main Memory
Scalable and Energy-Efficient Architecture Lab (SEAL) PRIME: A Novel Processing-in-memory Architecture for Neural Network Computation in -based Main Memory Ping Chi *, Shuangchen Li *, Tao Zhang, Cong
More informationZhang Tianfei. Rosen Xu
Zhang Tianfei Rosen Xu Agenda Part 1: FPGA and OPAE - Intel FPGAs and the Modern Datacenter - Platform Options and the Acceleration Stack - FPGA Hardware overview - Open Programmable Acceleration Engine
More informationSCYLLA: NoSQL at Ludicrous Speed. 主讲人 :ScyllaDB 软件工程师贺俊
SCYLLA: NoSQL at Ludicrous Speed 主讲人 :ScyllaDB 软件工程师贺俊 Today we will cover: + Intro: Who we are, what we do, who uses it + Why we started ScyllaDB + Why should you care + How we made design decisions to
More informationEnergy Efficient K-Means Clustering for an Intel Hybrid Multi-Chip Package
High Performance Machine Learning Workshop Energy Efficient K-Means Clustering for an Intel Hybrid Multi-Chip Package Matheus Souza, Lucas Maciel, Pedro Penna, Henrique Freitas 24/09/2018 Agenda Introduction
More informationJava Without the Jitter
TECHNOLOGY WHITE PAPER Achieving Ultra-Low Latency Table of Contents Executive Summary... 3 Introduction... 4 Why Java Pauses Can t Be Tuned Away.... 5 Modern Servers Have Huge Capacities Why Hasn t Latency
More informationPocket: Elastic Ephemeral Storage for Serverless Analytics
Pocket: Elastic Ephemeral Storage for Serverless Analytics Ana Klimovic*, Yawen Wang*, Patrick Stuedi +, Animesh Trivedi +, Jonas Pfefferle +, Christos Kozyrakis* *Stanford University, + IBM Research 1
More informationHiTune. Dataflow-Based Performance Analysis for Big Data Cloud
HiTune Dataflow-Based Performance Analysis for Big Data Cloud Jinquan (Jason) Dai, Jie Huang, Shengsheng Huang, Bo Huang, Yan Liu Intel Asia-Pacific Research and Development Ltd Shanghai, China, 200241
More informationHigh Capacity and High Performance 20nm FPGAs. Steve Young, Dinesh Gaitonde August Copyright 2014 Xilinx
High Capacity and High Performance 20nm FPGAs Steve Young, Dinesh Gaitonde August 2014 Not a Complete Product Overview Page 2 Outline Page 3 Petabytes per month Increasing Bandwidth Global IP Traffic Growth
More informationCo-synthesis and Accelerator based Embedded System Design
Co-synthesis and Accelerator based Embedded System Design COE838: Embedded Computer System http://www.ee.ryerson.ca/~courses/coe838/ Dr. Gul N. Khan http://www.ee.ryerson.ca/~gnkhan Electrical and Computer
More informationMoneta: A High-Performance Storage Architecture for Next-generation, Non-volatile Memories
Moneta: A High-Performance Storage Architecture for Next-generation, Non-volatile Memories Adrian M. Caulfield Arup De, Joel Coburn, Todor I. Mollov, Rajesh K. Gupta, Steven Swanson Non-Volatile Systems
More informationDriveScale-DellEMC Reference Architecture
DriveScale-DellEMC Reference Architecture DellEMC/DRIVESCALE Introduction DriveScale has pioneered the concept of Software Composable Infrastructure that is designed to radically change the way data center
More informationCloud Computing & Visualization
Cloud Computing & Visualization Workflows Distributed Computation with Spark Data Warehousing with Redshift Visualization with Tableau #FIUSCIS School of Computing & Information Sciences, Florida International
More informationXPU A Programmable FPGA Accelerator for Diverse Workloads
XPU A Programmable FPGA Accelerator for Diverse Workloads Jian Ouyang, 1 (ouyangjian@baidu.com) Ephrem Wu, 2 Jing Wang, 1 Yupeng Li, 1 Hanlin Xie 1 1 Baidu, Inc. 2 Xilinx Outlines Background - FPGA for
More informationSimplify System Complexity
1 2 Simplify System Complexity With the new high-performance CompactRIO controller Arun Veeramani Senior Program Manager National Instruments NI CompactRIO The Worlds Only Software Designed Controller
More informationAn Architectural Framework for Accelerating Dynamic Parallel Algorithms on Reconfigurable Hardware
An Architectural Framework for Accelerating Dynamic Parallel Algorithms on Reconfigurable Hardware Tao Chen, Shreesha Srinath Christopher Batten, G. Edward Suh Computer Systems Laboratory School of Electrical
More informationHardware Accelerated Application Integration: Challenges and Opportunities. ACM/IFIP/USENIX Middleware 2017 Daniel Ritter
Hardware Accelerated Application Integration: Challenges and Opportunities Active @ ACM/IFIP/USENIX Middleware 2017 Daniel Ritter Application Integration EAI System, Integration Processes - De-coupling
More informationTOOLS FOR IMPROVING CROSS-PLATFORM SOFTWARE DEVELOPMENT
TOOLS FOR IMPROVING CROSS-PLATFORM SOFTWARE DEVELOPMENT Eric Kelmelis 28 March 2018 OVERVIEW BACKGROUND Evolution of processing hardware CROSS-PLATFORM KERNEL DEVELOPMENT Write once, target multiple hardware
More informationAN OPEN-SOURCE BENCHMARK SUITE FOR MICROSERVICES AND THEIR HARDWARE-SOFTWARE IMPLICATIONS FOR CLOUD AND EDGE SYSTEMS
AN OPEN-SOURCE BENCHMARK SUITE FOR MICROSERVICES AND THEIR HARDWARE-SOFTWARE IMPLICATIONS FOR CLOUD AND EDGE SYSTEMS Yu Gan, Yanqi Zhang, Dailun Cheng, Ankitha Shetty, Priyal Rathi, Nayantara Katarki,
More informationCenter Extreme Scale CS Research
Center Extreme Scale CS Research Center for Compressible Multiphase Turbulence University of Florida Sanjay Ranka Herman Lam Outline 10 6 10 7 10 8 10 9 cores Parallelization and UQ of Rocfun and CMT-Nek
More informationA U G U S T 8, S A N T A C L A R A, C A
A U G U S T 8, 2 0 1 8 S A N T A C L A R A, C A Data-Centric Innovation Summit LISA SPELMAN VICE PRESIDENT & GENERAL MANAGER INTEL XEON PRODUCTS AND DATA CENTER MARKETING Increased integration and optimization
More informationCover TBD. intel Quartus prime Design software
Cover TBD intel Quartus prime Design software Fastest Path to Your Design The Intel Quartus Prime software is revolutionary in performance and productivity for FPGA, CPLD, and SoC designs, providing a
More informationThe Use of Cloud Computing Resources in an HPC Environment
The Use of Cloud Computing Resources in an HPC Environment Bill, Labate, UCLA Office of Information Technology Prakashan Korambath, UCLA Institute for Digital Research & Education Cloud computing becomes
More informationMerging Enterprise Applications with Docker* Container Technology
Solution Brief NetApp Docker Volume Plugin* Intel Xeon Processors Intel Ethernet Converged Network Adapters Merging Enterprise Applications with Docker* Container Technology Enabling Scale-out Solutions
More informationApache Flink- A System for Batch and Realtime Stream Processing
Apache Flink- A System for Batch and Realtime Stream Processing Lecture Notes Winter semester 2016 / 2017 Ludwig-Maximilians-University Munich Prof Dr. Matthias Schubert 2016 Introduction to Apache Flink
More informationIBM Education Assistance for z/os V2R2
IBM Education Assistance for z/os V2R2 Item: RSM Scalability Element/Component: Real Storage Manager Material current as of May 2015 IBM Presentation Template Full Version Agenda Trademarks Presentation
More informationComputer Architecture A Quantitative Approach, Fifth Edition. Chapter 1. Copyright 2012, Elsevier Inc. All rights reserved. Computer Technology
Computer Architecture A Quantitative Approach, Fifth Edition Chapter 1 Fundamentals of Quantitative Design and Analysis 1 Computer Technology Performance improvements: Improvements in semiconductor technology
More informationCS 470 Spring Virtualization and Cloud Computing. Mike Lam, Professor. Content taken from the following:
CS 470 Spring 2018 Mike Lam, Professor Virtualization and Cloud Computing Content taken from the following: A. Silberschatz, P. B. Galvin, and G. Gagne. Operating System Concepts, 9 th Edition (Chapter
More informationAdapted from: TRENDS AND ATTRIBUTES OF HORIZONTAL AND VERTICAL COMPUTING ARCHITECTURES
Adapted from: TRENDS AND ATTRIBUTES OF HORIZONTAL AND VERTICAL COMPUTING ARCHITECTURES Tom Atwood Business Development Manager Sun Microsystems, Inc. Takeaways Understand the technical differences between
More informationIntegrate MATLAB Analytics into Enterprise Applications
Integrate Analytics into Enterprise Applications Dr. Roland Michaely 2015 The MathWorks, Inc. 1 Data Analytics Workflow Access and Explore Data Preprocess Data Develop Predictive Models Integrate Analytics
More informationWhat s New in VMware vsphere 4.1 Performance. VMware vsphere 4.1
What s New in VMware vsphere 4.1 Performance VMware vsphere 4.1 T E C H N I C A L W H I T E P A P E R Table of Contents Scalability enhancements....................................................................
More informationFPGA design with National Instuments
FPGA design with National Instuments Rémi DA SILVA Systems Engineer - Embedded and Data Acquisition Systems - MED Region ni.com The NI Approach to Flexible Hardware Processor Real-time OS Application software
More informationNeural Network based Energy-Efficient Fault Tolerant Architect
Neural Network based Energy-Efficient Fault Tolerant Architectures and Accelerators University of Rochester February 7, 2013 References Flexible Error Protection for Energy Efficient Reliable Architectures
More informationData Processing at the Speed of 100 Gbps using Apache Crail. Patrick Stuedi IBM Research
Data Processing at the Speed of 100 Gbps using Apache Crail Patrick Stuedi IBM Research The CRAIL Project: Overview Data Processing Framework (e.g., Spark, TensorFlow, λ Compute) Spark-IO FS Albis Streaming
More informationIBM Data Science Experience White paper. SparkR. Transforming R into a tool for big data analytics
IBM Data Science Experience White paper R Transforming R into a tool for big data analytics 2 R Executive summary This white paper introduces R, a package for the R statistical programming language that
More informationEnd-to-End Java Security Performance Enhancements for Oracle SPARC Servers Performance engineering for a revenue product
End-to-End Java Security Performance Enhancements for Oracle SPARC Servers Performance engineering for a revenue product Luyang Wang, Pallab Bhattacharya, Yao-Min Chen, Shrinivas Joshi and James Cheng
More informationCPU-GPU Heterogeneous Computing
CPU-GPU Heterogeneous Computing Advanced Seminar "Computer Engineering Winter-Term 2015/16 Steffen Lammel 1 Content Introduction Motivation Characteristics of CPUs and GPUs Heterogeneous Computing Systems
More informationTop five Docker performance tips
Top five Docker performance tips Top five Docker performance tips Table of Contents Introduction... 3 Tip 1: Design design applications as microservices... 5 Tip 2: Deployment deploy Docker components
More informationOperating System Support for Shared-ISA Asymmetric Multi-core Architectures
Operating System Support for Shared-ISA Asymmetric Multi-core Architectures Tong Li, Paul Brett, Barbara Hohlt, Rob Knauerhase, Sean McElderry, Scott Hahn Intel Corporation Contact: tong.n.li@intel.com
More informationScaling up MATLAB Analytics Marta Wilczkowiak, PhD Senior Applications Engineer MathWorks
Scaling up MATLAB Analytics Marta Wilczkowiak, PhD Senior Applications Engineer MathWorks 2013 The MathWorks, Inc. 1 Agenda Giving access to your analytics to more users Handling larger problems 2 When
More informationPractical Near-Data Processing for In-Memory Analytics Frameworks
Practical Near-Data Processing for In-Memory Analytics Frameworks Mingyu Gao, Grant Ayers, Christos Kozyrakis Stanford University http://mast.stanford.edu PACT Oct 19, 2015 Motivating Trends End of Dennard
More informationReactive Microservices Architecture on AWS
Reactive Microservices Architecture on AWS Sascha Möllering Solutions Architect, @sascha242, Amazon Web Services Germany GmbH Why are we here today? https://secure.flickr.com/photos/mgifford/4525333972
More informationCopyright 2012, Elsevier Inc. All rights reserved.
Computer Architecture A Quantitative Approach, Fifth Edition Chapter 1 Fundamentals of Quantitative Design and Analysis 1 Computer Technology Performance improvements: Improvements in semiconductor technology
More informationHedvig as backup target for Veeam
Hedvig as backup target for Veeam Solution Whitepaper Version 1.0 April 2018 Table of contents Executive overview... 3 Introduction... 3 Solution components... 4 Hedvig... 4 Hedvig Virtual Disk (vdisk)...
More informationElasticFlow: A Complexity-Effective Approach for Pipelining Irregular Loop Nests
ElasticFlow: A Complexity-Effective Approach for Pipelining Irregular Loop Nests Mingxing Tan 1 2, Gai Liu 1, Ritchie Zhao 1, Steve Dai 1, Zhiru Zhang 1 1 Computer Systems Laboratory, Electrical and Computer
More informationChallenges for GPU Architecture. Michael Doggett Graphics Architecture Group April 2, 2008
Michael Doggett Graphics Architecture Group April 2, 2008 Graphics Processing Unit Architecture CPUs vsgpus AMD s ATI RADEON 2900 Programming Brook+, CAL, ShaderAnalyzer Architecture Challenges Accelerated
More informationFPGA architecture and design technology
CE 435 Embedded Systems Spring 2017 FPGA architecture and design technology Nikos Bellas Computer and Communications Engineering Department University of Thessaly 1 FPGA fabric A generic island-style FPGA
More informationMicroservices without the Servers: AWS Lambda in Action
Microservices without the Servers: AWS Lambda in Action Dr. Tim Wagner, General Manager AWS Lambda August 19, 2015 Seattle, WA 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved Two
More informationDeploying and Operating Cloud Native.NET apps
Deploying and Operating Cloud Native.NET apps Jenny McLaughlin, Sr. Platform Architect Cornelius Mendoza, Sr. Platform Architect Pivotal Cloud Native Practices Continuous Delivery DevOps Microservices
More informationFPGA-based Supercomputing: New Opportunities and Challenges
FPGA-based Supercomputing: New Opportunities and Challenges Naoya Maruyama (RIKEN AICS)* 5 th ADAC Workshop Feb 15, 2018 * Current Main affiliation is Lawrence Livermore National Laboratory SIAM PP18:
More informationEECS4201 Computer Architecture
Computer Architecture A Quantitative Approach, Fifth Edition Chapter 1 Fundamentals of Quantitative Design and Analysis These slides are based on the slides provided by the publisher. The slides will be
More informationMapReduce for Data Intensive Scientific Analyses
apreduce for Data Intensive Scientific Analyses Jaliya Ekanayake Shrideep Pallickara Geoffrey Fox Department of Computer Science Indiana University Bloomington, IN, 47405 5/11/2009 Jaliya Ekanayake 1 Presentation
More informationDeduplication File System & Course Review
Deduplication File System & Course Review Kai Li 12/13/13 Topics u Deduplication File System u Review 12/13/13 2 Storage Tiers of A Tradi/onal Data Center $$$$ Mirrored storage $$$ Dedicated Fibre Clients
More informationHigh Performance Computing on GPUs using NVIDIA CUDA
High Performance Computing on GPUs using NVIDIA CUDA Slides include some material from GPGPU tutorial at SIGGRAPH2007: http://www.gpgpu.org/s2007 1 Outline Motivation Stream programming Simplified HW and
More informationASSEMBLY LANGUAGE MACHINE ORGANIZATION
ASSEMBLY LANGUAGE MACHINE ORGANIZATION CHAPTER 3 1 Sub-topics The topic will cover: Microprocessor architecture CPU processing methods Pipelining Superscalar RISC Multiprocessing Instruction Cycle Instruction
More informationECE 636. Reconfigurable Computing. Lecture 2. Field Programmable Gate Arrays I
ECE 636 Reconfigurable Computing Lecture 2 Field Programmable Gate Arrays I Overview Anti-fuse and EEPROM-based devices Contemporary SRAM devices - Wiring - Embedded New trends - Single-driver wiring -
More informationEvolution of Virtual Machine Technologies for Portability and Application Capture. Bob Vandette Java Hotspot VM Engineering Sept 2004
Evolution of Virtual Machine Technologies for Portability and Application Capture Bob Vandette Java Hotspot VM Engineering Sept 2004 Topics Virtual Machine Evolution Timeline & Products Trends forcing
More informationMATE-EC2: A Middleware for Processing Data with Amazon Web Services
MATE-EC2: A Middleware for Processing Data with Amazon Web Services Tekin Bicer David Chiu* and Gagan Agrawal Department of Compute Science and Engineering Ohio State University * School of Engineering
More informationBuilding a Data-Friendly Platform for a Data- Driven Future
Building a Data-Friendly Platform for a Data- Driven Future Benjamin Hindman - @benh 2016 Mesosphere, Inc. All Rights Reserved. INTRO $ whoami BENJAMIN HINDMAN Co-founder and Chief Architect of Mesosphere,
More informationCover TBD. intel Quartus prime Design software
Cover TBD intel Quartus prime Design software Fastest Path to Your Design The Intel Quartus Prime software is revolutionary in performance and productivity for FPGA, CPLD, and SoC designs, providing a
More informationMellanox InfiniBand Solutions Accelerate Oracle s Data Center and Cloud Solutions
Mellanox InfiniBand Solutions Accelerate Oracle s Data Center and Cloud Solutions Providing Superior Server and Storage Performance, Efficiency and Return on Investment As Announced and Demonstrated at
More informationWORKLOAD CHARACTERIZATION OF INTERACTIVE CLOUD SERVICES BIG AND SMALL SERVER PLATFORMS
WORKLOAD CHARACTERIZATION OF INTERACTIVE CLOUD SERVICES ON BIG AND SMALL SERVER PLATFORMS Shuang Chen*, Shay Galon**, Christina Delimitrou*, Srilatha Manne**, and José Martínez* *Cornell University **Cavium
More informationECE 331 Digital System Design
ECE 331 Digital System Design Tristate Buffers, Read-Only Memories and Programmable Logic Devices (Lecture #17) The slides included herein were taken from the materials accompanying Fundamentals of Logic
More informationTen things hyperconvergence can do for you
Ten things hyperconvergence can do for you Francis O Haire Director, Technology & Strategy DataSolutions Evolution of Enterprise Infrastructure 1990s Today Virtualization Server Server Server Server Scale-Out
More informationData Path acceleration techniques in a NFV world
Data Path acceleration techniques in a NFV world Mohanraj Venkatachalam, Purnendu Ghosh Abstract NFV is a revolutionary approach offering greater flexibility and scalability in the deployment of virtual
More informationUsing MySQL in a Virtualized Environment. Scott Seighman Systems Engineer Sun Microsystems
Using MySQL in a Virtualized Environment Scott Seighman Systems Engineer Sun Microsystems 1 Agenda Virtualization Overview > Why Use Virtualization > Options > Considerations MySQL & Virtualization Best
More informationHadoop 2.x Core: YARN, Tez, and Spark. Hortonworks Inc All Rights Reserved
Hadoop 2.x Core: YARN, Tez, and Spark YARN Hadoop Machine Types top-of-rack switches core switch client machines have client-side software used to access a cluster to process data master nodes run Hadoop
More informationIBM s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM s sole discretion.
Please note Copyright 2018 by International Business Machines Corporation (IBM). No part of this document may be reproduced or transmitted in any form without written permission from IBM IBM s statements
More informationBERLIN. 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved
BERLIN 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved Amazon Aurora: Amazon s New Relational Database Engine Carlos Conde Technology Evangelist @caarlco 2015, Amazon Web Services,
More informationBig Data Systems on Future Hardware. Bingsheng He NUS Computing
Big Data Systems on Future Hardware Bingsheng He NUS Computing http://www.comp.nus.edu.sg/~hebs/ 1 Outline Challenges for Big Data Systems Why Hardware Matters? Open Challenges Summary 2 3 ANYs in Big
More informationDNNBuilder: an Automated Tool for Building High-Performance DNN Hardware Accelerators for FPGAs
IBM Research AI Systems Day DNNBuilder: an Automated Tool for Building High-Performance DNN Hardware Accelerators for FPGAs Xiaofan Zhang 1, Junsong Wang 2, Chao Zhu 2, Yonghua Lin 2, Jinjun Xiong 3, Wen-mei
More informationPerformance Benefits of Running RocksDB on Samsung NVMe SSDs
Performance Benefits of Running RocksDB on Samsung NVMe SSDs A Detailed Analysis 25 Samsung Semiconductor Inc. Executive Summary The industry has been experiencing an exponential data explosion over the
More informationCloud Programming. Programming Environment Oct 29, 2015 Osamu Tatebe
Cloud Programming Programming Environment Oct 29, 2015 Osamu Tatebe Cloud Computing Only required amount of CPU and storage can be used anytime from anywhere via network Availability, throughput, reliability
More informationDatabase Acceleration Solution Using FPGAs and Integrated Flash Storage
Database Acceleration Solution Using FPGAs and Integrated Flash Storage HK Verma, Xilinx Inc. August 2017 1 FPGA Analytics in Flash Storage System In-memory or Flash storage based DB reduce disk access
More informationModern hyperconverged infrastructure. Karel Rudišar Systems Engineer, Vmware Inc.
Modern hyperconverged infrastructure Karel Rudišar Systems Engineer, Vmware Inc. 2 What Is Hyper-Converged Infrastructure? - The Ideal Architecture for SDDC Management SDDC Compute Networking Storage Simplicity
More information