Labeled RISC-V: A New Perspective on Software-Defined Architecture


1 Labeled RISC-V: A New Perspective on Software-Defined Architecture
Zihao Yu, Jiuyue Ma, Bowen Huang, Xin Jin, Huizhe Wang, Yaoyang Zhou, Zihao Chang, Yan Cao, Sa Wang, Yungang Bao
May 9th, Shanghai
Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS)

2 Agenda
- Software-Defined Architecture: Why?
- How?
- What Effect?

3 Tradeoff in Data Center: Util. vs. QoS
- Low utilization [1,2]: 6%~12%
- Sharing causes interference
- Sacrifice utilization to guarantee QoS
References [1,2]: Lo et al., Heracles: Improving Resource Efficiency at Scale, ISCA; J. M. Kaplan, W. Forrest, and N. Kindler, Revolutionizing data center energy efficiency, McKinsey & Company

4 More Hardware Support Needed

5 QoS-unaware Hardware
- Shared resources
- Unable to distinguish
- Unmanaged sharing
- No architectural support for QoS
Ambulance?

6 Agenda
- Software-Defined Architecture: Why?
- How?
- What Effect?

7 Labeled Networking
- Fine-grained: every packet has a label
- Semantic association: correlate labels with users' demands
- Propagation: propagate labels through the whole network
- DiffServ: process packets differently based on labels
MPLS is widely used for VPN and QoS

8 The Computer as a Network Hardware components communicate via internal packets, e.g., PCIe packets, NoC packets, QPI packets Yes!

9 Labeled von Neumann Architecture (LvNA)
1. Fine-grained object
2. Semantic association
3. Propagation
4. Software-defined control logic
[Figure: von Neumann machine (input, processor with cores running P0, P1, P3, memory, output) annotated with the four points above]
Bao and Wang, Labeled von Neumann Architecture for Software-Defined Cloud, Journal of Computer Science and Technology, 2017, Vol. 32 (2)

10 Add label registers to request sources
[Figure: VM0, VM1, ..., VMn on cores with added label registers; shared last-level cache; memory controller; I/O chipset; disks; NIC]

11 Semantic association
[Figure: VM0, VM1, ..., VMn on cores; shared last-level cache; memory controller; I/O chipset; disks; NIC]

12 Propagation
[Figure: the same system, showing CPU requests and DMA passing through the shared last-level cache, memory controller, and I/O chipset]

13 Software-defined label-based control logic
[Figure: the same system, with a control logic (CL) attached to the shared last-level cache, the memory controller, and the I/O chipset]

14 Programmable Architecture for Resourcing-on-Demand (PARD)
- Leverage LvNA to perform fine-grained control
Ma et al., Supporting Differentiated Services in Computers via Programmable Architecture for Resourcing-on-Demand (PARD), ASPLOS

15 Tables as CL
[Figure: control-logic tables in the cache controller and the memory controller]

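16 Closed-loop Control
- Statistics table: one row per label (1, 2, ...) with columns Stat1, Stat2
- Parameter table: one row per label (1, 2, ...) with columns Param1, Param2
- Trigger table: rows of (label, condition, action), e.g. (1, Cond-1, Action-1), (1, Cond-2, Action-2), (2, Cond-3, Action-3)
- A condition has the form: stat op threshold, with op one of >, <, =, e.g. miss_rate > 30%
- Trigger => Action: a fired trigger raises an action signal and the firmware runs an action script, e.g. adjust way mask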
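The closed loop on this slide can be sketched in Python. This is only an illustrative model under stated assumptions, not PARD's firmware: the Trigger class, the evaluate function, the miss_rate and way_mask names, and the 16-way mask width are hypothetical choices made for the example.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List

# The three comparison operators named on the slide.
OPS = {">": operator.gt, "<": operator.lt, "=": operator.eq}


@dataclass
class Trigger:
    dsid: int                                   # label the rule applies to
    stat: str                                   # statistics-table column (hypothetical name)
    op: str                                     # one of ">", "<", "="
    threshold: float
    action: Callable[[Dict[str, int]], None]    # stands in for the firmware action script


def evaluate(triggers: List[Trigger],
             statistics: Dict[int, Dict[str, float]],
             parameters: Dict[int, Dict[str, int]]) -> List[int]:
    """Check every trigger against the statistics table; run the action of each one that fires."""
    fired = []
    for index, trig in enumerate(triggers):
        value = statistics[trig.dsid][trig.stat]
        if OPS[trig.op](value, trig.threshold):
            trig.action(parameters[trig.dsid])  # the action rewrites the parameter-table row
            fired.append(index)
    return fired


def grow_way_mask(params: Dict[str, int]) -> None:
    """Hypothetical action: grant one more cache way by setting the lowest clear bit."""
    mask = params["way_mask"]
    params["way_mask"] = (mask | (mask + 1)) & 0xFFFF  # assumes a 16-way cache


statistics = {1: {"miss_rate": 0.42}, 2: {"miss_rate": 0.10}}
parameters = {1: {"way_mask": 0x000F}, 2: {"way_mask": 0x00F0}}
triggers = [
    Trigger(1, "miss_rate", ">", 0.30, grow_way_mask),
    Trigger(2, "miss_rate", ">", 0.30, grow_way_mask),
]
fired = evaluate(triggers, statistics, parameters)
```

Only the first rule fires here, because label 1 has a miss rate above the 30% threshold from the slide's example, so its way mask grows while the row for label 2 is left untouched.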

17 Platform Resource Manager (PRM)
- Augmented IPMI
- Connects all control logics
- Runs Linux-based firmware
- Abstracts CLs as files
/sys/cpa
  cpa0
    ident
    type
    ldoms
      ldom0
        parameter
          param1
          param2
        statistics
        trigger
      ldom1
      ldom2
  cpa1
  cpa2
[Figure: a centralized PRM connected to the CLs of the cores, shared last-level cache, memory controller, I/O chipset, disks, and NIC, used for monitoring & interrupts and for programming]

18 Access Control Logics
- Query control logic info
    cat /sys/cpa/cpa0/ident
    cat /sys/cpa/cpa0/type
- Query parameters
    cat /sys/cpa/cpa0/.../parameter/param1
- Set parameters
    echo 10 > /sys/cpa/cpa0/.../parameter/param2
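The file-style access shown on this slide can be mimicked from Python. The sketch below is hedged: it runs against a mock directory tree created in a temporary folder, not against a real PRM, and the helper names (read_attr, write_attr, make_mock_tree), the file contents, and the ldoms/ldom0 nesting of the parameter files are assumptions based on the tree drawn on the slides.

```python
import tempfile
from pathlib import Path


def read_attr(root: Path, rel: str) -> str:
    """Equivalent of `cat <root>/<rel>`: return the file content without the trailing newline."""
    return (root / rel).read_text().strip()


def write_attr(root: Path, rel: str, value) -> None:
    """Equivalent of `echo <value> > <root>/<rel>`."""
    (root / rel).write_text(f"{value}\n")


def make_mock_tree(root: Path) -> None:
    """Build a stand-in for /sys/cpa; the layout and the contents are assumed for illustration."""
    param_dir = root / "cpa0" / "ldoms" / "ldom0" / "parameter"
    param_dir.mkdir(parents=True)
    (root / "cpa0" / "ident").write_text("mock-cache-cl\n")
    (root / "cpa0" / "type").write_text("C\n")
    (param_dir / "param1").write_text("0\n")
    (param_dir / "param2").write_text("0\n")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)                      # plays the role of /sys/cpa
    make_mock_tree(root)
    ident = read_attr(root, "cpa0/ident")                       # query control logic info
    write_attr(root, "cpa0/ldoms/ldom0/parameter/param2", 10)   # set a parameter
    param2 = read_attr(root, "cpa0/ldoms/ldom0/parameter/param2")
```

Because the control logics are abstracted as files, the same two helpers cover querying and programming any of them; nothing beyond ordinary file I/O is required on the firmware side.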

19 Agenda
- Software-Defined Architecture: Why?
- How?
- What Effect?

20 Implementation
- Full-system cycle-accurate simulator (open sourced)
- FPGA prototype on Xilinx VC709 evaluation board
  - MicroBlaze version (deprecated)
  - RISC-V version (open sourced)

21 [Figure: FPGA prototype on the Xilinx VC709 evaluation board. LDom #1 to LDom #4 run on Core0 to Core3. Components: core control logic; cache control logic (cache / XBar); memory control logic (memory controller, MIG7); I/O CL (Eth0, Eth1, Eth2, UART*4); CN switch; PRM. Buses: AXI4 memory bus, AXI4 I/O bus, labeled interrupts, CN bus (I2C-based). External: SFP+, UART, and a PC providing dhcp, tftp, httpd]

22 Evaluation - Performance Isolation
- 4 LDoms: 1 x 429.mcf + 3 x Attacker
- Allocate different LLC capacities
- Perf. degradation: 7% (w/ PARD) vs. 48% (w/o PARD)
[Figure: results for three configurations: solo, attacker, attacker + T->A]

23-28 LvNA + RISC-V = Labeled RISC-V
- Switching to RISC-V
  - Adding labels in the core is easier
  - riscv-go to run containers
- Goal: establish a Labeled RISC-V branch
[Figure, built up over slides 23 to 28: CLs added to the design; labels carried as TL2.dsid <-> axi.user; a CL holding an address mapping table (columns Base and Len, with values such as 0x0000 and 0x8000) and a labeled token bucket; the PRM]
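One plausible reading of the address mapping table (Base, Len) is a per-label base-and-bound translation performed by the control logic. The Python sketch below models that reading only. The Mapping and AddressMapper names, the translate method, and the table rows are hypothetical; in particular the example rows are illustrative values and not a reconstruction of the table on the slide.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Mapping:
    base: int    # where the label's region starts in machine memory
    length: int  # size of the region in bytes


class AddressMapper:
    """Base-and-bound translation keyed by dsid (an assumed model of the CL table)."""

    def __init__(self, table: Dict[int, Mapping]):
        self.table = dict(table)

    def translate(self, dsid: int, addr: int) -> int:
        mapping = self.table.get(dsid)
        if mapping is None:
            raise PermissionError(f"no mapping for dsid {dsid}")
        if not 0 <= addr < mapping.length:
            raise PermissionError(f"address {addr:#x} outside region of dsid {dsid}")
        return mapping.base + addr


# Illustrative rows: two labels, each owning a 0x8000-byte region.
mapper = AddressMapper({
    1: Mapping(base=0x0000, length=0x8000),
    2: Mapping(base=0x8000, length=0x8000),
})
first = mapper.translate(1, 0x0010)
second = mapper.translate(2, 0x0010)
```

Under this model two labels can issue the same address and still reach disjoint memory, which is the kind of address-space isolation the NoHype demo relies on.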

29 Overheads
- 16 lines of Chisel code to add labels into RocketChip
  - Add a dsid member to the TileLink2 Bundle
  - Attach labels at the TileLink2 masters of core tiles
- Control logics
  - < 5% resource overhead for CLs
  - Much less with complex cores, e.g. BOOM

30 Demo 1 - NoHype
- Push the software hypervisor down to LvNA
- Isolate the resources (address space, devices) by CLs
- The machine is partitioned into several sub-machines
[Figure: label-based CLs]

31 Demo 1 - NoHype
[Figure: LDom0 and LDom1]

32 Demo 2 - Memory Bandwidth Control
- Use labeled token buckets to isolate the bandwidth attacker
[Figure: results for three cases: solo, interfered, isolated]
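A labeled token bucket can be sketched as one bucket per dsid, refilled at a configured rate, with a memory request allowed only while its label still holds tokens. The Python model below is a simplified software sketch, not the hardware design: the LabeledTokenBucket class, the tick-based timing, and the rates and capacities are assumptions chosen for the example.

```python
from typing import Dict, Tuple


class LabeledTokenBucket:
    """One token bucket per label; config maps dsid -> (tokens added per tick, capacity)."""

    def __init__(self, config: Dict[int, Tuple[int, int]]):
        self.config = dict(config)
        # Every bucket starts full.
        self.tokens = {dsid: cap for dsid, (_, cap) in self.config.items()}

    def tick(self) -> None:
        """Refill every bucket by its rate, saturating at its capacity."""
        for dsid, (rate, cap) in self.config.items():
            self.tokens[dsid] = min(cap, self.tokens[dsid] + rate)

    def try_issue(self, dsid: int, cost: int = 1) -> bool:
        """Let a request through only if its label has enough tokens left."""
        if self.tokens.get(dsid, 0) >= cost:
            self.tokens[dsid] -= cost
            return True
        return False


def simulate(bucket: LabeledTokenBucket, demand: Dict[int, int], ticks: int) -> Dict[int, int]:
    """Each label attempts `demand[dsid]` requests per tick; count how many are granted."""
    granted = {dsid: 0 for dsid in demand}
    for _ in range(ticks):
        bucket.tick()
        for dsid, attempts in demand.items():
            for _ in range(attempts):
                if bucket.try_issue(dsid):
                    granted[dsid] += 1
    return granted


# Label 1 (victim) may use 4 requests per tick; label 2 (attacker) is throttled to 1.
bucket = LabeledTokenBucket({1: (4, 4), 2: (1, 4)})
granted = simulate(bucket, {1: 4, 2: 4}, ticks=10)
```

With these settings the victim gets all 40 of its requests over 10 ticks, while the attacker gets its initial burst of 4 and then one request per tick, 13 in total, however many it attempts.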

33 Hardware - LvNA
- Labels everywhere in the hardware
- Plan to tape out with 40nm TSMC!
[Figure: four BOOM + SMT cores, ROB, L2, MEM, and PCI-E, each with a CL, plus a PRM (Rocket)]
Stack: Application / Schedule Framework / Compiler / Runtime Library / Operating System / Hypervisor / Hardware

34 Hypervisor - NoHype
[Status tag: Finished]
- Push the software hypervisor down to LvNA
- Remove run-time overhead: VMentry, VMexit
[Figure: VM 1 to VM n on a software VMM, compared with VMs isolated by NoHype on LvNA]
Stack: Application / Schedule Framework / Compiler / Runtime Library / Operating System / Hypervisor / Hardware

35 Operating System - Fine-grained labeling
- Add fine-grained labels as a context resource
  - Process (integrated into Cgroup): process/container-level, thread-level
  - Address space: function-level, object-level
- Provide libraries
  - pthread_create_with_dsid()
  - malloc_with_dsid()
[Figure: process-relative labeling (Finished): P0 and P1 in VM0 on a core]
[Figure: address space labeling]
  dsid  start   end
  1     0x8000  0xffff
  3     0x2000  0x27ff
Stack: Application / Schedule Framework / Compiler / Runtime Library / Operating System / Hypervisor / Hardware
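The address space labeling table on this slide maps address ranges to labels. A minimal Python sketch of the lookup follows. The two rows are the ones on the slide; the label_for function, the assumption that end addresses are inclusive, and the fallback to a default process-level label for unlisted addresses are guesses made for illustration.

```python
from typing import List, Tuple

# (start, end, dsid) rows taken from the slide; ends are assumed to be inclusive.
RANGES: List[Tuple[int, int, int]] = [
    (0x8000, 0xFFFF, 1),
    (0x2000, 0x27FF, 3),
]


def label_for(addr: int, ranges: List[Tuple[int, int, int]], default: int) -> int:
    """Return the dsid whose range contains addr, or the default label if none matches."""
    for start, end, dsid in ranges:
        if start <= addr <= end:
            return dsid
    return default


PROCESS_DSID = 0  # hypothetical label inherited from the process
inside_first = label_for(0x9000, RANGES, PROCESS_DSID)
inside_second = label_for(0x2400, RANGES, PROCESS_DSID)
outside = label_for(0x1000, RANGES, PROCESS_DSID)
```

A linear scan is enough for a table with a handful of rows; a hardware control logic would presumably compare all rows in parallel instead.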

36 Compiler - collect QoS info. from programs
- Express QoS info. in source files
- Additional compilation results
  - Address space labeling info: extra ELF sections for the loader
  - Resource requirements: QoS desc. file for the schedule framework
[Figure: the source "#pragma qos(10s) sort();" is compiled into a binary (call sort, with the labeling table dsid/start/end: 1 0x8000 0xffff; 3 0x2000 0x27ff) and a QoS desc. (SLA = 10s, working set = 64KB)]
Stack: Application / Schedule Framework / Compiler / Runtime Library / Operating System / Hypervisor / Hardware

37 Sche. Framework - QoS resource schedule
- Expose QoS resources to schedule frameworks
- Integrate QoS resources into OpenStack
[Status tag: Finished]
Stack: Application / Schedule Framework / Compiler / Runtime Library / Operating System / Hypervisor / Hardware

38 A lot to explore!
Legend: Finished / On-going / Have ideas / Future work
- Theory: How does LvNA impact the RAM, PRAM, and LogP models?
- Hardware/Arch: How to implement LvNA in the CPU pipeline/SMT, memory, storage, and networking? How to correlate LvNA and SDN by labels?
- OS/Hypervisor: How to correlate labels with VMs, containers, processes, and threads? How to abstract programming interfaces for labels?
- Programming Model and Compilers: How to express users' requirements and propagate them to the hardware via labels? How to make compilers support labels?
- Distributed systems: How to correlate labels with distributed resources? How to manage distributed systems with label mechanisms?
- Measurement/Audit: How to leverage labels to gauge and audit resource usage?

39 Summary
- LvNA: a model of software-defined architecture
- PARD: a proof of concept of LvNA
- Labeled RISC-V: an implementation of LvNA

40 Thanks
Scan the QR code to join the discussion group on WeChat

Labeled RISC-V: A New Perspective on Software-Defined Architecture

Labeled RISC-V: A New Perspective on Software-Defined Architecture Labeled RISC-V: A New Perspective on Software-Defined Architecture Zihao Yu, Bowen Huang, Jiuyue Ma, Ninghui Sun, Yungang Bao Oct 14 th, 2017 @ Boston Institute of Computing Technology (ICT), Chinese Academy

More information

The Case for Labeled von Neumann Architecture ( LvNA )

The Case for Labeled von Neumann Architecture ( LvNA ) The Case for Labeled von Neumann Architecture ( LvNA ) Yungang Bao, Zihao Yu, Dejun Jiang ICT, CAS 2018-6-3 Tutorial Organizers Yungang Bao Professor @ ICT Director @ ACS Dejun Jiang Research Center for

More information

Labeled RISC-V Demos

Labeled RISC-V Demos Labeled RISC-V Demos Zihao Yu, Yungang Bao June 3 rd, 2018 @ Los Angeles Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS) 1 We have provided a server for you! Please prepare your

More information

OpenPrefetch. (in-progress)

OpenPrefetch. (in-progress) OpenPrefetch Let There Be Industry-Competitive Prefetching in RISC-V Processors (in-progress) Bowen Huang, Zihao Yu, Zhigang Liu, Chuanqi Zhang, Sa Wang, Yungang Bao Institute of Computing Technology(ICT),

More information

SOFT CONTAINER TOWARDS 100% RESOURCE UTILIZATION ACCELA ZHAO, LAYNE PENG

SOFT CONTAINER TOWARDS 100% RESOURCE UTILIZATION ACCELA ZHAO, LAYNE PENG SOFT CONTAINER TOWARDS 100% RESOURCE UTILIZATION ACCELA ZHAO, LAYNE PENG 1 WHO ARE THOSE GUYS Accela Zhao, Technologist at EMC OCTO, active Openstack community contributor, experienced in cloud scheduling

More information

Knut Omang Ifi/Oracle 20 Oct, Introduction to virtualization (Virtual machines) Aspects of network virtualization:

Knut Omang Ifi/Oracle 20 Oct, Introduction to virtualization (Virtual machines) Aspects of network virtualization: Software and hardware support for Network Virtualization part 2 Knut Omang Ifi/Oracle 20 Oct, 2015 32 Overview Introduction to virtualization (Virtual machines) Aspects of network virtualization: Virtual

More information

PROTECTING VM REGISTER STATE WITH AMD SEV-ES DAVID KAPLAN LSS 2017

PROTECTING VM REGISTER STATE WITH AMD SEV-ES DAVID KAPLAN LSS 2017 PROTECTING VM REGISTER STATE WITH AMD SEV-ES DAVID KAPLAN LSS 2017 BACKGROUND-- HARDWARE MEMORY ENCRYPTION AMD Secure Memory Encryption (SME) / AMD Secure Encrypted Virtualization (SEV) Hardware AES engine

More information

SOFTWARE-DEFINED MEMORY HIERARCHIES: SCALABILITY AND QOS IN THOUSAND-CORE SYSTEMS

SOFTWARE-DEFINED MEMORY HIERARCHIES: SCALABILITY AND QOS IN THOUSAND-CORE SYSTEMS SOFTWARE-DEFINED MEMORY HIERARCHIES: SCALABILITY AND QOS IN THOUSAND-CORE SYSTEMS DANIEL SANCHEZ MIT CSAIL IAP MEETING MAY 21, 2013 Research Agenda Lack of technology progress Moore s Law still alive Power

More information

Xen and the Art of Virtualization. CSE-291 (Cloud Computing) Fall 2016

Xen and the Art of Virtualization. CSE-291 (Cloud Computing) Fall 2016 Xen and the Art of Virtualization CSE-291 (Cloud Computing) Fall 2016 Why Virtualization? Share resources among many uses Allow heterogeneity in environments Allow differences in host and guest Provide

More information

Moneta: A High-performance Storage Array Architecture for Nextgeneration, Micro 2010

Moneta: A High-performance Storage Array Architecture for Nextgeneration, Micro 2010 Moneta: A High-performance Storage Array Architecture for Nextgeneration, Non-volatile Memories Micro 2010 NVM-based SSD NVMs are replacing spinning-disks Performance of disks has lagged NAND flash showed

More information

RISC-V based core as a soft processor in FPGAs Chowdhary Musunuri Sr. Director, Solutions & Applications Microsemi

RISC-V based core as a soft processor in FPGAs Chowdhary Musunuri Sr. Director, Solutions & Applications Microsemi Power Matters. TM RISC-V based core as a soft processor in FPGAs Chowdhary Musunuri Sr. Director, Solutions & Applications Microsemi chowdhary.musunuri@microsemi.com RIC217 1 Agenda A brief introduction

More information

An FPGA-Based Optical IOH Architecture for Embedded System

An FPGA-Based Optical IOH Architecture for Embedded System An FPGA-Based Optical IOH Architecture for Embedded System Saravana.S Assistant Professor, Bharath University, Chennai 600073, India Abstract Data traffic has tremendously increased and is still increasing

More information

I/O virtualization. Jiang, Yunhong Yang, Xiaowei Software and Service Group 2009 虚拟化技术全国高校师资研讨班

I/O virtualization. Jiang, Yunhong Yang, Xiaowei Software and Service Group 2009 虚拟化技术全国高校师资研讨班 I/O virtualization Jiang, Yunhong Yang, Xiaowei 1 Legal Disclaimer INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE,

More information

Programmable NICs. Lecture 14, Computer Networks (198:552)

Programmable NICs. Lecture 14, Computer Networks (198:552) Programmable NICs Lecture 14, Computer Networks (198:552) Network Interface Cards (NICs) The physical interface between a machine and the wire Life of a transmitted packet Userspace application NIC Transport

More information

CS-580K/480K Advanced Topics in Cloud Computing. VM Virtualization II

CS-580K/480K Advanced Topics in Cloud Computing. VM Virtualization II CS-580K/480K Advanced Topics in Cloud Computing VM Virtualization II 1 How to Build a Virtual Machine? 2 How to Run a Program Compiling Source Program Loading Instruction Instruction Instruction Instruction

More information

Providing Multi-tenant Services with FPGAs: Case Study on a Key-Value Store

Providing Multi-tenant Services with FPGAs: Case Study on a Key-Value Store Zsolt István *, Gustavo Alonso, Ankit Singla Systems Group, Computer Science Dept., ETH Zürich * Now at IMDEA Software Institute, Madrid Providing Multi-tenant Services with FPGAs: Case Study on a Key-Value

More information

CPU Project in Western Digital: From Embedded Cores for Flash Controllers to Vision of Datacenter Processors with Open Interfaces

CPU Project in Western Digital: From Embedded Cores for Flash Controllers to Vision of Datacenter Processors with Open Interfaces CPU Project in Western Digital: From Embedded Cores for Flash Controllers to Vision of Datacenter Processors with Open Interfaces Zvonimir Z. Bandic, Sr. Director Robert Golla, Sr. Fellow Dejan Vucinic,

More information

Software Development Using Full System Simulation with Freescale QorIQ Communications Processors

Software Development Using Full System Simulation with Freescale QorIQ Communications Processors Patrick Keliher, Simics Field Application Engineer Software Development Using Full System Simulation with Freescale QorIQ Communications Processors 1 2013 Wind River. All Rights Reserved. Agenda Introduction

More information

IT Best Practices Audit TCS offers a wide range of IT Best Practices Audit content covering 15 subjects and over 2200 topics, including:

IT Best Practices Audit TCS offers a wide range of IT Best Practices Audit content covering 15 subjects and over 2200 topics, including: IT Best Practices Audit TCS offers a wide range of IT Best Practices Audit content covering 15 subjects and over 2200 topics, including: 1. IT Cost Containment 84 topics 2. Cloud Computing Readiness 225

More information

Towards Converged SmartNIC Architecture for Bare Metal & Public Clouds. Layong (Larry) Luo, Tencent TEG August 8, 2018

Towards Converged SmartNIC Architecture for Bare Metal & Public Clouds. Layong (Larry) Luo, Tencent TEG August 8, 2018 Towards Converged Smart Architecture for Bare Metal & Public Clouds Layong (Larry) Luo, Tencent TEG August 8, 2018 Agenda 1 Smart in Bare Metal Cloud 2 Smart in Public Cloud 3 Converged Smart Architecture

More information

Network Interface Architecture and Prototyping for Chip and Cluster Multiprocessors

Network Interface Architecture and Prototyping for Chip and Cluster Multiprocessors University of Crete School of Sciences & Engineering Computer Science Department Master Thesis by Michael Papamichael Network Interface Architecture and Prototyping for Chip and Cluster Multiprocessors

More information

The von Neuman architecture characteristics are: Data and Instruction in same memory, memory contents addressable by location, execution in sequence.

The von Neuman architecture characteristics are: Data and Instruction in same memory, memory contents addressable by location, execution in sequence. CS 320 Ch. 3 The von Neuman architecture characteristics are: Data and Instruction in same memory, memory contents addressable by location, execution in sequence. The CPU consists of an instruction interpreter,

More information

The lowrisc project Alex Bradbury

The lowrisc project Alex Bradbury The lowrisc project Alex Bradbury lowrisc C.I.C. 3 rd April 2017 lowrisc We are producing an open source Linux capable System-on-a- Chip (SoC) 64-bit multicore Aim to be the Linux of the Hardware world

More information

The Challenges of X86 Hardware Virtualization. GCC- Virtualization: Rajeev Wankar 36

The Challenges of X86 Hardware Virtualization. GCC- Virtualization: Rajeev Wankar 36 The Challenges of X86 Hardware Virtualization GCC- Virtualization: Rajeev Wankar 36 The Challenges of X86 Hardware Virtualization X86 operating systems are designed to run directly on the bare-metal hardware,

More information

Simultaneous Multithreading on Pentium 4

Simultaneous Multithreading on Pentium 4 Hyper-Threading: Simultaneous Multithreading on Pentium 4 Presented by: Thomas Repantis trep@cs.ucr.edu CS203B-Advanced Computer Architecture, Spring 2004 p.1/32 Overview Multiple threads executing on

More information

The Missing Piece of Virtualization. I/O Virtualization on 10 Gb Ethernet For Virtualized Data Centers

The Missing Piece of Virtualization. I/O Virtualization on 10 Gb Ethernet For Virtualized Data Centers The Missing Piece of Virtualization I/O Virtualization on 10 Gb Ethernet For Virtualized Data Centers Agenda 10 GbE Adapters Built for Virtualization I/O Throughput: Virtual & Non-Virtual Servers Case

More information

Zhang Tianfei. Rosen Xu

Zhang Tianfei. Rosen Xu Zhang Tianfei Rosen Xu Agenda Part 1: FPGA and OPAE - Intel FPGAs and the Modern Datacenter - Platform Options and the Acceleration Stack - FPGA Hardware overview - Open Programmable Acceleration Engine

More information

Free Chips Project: a nonprofit for hosting opensource RISC-V implementations, tools, code. Yunsup Lee SiFive

Free Chips Project: a nonprofit for hosting opensource RISC-V implementations, tools, code. Yunsup Lee SiFive Free Chips Project: a nonprofit for hosting opensource RISC-V implementations, tools, code Yunsup Lee SiFive SiFive Open Source We Open-Sourced the Freedom E310 Chip! 3 We Open-Sourced the Freedom E310

More information

An 80-core GRVI Phalanx Overlay on PYNQ-Z1:

An 80-core GRVI Phalanx Overlay on PYNQ-Z1: An 80-core GRVI Phalanx Overlay on PYNQ-Z1: Pynq as a High Productivity Platform For FPGA Design and Exploration Jan Gray jan@fpga.org http://fpga.org/grvi-phalanx FCCM 2017 05/03/2017 Pynq Workshop My

More information

EECS750: Advanced Operating Systems. 2/24/2014 Heechul Yun

EECS750: Advanced Operating Systems. 2/24/2014 Heechul Yun EECS750: Advanced Operating Systems 2/24/2014 Heechul Yun 1 Administrative Project Feedback of your proposal will be sent by Wednesday Midterm report due on Apr. 2 3 pages: include intro, related work,

More information

Re-architecting Virtualization in Heterogeneous Multicore Systems

Re-architecting Virtualization in Heterogeneous Multicore Systems Re-architecting Virtualization in Heterogeneous Multicore Systems Himanshu Raj, Sanjay Kumar, Vishakha Gupta, Gregory Diamos, Nawaf Alamoosa, Ada Gavrilovska, Karsten Schwan, Sudhakar Yalamanchili College

More information

CSE 120 Principles of Operating Systems

CSE 120 Principles of Operating Systems CSE 120 Principles of Operating Systems Spring 2018 Lecture 16: Virtual Machine Monitors Geoffrey M. Voelker Virtual Machine Monitors 2 Virtual Machine Monitors Virtual Machine Monitors (VMMs) are a hot

More information

KVM as The NFV Hypervisor

KVM as The NFV Hypervisor KVM as The NFV Hypervisor Jun Nakajima Contributors: Mesut Ergin, Yunhong Jiang, Krishna Murthy, James Tsai, Wei Wang, Huawei Xie, Yang Zhang 1 Legal Disclaimer INFORMATION IN THIS DOCUMENT IS PROVIDED

More information

RISCV with Sanctum Enclaves. Victor Costan, Ilia Lebedev, Srini Devadas

RISCV with Sanctum Enclaves. Victor Costan, Ilia Lebedev, Srini Devadas RISCV with Sanctum Enclaves Victor Costan, Ilia Lebedev, Srini Devadas Today, privilege implies trust (1/3) If computing remotely, what is the TCB? Priviledge CPU HW Hypervisor trusted computing base OS

More information

Deterministic Memory Abstraction and Supporting Multicore System Architecture

Deterministic Memory Abstraction and Supporting Multicore System Architecture Deterministic Memory Abstraction and Supporting Multicore System Architecture Farzad Farshchi $, Prathap Kumar Valsan^, Renato Mancuso *, Heechul Yun $ $ University of Kansas, ^ Intel, * Boston University

More information

64-bit ARM Unikernels on ukvm

64-bit ARM Unikernels on ukvm 64-bit ARM Unikernels on ukvm Wei Chen Senior Software Engineer Tokyo / Open Source Summit Japan 2017 2017-05-31 Thanks to Dan Williams, Martin Lucina, Anil Madhavapeddy and other Solo5

More information

COSC6376 Cloud Computing Lecture 15: IO Virtualization

COSC6376 Cloud Computing Lecture 15: IO Virtualization COSC6376 Cloud Computing Lecture 15: IO Virtualization Instructor: Weidong Shi (Larry), PhD Computer Science Department University of Houston IOV Outline PCI-E Sharing Terminology System Image 1 Virtual

More information

Workloads, Scalability and QoS Considerations in CMP Platforms

Workloads, Scalability and QoS Considerations in CMP Platforms Workloads, Scalability and QoS Considerations in CMP Platforms Presenter Don Newell Sr. Principal Engineer Intel Corporation 2007 Intel Corporation Agenda Trends and research context Evolving Workload

More information

Fast access ===> use map to find object. HW == SW ===> map is in HW or SW or combo. Extend range ===> longer, hierarchical names

Fast access ===> use map to find object. HW == SW ===> map is in HW or SW or combo. Extend range ===> longer, hierarchical names Fast access ===> use map to find object HW == SW ===> map is in HW or SW or combo Extend range ===> longer, hierarchical names How is map embodied: --- L1? --- Memory? The Environment ---- Long Latency

More information

NVIDIA'S DEEP LEARNING ACCELERATOR MEETS SIFIVE'S FREEDOM PLATFORM. Frans Sijstermans (NVIDIA) & Yunsup Lee (SiFive)

NVIDIA'S DEEP LEARNING ACCELERATOR MEETS SIFIVE'S FREEDOM PLATFORM. Frans Sijstermans (NVIDIA) & Yunsup Lee (SiFive) NVIDIA'S DEEP LEARNING ACCELERATOR MEETS SIFIVE'S FREEDOM PLATFORM Frans Sijstermans (NVIDIA) & Yunsup Lee (SiFive) NVDLA NVIDIA DEEP LEARNING ACCELERATOR IP Core for deep learning part of NVIDIA s Xavier

More information

Chapter 12: Query Processing

Chapter 12: Query Processing Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Basic Steps in Query Processing 1. Parsing and translation 2. Optimization 3. Evaluation 12.2

More information

Virtualization. Dr. Yingwu Zhu

Virtualization. Dr. Yingwu Zhu Virtualization Dr. Yingwu Zhu Virtualization Definition Framework or methodology of dividing the resources of a computer into multiple execution environments. Types Platform Virtualization: Simulate a

More information

Cross-layer Optimization for Virtual Machine Resource Management

Cross-layer Optimization for Virtual Machine Resource Management Cross-layer Optimization for Virtual Machine Resource Management Ming Zhao, Arizona State University Lixi Wang, Amazon Yun Lv, Beihang Universituy Jing Xu, Google http://visa.lab.asu.edu Virtualized Infrastructures,

More information

An NVMe-based FPGA Storage Workload Accelerator

An NVMe-based FPGA Storage Workload Accelerator An NVMe-based FPGA Storage Workload Accelerator Dr. Sean Gibb, VP Software Eideticom Santa Clara, CA 1 PCIe Bus NVMe SSD NVMe SSD Acceleration Host CPU HDD RDMA NIC NoLoad Accel. Card TM Storage I/O Bandwidth

More information

Optimize New Intel Xeon E based Ser vers with Emulex OneConnect and OneCommand Manager

Optimize New Intel Xeon E based Ser vers with Emulex OneConnect and OneCommand Manager W h i t e p a p e r Optimize New Intel Xeon E5-2600-based Ser vers with Emulex OneConnect and OneCommand Manager Emulex products complement Intel Xeon E5-2600 processor capabilities for virtualization,

More information

Course Review. Hui Lu

Course Review. Hui Lu Course Review Hui Lu Syllabus Cloud computing Server virtualization Network virtualization Storage virtualization Cloud operating system Object storage Syllabus Server Virtualization Network Virtualization

More information

Resource Containers. A new facility for resource management in server systems. Presented by Uday Ananth. G. Banga, P. Druschel, J. C.

Resource Containers. A new facility for resource management in server systems. Presented by Uday Ananth. G. Banga, P. Druschel, J. C. Resource Containers A new facility for resource management in server systems G. Banga, P. Druschel, J. C. Mogul OSDI 1999 Presented by Uday Ananth Lessons in history.. Web servers have become predominantly

More information

Fast access ===> use map to find object. HW == SW ===> map is in HW or SW or combo. Extend range ===> longer, hierarchical names

Fast access ===> use map to find object. HW == SW ===> map is in HW or SW or combo. Extend range ===> longer, hierarchical names Fast access ===> use map to find object HW == SW ===> map is in HW or SW or combo Extend range ===> longer, hierarchical names How is map embodied: --- L1? --- Memory? The Environment ---- Long Latency

More information

Performance Analysis of Virtual Environments

Performance Analysis of Virtual Environments Performance Analysis of Virtual Environments Nikhil V Mishrikoti (nikmys@cs.utah.edu) Advisor: Dr. Rob Ricci Co-Advisor: Anton Burtsev 1 Introduction Motivation Virtual Machines (VMs) becoming pervasive

More information

Impact of Cache Coherence Protocols on the Processing of Network Traffic

Impact of Cache Coherence Protocols on the Processing of Network Traffic Impact of Cache Coherence Protocols on the Processing of Network Traffic Amit Kumar and Ram Huggahalli Communication Technology Lab Corporate Technology Group Intel Corporation 12/3/2007 Outline Background

More information

High Performance Packet Processing with FlexNIC

High Performance Packet Processing with FlexNIC High Performance Packet Processing with FlexNIC Antoine Kaufmann, Naveen Kr. Sharma Thomas Anderson, Arvind Krishnamurthy University of Washington Simon Peter The University of Texas at Austin Ethernet

More information

Troubleshooting Converged Enterprise Networks

Troubleshooting Converged Enterprise Networks Troubleshooting Converged Enterprise Networks VoiceCon San Francisco 2008 Steven Guthrie Director, Product Marketing CA, Inc. Agenda > What I m hearing >What is QoS? > Network and Voice Service Management

More information

Getting to Work with OpenPiton. Princeton University. OpenPit

Getting to Work with OpenPiton. Princeton University.   OpenPit Getting to Work with OpenPiton Princeton University http://openpiton.org OpenPit Princeton Parallel Research Group Redesigning the Data Center of the Future Chip Architecture Operating Systems and Runtimes

More information

Computer and Hardware Architecture II. Benny Thörnberg Associate Professor in Electronics

Computer and Hardware Architecture II. Benny Thörnberg Associate Professor in Electronics Computer and Hardware Architecture II Benny Thörnberg Associate Professor in Electronics Parallelism Microscopic vs Macroscopic Microscopic parallelism hardware solutions inside system components providing

More information

Four Components of a Computer System

Four Components of a Computer System Four Components of a Computer System Operating System Concepts Essentials 2nd Edition 1.1 Silberschatz, Galvin and Gagne 2013 Operating System Definition OS is a resource allocator Manages all resources

More information

Performance Profiling

Performance Profiling Performance Profiling Minsoo Ryu Real-Time Computing and Communications Lab. Hanyang University msryu@hanyang.ac.kr Outline History Understanding Profiling Understanding Performance Understanding Performance

More information

Service Edge Virtualization - Hardware Considerations for Optimum Performance

Service Edge Virtualization - Hardware Considerations for Optimum Performance Service Edge Virtualization - Hardware Considerations for Optimum Performance Executive Summary This whitepaper provides a high level overview of Intel based server hardware components and their impact

More information

AMD SEV Update Linux Security Summit David Kaplan, Security Architect
