Boost Linux Performance with Enhancements from Oracle


1 Boost Linux Performance with Enhancements from Oracle
Chris Mason, Director of Linux Kernel Engineering

2 Linux Performance on Large Systems
- Exadata hardware
- How large systems are different
- Finding bottlenecks
- Optimizations in Oracle's Unbreakable Enterprise Kernel

3 Exadata Hardware
X2-8
- 8 sockets
- Intel X
- Cores per socket
- 2 threads per core
- 1TB of RAM
- 8 IB QDR ports (40Gb/sec each)
- Other assorted slots, ports, cards

4 X2-8 NUMA
- NUMA: Non-Uniform Memory Access
- X2-8 consists of four blades
  - Each blade has two CPU sockets
  - Each blade has 256GB of RAM
  - Each blade has one or more IB cards
  - Fast interconnect to the other blades
- The CPUs access resources on the same blade much faster than resources on remote blades
- NUMA lowers hardware costs but increases the work that must be done in software to optimize the system
- Linux already includes extensive optimizations and frameworks to run well on NUMA systems

5 Finding Bottlenecks
- Are my CPUs idle? Am I waiting on the disk or the network?
- Am I bottlenecked on a single CPU?
- Where is my CPU spending all its time?
  - Application
  - System time (kernel overhead)
  - Softirq processing (kernel overhead)
- mpstat -P ALL 1
  - Gives a per-CPU report of time spent waiting for IO, busy in application or kernel code, handling interrupts, etc.
- Large systems often have a small number of CPUs pegged at 100% while the others are mostly idle
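
The "few CPUs pegged, the rest idle" pattern can also be checked programmatically. The sketch below is illustrative and not part of the original talk: it assumes the standard `/proc/stat` per-CPU line layout (user, nice, system, idle, iowait, irq, softirq, steal), counts iowait as not busy, and uses hypothetical helper names and a 95% threshold chosen for illustration.

```python
def parse_proc_stat(text):
    """Return {cpu_name: (busy_ticks, total_ticks)} for each per-CPU line."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep "cpu0", "cpu1", ... and skip the aggregate "cpu" line.
        if not fields or not fields[0].startswith("cpu") or not fields[0][3:].isdigit():
            continue
        ticks = [int(v) for v in fields[1:]]
        total = sum(ticks[:8])      # user .. steal (guest time is already in user)
        not_busy = sum(ticks[3:5])  # idle + iowait
        stats[fields[0]] = (total - not_busy, total)
    return stats


def busy_percent(before, after):
    """Percent of time each CPU was busy between two parsed samples."""
    result = {}
    for cpu, (busy1, total1) in after.items():
        busy0, total0 = before.get(cpu, (0, 0))
        delta_total = total1 - total0
        result[cpu] = 100.0 * (busy1 - busy0) / delta_total if delta_total > 0 else 0.0
    return result


def pegged_cpus(before, after, threshold=95.0):
    """Names of CPUs whose busy percentage is at or above the threshold."""
    usage = busy_percent(before, after)
    return sorted(cpu for cpu, pct in usage.items() if pct >= threshold)


# Made-up samples: cpu0 is saturated between the two readings, cpu1 is nearly idle.
SAMPLE_BEFORE = "cpu 200 0 100 1700 0 0 0 0\ncpu0 100 0 50 850 0 0 0 0\ncpu1 100 0 50 850 0 0 0 0\n"
SAMPLE_AFTER = "cpu 282 0 121 1797 0 0 0 0\ncpu0 180 0 70 850 0 0 0 0\ncpu1 102 0 51 947 0 0 0 0\n"

if __name__ == "__main__":
    print(pegged_cpus(parse_proc_stat(SAMPLE_BEFORE), parse_proc_stat(SAMPLE_AFTER)))
```

On a live system the two samples would come from reading `/proc/stat` twice, a second or so apart.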

6 Finding Bottlenecks: latencytop
- Tracks why each process waits in the kernel
- Can quickly determine if you're waiting on: disk, network, kernel locks, anything that sleeps
- GUI mode to select a specific process
- latencytop -c mode collects information on each process over a long period of time

7 Finding Bottlenecks: perf
- When the system is CPU bound, perf can tell us why
- Profiling can be limited to a single CPU
  - Very useful when only one CPU is saturated
- Profiles can include full back traces
  - Explains the full call chain that leads to lock contention
- Example usage:
  - perf record -g -C 16: record profiles on CPU 16 with call traces
  - perf record -g -a: record profiles on all CPUs
  - perf report -g: produce a call graph report based on the profile

8 Optimizing Workloads
- Fast networking and storage IO rates add contention in new areas
- Spread interrupts over CPUs local to the cards
- Push softirq handling out over all the CPUs
- Reduce lock contention both in the kernel and in the application
  - Lock contention is much more expensive on NUMA
- Use cpusets to control CPU allocation to specific workloads

9 Interrupt Processing
- Interrupts process events from the hardware
  - Receive network packets
  - Disk IO completion
- The Linux irqbalance daemon spreads interrupt processing over CPUs based on load
- irqbalance modifications
  - Only process IRQs on CPUs local to the card
  - Usually hand tuned on NUMA systems, but we added code to do this automatically
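
The hand-tuning step mentioned above can be sketched as follows. This is not the irqbalance change described in the talk, only an illustration of the manual equivalent. It assumes the standard Linux locations `/sys/devices/system/node/node<N>/cpulist` (CPUs belonging to a NUMA node) and `/proc/irq/<irq>/smp_affinity` (hex CPU mask for an IRQ). The function names are hypothetical, and the code works on strings only, so it never touches the system.

```python
def parse_cpulist(cpulist):
    """Expand a kernel CPU list such as "0-7,64-71" into a set of CPU numbers."""
    cpus = set()
    for part in cpulist.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def affinity_mask(cpus):
    """Format CPUs as comma-separated 32-bit hex words, most significant first."""
    value = 0
    for cpu in cpus:
        value |= 1 << cpu
    words = []
    while True:
        words.append("%08x" % (value & 0xFFFFFFFF))
        value >>= 32
        if value == 0:
            break
    return ",".join(reversed(words))


if __name__ == "__main__":
    # Hypothetical node-local CPU list; a real one would be read from sysfs.
    local = parse_cpulist("0-7,64-71")
    print(affinity_mask(local))
```

Writing the resulting string to an IRQ's `smp_affinity` file (as root) restricts that interrupt to the listed CPUs.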

10 Softirqs
- Softirqs handle portions of the interrupt processing
  - Waking up processes
  - Copying data from the kernel to application memory (networking receives)
  - Various kernel data structure updates
- Softirqs normally run on the same CPU that received the interrupt, but they run slightly later
- Spreading interrupt processing across CPUs also spreads the resulting softirq work across CPUs
- Interrupts must be handled on CPUs local to the card for performance, but softirqs can be spread farther away

11 Spreading Softirqs for Storage
- IO affinity
  - Records the CPU that issued an IO
  - When the IO completes, the softirq is sent to the issuing CPU
- Very effective for solid state storage on large systems
- Reduces contention on scheduler locks because wakeups are done on the same CPU where the process last ran
- Enabled by default in Oracle's Unbreakable Enterprise Kernel
- >2x improvement in SSD IO/s in one OLTP-based test
  - Almost 5x faster after removing driver lock contention

12 Spreading Softirqs for Networking
- Receive Packet Steering
  - Spreads softirqs for TCP/IP receives across a mask of CPUs selected by the admin
  - /sys/class/net/xx/queues/rx-n/rps_cpus
    - xx is the network interface
    - n is the queue number (some cards have many)
    - Contains a mask, in the taskset format, of the CPUs to use
- Shotgun-style spreading
  - A hash of the network headers picks the CPU
  - Fairly random CPU selection for the softirq
  - Not optimal on the X2-8 due to poor locality
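
A rough model of the "hash of network headers picks the CPU" step is sketched below. It is a simplification for illustration only: `zlib.crc32` stands in for the kernel's flow hash, the multiply-and-shift mapping is an assumption about how a 32-bit hash can be scaled onto a CPU list, and the function names and sample flow are hypothetical.

```python
import zlib


def cpus_from_mask(mask):
    """Decode a hex CPU mask (optionally comma-separated words) into CPU numbers."""
    value = int(mask.replace(",", ""), 16)
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]


def pick_cpu(flow, cpus):
    """Map a (src_ip, src_port, dst_ip, dst_port) flow onto one CPU from the list."""
    key = "%s:%d>%s:%d" % flow
    flow_hash = zlib.crc32(key.encode()) & 0xFFFFFFFF  # stand-in 32-bit hash
    # Scale the 32-bit hash onto an index in [0, len(cpus)).
    return cpus[(flow_hash * len(cpus)) >> 32]


if __name__ == "__main__":
    allowed = cpus_from_mask("f0")  # CPUs 4-7
    flow = ("192.0.2.10", 40000, "192.0.2.20", 1521)
    print(pick_cpu(flow, allowed))
```

Every packet of one flow lands on the same CPU, but that CPU has no relationship to where the receiving process runs, which is the locality problem noted above.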

13 Receive Flow Steering
- Second stage of receive packet steering
- /sys/class/net/xx/queues/rx-n/rps_flow_cnt
  - Size of the hash table for recording flows (e.g. 8192)
- As processes wait for packets, the kernel remembers which sockets they are waiting on and which CPU they last used
- When packets come in, the softirq is directed to the CPU where the process last slept
- More directed than receive packet steering alone
- Together with receive packet steering:
  - 50% faster IPoIB results on a two socket system
  - % faster on X2-8
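
The flow table idea can be modelled in a few lines. This is a toy sketch, not the kernel data structure: the class and method names are hypothetical, the power-of-two size check is an assumption made so that a bit mask can index the table, and colliding flows simply share a slot.

```python
class FlowTable:
    """Remembers, per flow-hash bucket, the CPU where the consumer last waited."""

    def __init__(self, size=8192):
        if size <= 0 or size & (size - 1):
            raise ValueError("size must be a positive power of two")
        self.mask = size - 1
        self.last_cpu = [None] * size

    def record_wait(self, flow_hash, cpu):
        """Called when a process on `cpu` waits for data on this flow."""
        self.last_cpu[flow_hash & self.mask] = cpu

    def steer(self, flow_hash, fallback_cpu):
        """CPU for the softirq: the recorded one, else the fallback choice."""
        cpu = self.last_cpu[flow_hash & self.mask]
        return fallback_cpu if cpu is None else cpu


if __name__ == "__main__":
    table = FlowTable(size=8192)
    table.record_wait(0x1234, cpu=5)
    print(table.steer(0x1234, fallback_cpu=0))  # flow seen before
    print(table.steer(0x9999, fallback_cpu=0))  # unknown flow uses the fallback
```

The fallback CPU would typically be whatever plain receive packet steering picked, which is why the slide calls this a second stage.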

14 RDS Improvements
- RDS is one of the main network transports used in Exadata systems
  - Reliable Datagram Sockets, optimized for Oracle use
  - Enables network RDMA operations when used with InfiniBand
- Original X2-8 target: 4x faster than a two socket system
- Original X2-8 numbers: slightly slower than a two socket system
- Final X2-8 numbers: 8x faster than the original two socket numbers

15 RDS Improvements
- RDS was heavily saturating one or two cores on the system, but leaving the rest of the X2-8 idle
- Allocate two MSI IRQs for each RDS connection instead of two for the whole system
  - Spreads interrupts across multiple CPUs
- Reduce lock contention in the RDS code
- Optimize RDMA key management for NUMA
- Reduce wakeups on remote CPUs
- Switch a number of data structures over to RCU
  - Read, copy, update
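
The read, copy, update pattern can be illustrated outside the kernel. The sketch below is an analogy rather than the RDS code: readers take no lock and always see a complete snapshot, while a writer copies the structure, modifies the copy, and publishes it with a single reference assignment. The class name is hypothetical, and Python's garbage collector stands in for the grace period that kernel RCU waits for before freeing the old version.

```python
import threading


class RcuDict:
    """Dictionary with lock-free reads and copy-then-publish updates."""

    def __init__(self, initial=None):
        self._snapshot = dict(initial or {})
        self._write_lock = threading.Lock()  # serializes writers only

    def read(self, key, default=None):
        # Readers grab the current snapshot reference once and never block.
        return self._snapshot.get(key, default)

    def update(self, key, value):
        with self._write_lock:
            new_snapshot = dict(self._snapshot)  # copy
            new_snapshot[key] = value            # update the private copy
            self._snapshot = new_snapshot        # publish in one assignment


if __name__ == "__main__":
    table = RcuDict({"conn-1": "up"})
    table.update("conn-2", "up")
    print(table.read("conn-2"))
```

The trade-off is the same as in the kernel: reads become very cheap and scale across CPUs, while each write pays for a copy.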

16 IPC Semaphores
- Heavily used by Oracle to wake up processes as database transactions commit
- Problematic for years due to high spinlock contention inside the kernel
  - Problematic in almost every Unix as well
- Accounted for 90% of the system time during X2-8 database runs
- New code doesn't register in system profiles (<1% of the system time)

17 Cpusets
- Create simple containers associated with a set of CPUs and memory
- Can break up large systems for a number of smaller workloads
- Example benchmark: high database lock contention on a single row
  - Spreading across all the X2-8 CPUs is much slower than a simple two socket system
  - Containing the workload to 32 CPUs is slightly faster than a simple two socket system (5-10%)
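
A minimal sketch of setting up such a container is shown below. It only builds the list of file writes and performs none of them. The paths assume a cgroup v1 cpuset hierarchy mounted at `/sys/fs/cgroup/cpuset` with the `cpuset.cpus`, `cpuset.mems` and `tasks` files; older systems mount the cpuset filesystem elsewhere and may name the files differently. The cpuset name, CPU numbers, memory nodes and PID are made up for illustration, since the talk does not say which CPUs were used.

```python
def to_cpulist(cpus):
    """Compress CPU numbers into kernel list format, e.g. [0, 1, 2, 8] -> "0-2,8"."""
    ranges = []
    start = prev = None
    for cpu in sorted(set(cpus)):
        if start is None:
            start = prev = cpu
        elif cpu == prev + 1:
            prev = cpu
        else:
            ranges.append((start, prev))
            start = prev = cpu
    if start is not None:
        ranges.append((start, prev))
    return ",".join(str(a) if a == b else "%d-%d" % (a, b) for a, b in ranges)


def cpuset_plan(name, cpus, mems, pids, root="/sys/fs/cgroup/cpuset"):
    """Return (path, value) writes that would confine the PIDs to the cpuset."""
    base = "%s/%s" % (root, name)
    plan = [
        (base + "/cpuset.cpus", to_cpulist(cpus)),
        (base + "/cpuset.mems", to_cpulist(mems)),
    ]
    plan += [(base + "/tasks", str(pid)) for pid in pids]
    return plan


if __name__ == "__main__":
    # Hypothetical layout: 32 CPUs and two memory nodes for one workload.
    for path, value in cpuset_plan("oltp", range(32), [0, 1], [4242]):
        print("%s <- %s" % (path, value))
```

The cpuset directory has to exist before these writes, and `cpuset.cpus` and `cpuset.mems` must be populated before any task is added.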

18 Optimization Summary
- Include a long series of optimizations between the and kernels
- Many NUMA-targeted improvements
- Focused optimizations for the IO, networking and IPC stacks
- Extensive profiling with Exadata workloads
- Work effectively spread across all the CPUs, with less lock contention and system time overhead

19 Resources
- Linux Home Page: oracle.com/linux
- Free Download: Oracle Linux, edelivery.oracle.com/linux
- Read: Oracle Linux Blog, blogs.oracle.com/linux
- Shop Online: Oracle Unbreakable Linux Support, oracle.com/store
2010 Oracle Corporation

Inside look at benchmarks Wim Coekaerts Senior Vice President, Linux and Virtualization Engineering. Wednesday, August 17, 11

Inside look at benchmarks Wim Coekaerts Senior Vice President, Linux and Virtualization Engineering. Wednesday, August 17, 11 Inside look at benchmarks Wim Coekaerts Senior Vice President, Linux and Virtualization Engineering Overview Purpose of benchmarks Who is involved? What kind of benchmarks exist out there? Benchmarks are

More information

Advanced Computer Networks. End Host Optimization

Advanced Computer Networks. End Host Optimization Oriana Riva, Department of Computer Science ETH Zürich 263 3501 00 End Host Optimization Patrick Stuedi Spring Semester 2017 1 Today End-host optimizations: NUMA-aware networking Kernel-bypass Remote Direct

More information

Linux multi-core scalability

Linux multi-core scalability Linux multi-core scalability Oct 2009 Andi Kleen Intel Corporation andi@firstfloor.org Overview Scalability theory Linux history Some common scalability trouble-spots Application workarounds Motivation

More information

Oracle Linux Wim Coekaerts, Senior Vice President, Linux and Virtualization Engineering

Oracle Linux Wim Coekaerts, Senior Vice President, Linux and Virtualization Engineering Oracle Linux Wim Coekaerts, Senior Vice President, Linux and Virtualization Engineering Oracle Confidentiial Internal Use Only 1 The following is intended to outline our general product

More information

Evaluation of Chelsio Terminator 6 (T6) Unified Wire Adapter iscsi Offload

Evaluation of Chelsio Terminator 6 (T6) Unified Wire Adapter iscsi Offload November 2017 Evaluation of Chelsio Terminator 6 (T6) Unified Wire Adapter iscsi Offload Initiator and target iscsi offload improve performance and reduce processor utilization. Executive Summary The Chelsio

More information

Measuring the impacts of the Preempt-RT patch

Measuring the impacts of the Preempt-RT patch Measuring the impacts of the Preempt-RT patch maxime.chevallier@smile.fr October 25, 2017 RT Linux projects Simulation platform : bi-xeon, lots ot RAM 200µs wakeup latency, networking Test bench : Intel

More information

Multiprocessor Systems. Chapter 8, 8.1

Multiprocessor Systems. Chapter 8, 8.1 Multiprocessor Systems Chapter 8, 8.1 1 Learning Outcomes An understanding of the structure and limits of multiprocessor hardware. An appreciation of approaches to operating system support for multiprocessor

More information

AutoNUMA Red Hat, Inc.

AutoNUMA Red Hat, Inc. AutoNUMA Red Hat, Inc. Andrea Arcangeli aarcange at redhat.com 1 Apr 2012 AutoNUMA components knuma_scand If stopped, everything stops Triggers the chain reaction when started NUMA hinting page faults

More information

Oracle Exalogic Elastic Cloud Overview. Peter Hoffmann Technical Account Manager

Oracle Exalogic Elastic Cloud Overview. Peter Hoffmann Technical Account Manager Oracle Exalogic Elastic Cloud Overview Peter Hoffmann Technical Account Manager Engineered Systems Driving trend in IT for the next decade Oracle Exalogic Elastic Cloud Hardware and Software, Engineered

More information

Oracle Enterprise Architecture. Software. Hardware. Complete. Oracle Exalogic.

Oracle Enterprise Architecture. Software. Hardware. Complete. Oracle Exalogic. Oracle Enterprise Architecture Software. Hardware. Complete Oracle Exalogic edward.zhang@oracle.com Exalogic Exalogic Exalogic -- Exalogic Design Center Exalogic - Sun Oracle - - - CPU/Memory/Networking/Storage

More information

Realtime Tuning 101. Tuning Applications on Red Hat MRG Realtime Clark Williams

Realtime Tuning 101. Tuning Applications on Red Hat MRG Realtime Clark Williams Realtime Tuning 101 Tuning Applications on Red Hat MRG Realtime Clark Williams Agenda Day One Terminology and Concepts Realtime Linux differences from stock Linux Tuning Tools for Tuning Tuning Tools Lab

More information

Frits Hoogland - Oracle Usergroup Norway 2013 EXADATA AND OLTP. Thursday, April 18, 13

Frits Hoogland - Oracle Usergroup Norway 2013 EXADATA AND OLTP. Thursday, April 18, 13 Frits Hoogland - Oracle Usergroup Norway 2013 EXADATA AND OLTP Who am I? Frits Hoogland Working with Oracle products since 1996 Blog: http://fritshoogland.wordpress.com Twitter: @fritshoogland Email: fhoogland@vxcompany.com

More information

Performance Optimisations for HPC workloads. August 2008 Imed Chihi

Performance Optimisations for HPC workloads. August 2008 Imed Chihi Performance Optimisations for HPC workloads August 2008 Imed Chihi Agenda The computing model The assignment problem CPU sets Priorities Disk IO optimisations gettimeofday() Disabling services Memory management

More information

ò mm_struct represents an address space in kernel ò task represents a thread in the kernel ò A task points to 0 or 1 mm_structs

ò mm_struct represents an address space in kernel ò task represents a thread in the kernel ò A task points to 0 or 1 mm_structs Last time We went through the high-level theory of scheduling algorithms Scheduling Today: View into how Linux makes its scheduling decisions Don Porter CSE 306 Lecture goals Understand low-level building

More information

CSE 120 Principles of Operating Systems

CSE 120 Principles of Operating Systems CSE 120 Principles of Operating Systems Spring 2018 Lecture 15: Multicore Geoffrey M. Voelker Multicore Operating Systems We have generally discussed operating systems concepts independent of the number

More information

Scheduling. Don Porter CSE 306

Scheduling. Don Porter CSE 306 Scheduling Don Porter CSE 306 Last time ò We went through the high-level theory of scheduling algorithms ò Today: View into how Linux makes its scheduling decisions Lecture goals ò Understand low-level

More information

Mark Falco Oracle Coherence Development

Mark Falco Oracle Coherence Development Achieving the performance benefits of Infiniband in Java Mark Falco Oracle Coherence Development 1 Copyright 2011, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy

More information

ò Paper reading assigned for next Tuesday ò Understand low-level building blocks of a scheduler User Kernel ò Understand competing policy goals

ò Paper reading assigned for next Tuesday ò Understand low-level building blocks of a scheduler User Kernel ò Understand competing policy goals Housekeeping Paper reading assigned for next Tuesday Scheduling Don Porter CSE 506 Memory Management Logical Diagram Binary Memory Formats Allocators Threads Today s Lecture Switching System to CPU Calls

More information

Designing Next-Generation Data- Centers with Advanced Communication Protocols and Systems Services. Presented by: Jitong Chen

Designing Next-Generation Data- Centers with Advanced Communication Protocols and Systems Services. Presented by: Jitong Chen Designing Next-Generation Data- Centers with Advanced Communication Protocols and Systems Services Presented by: Jitong Chen Outline Architecture of Web-based Data Center Three-Stage framework to benefit

More information

Automatic NUMA Balancing. Rik van Riel, Principal Software Engineer, Red Hat Vinod Chegu, Master Technologist, HP

Automatic NUMA Balancing. Rik van Riel, Principal Software Engineer, Red Hat Vinod Chegu, Master Technologist, HP Automatic NUMA Balancing Rik van Riel, Principal Software Engineer, Red Hat Vinod Chegu, Master Technologist, HP Automatic NUMA Balancing Agenda What is NUMA, anyway? Automatic NUMA balancing internals

More information

IBM POWER8 100 GigE Adapter Best Practices

IBM POWER8 100 GigE Adapter Best Practices Introduction IBM POWER8 100 GigE Adapter Best Practices With higher network speeds in new network adapters, achieving peak performance requires careful tuning of the adapters and workloads using them.

More information

Meltdown and Spectre Interconnect Performance Evaluation Jan Mellanox Technologies

Meltdown and Spectre Interconnect Performance Evaluation Jan Mellanox Technologies Meltdown and Spectre Interconnect Evaluation Jan 2018 1 Meltdown and Spectre - Background Most modern processors perform speculative execution This speculation can be measured, disclosing information about

More information

Copyright 2011, Oracle and/or its affiliates. All rights reserved.

Copyright 2011, Oracle and/or its affiliates. All rights reserved. The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material,

More information

A System-Level Optimization Framework For High-Performance Networking. Thomas M. Benson Georgia Tech Research Institute

A System-Level Optimization Framework For High-Performance Networking. Thomas M. Benson Georgia Tech Research Institute A System-Level Optimization Framework For High-Performance Networking Thomas M. Benson Georgia Tech Research Institute thomas.benson@gtri.gatech.edu 1 Why do we need high-performance networking? Data flow

More information

OpenVMS Scaling on Large Integrity Servers. Guy Peleg President Maklee Engineering

OpenVMS Scaling on Large Integrity Servers. Guy Peleg President Maklee Engineering OpenVMS Scaling on Large Integrity Servers Guy Peleg President Maklee Engineering guy.peleg@maklee.com Who we are What is Maklee? US Based consulting firm operating all over the world. Former members of

More information

Best Practices for Setting BIOS Parameters for Performance

Best Practices for Setting BIOS Parameters for Performance White Paper Best Practices for Setting BIOS Parameters for Performance Cisco UCS E5-based M3 Servers May 2013 2014 Cisco and/or its affiliates. All rights reserved. This document is Cisco Public. Page

More information

CFS-v: I/O Demand-driven VM Scheduler in KVM

CFS-v: I/O Demand-driven VM Scheduler in KVM CFS-v: Demand-driven VM Scheduler in KVM Hyotaek Shim and Sung-Min Lee (hyotaek.shim, sung.min.lee@samsung.com) Software R&D Center, Samsung Electronics 2014. 10. 16 Problem in Server Consolidation 2/16

More information

NTRDMA v0.1. An Open Source Driver for PCIe NTB and DMA. Allen Hubbe at Linux Piter 2015 NTRDMA. Messaging App. IB Verbs. dmaengine.h ntb.

NTRDMA v0.1. An Open Source Driver for PCIe NTB and DMA. Allen Hubbe at Linux Piter 2015 NTRDMA. Messaging App. IB Verbs. dmaengine.h ntb. Messaging App IB Verbs NTRDMA dmaengine.h ntb.h DMA DMA DMA NTRDMA v0.1 An Open Source Driver for PCIe and DMA Allen Hubbe at Linux Piter 2015 1 INTRODUCTION Allen Hubbe Senior Software Engineer EMC Corporation

More information

Architectural Principles for Networked Solid State Storage Access

Architectural Principles for Networked Solid State Storage Access Architectural Principles for Networked Solid State Storage Access SNIA Legal Notice! The material contained in this tutorial is copyrighted by the SNIA unless otherwise noted.! Member companies and individual

More information

Multiprocessor System. Multiprocessor Systems. Bus Based UMA. Types of Multiprocessors (MPs) Cache Consistency. Bus Based UMA. Chapter 8, 8.

Multiprocessor System. Multiprocessor Systems. Bus Based UMA. Types of Multiprocessors (MPs) Cache Consistency. Bus Based UMA. Chapter 8, 8. Multiprocessor System Multiprocessor Systems Chapter 8, 8.1 We will look at shared-memory multiprocessors More than one processor sharing the same memory A single CPU can only go so fast Use more than

More information

NUMA replicated pagecache for Linux

NUMA replicated pagecache for Linux NUMA replicated pagecache for Linux Nick Piggin SuSE Labs January 27, 2008 0-0 Talk outline I will cover the following areas: Give some NUMA background information Introduce some of Linux s NUMA optimisations

More information

Running High Performance Computing Workloads on Red Hat Enterprise Linux

Running High Performance Computing Workloads on Red Hat Enterprise Linux Running High Performance Computing Workloads on Red Hat Enterprise Linux Imed Chihi Senior Technical Account Manager Red Hat Global Support Services 21 January 2014 Agenda 2 The case of HPC on commodity

More information

Multiprocessor Systems. COMP s1

Multiprocessor Systems. COMP s1 Multiprocessor Systems 1 Multiprocessor System We will look at shared-memory multiprocessors More than one processor sharing the same memory A single CPU can only go so fast Use more than one CPU to improve

More information

Fast packet processing in the cloud. Dániel Géhberger Ericsson Research

Fast packet processing in the cloud. Dániel Géhberger Ericsson Research Fast packet processing in the cloud Dániel Géhberger Ericsson Research Outline Motivation Service chains Hardware related topics, acceleration Virtualization basics Software performance and acceleration

More information

Virtual SQL Servers. Actual Performance. 2016

Virtual SQL Servers. Actual Performance. 2016 @kleegeek davidklee.net heraflux.com linkedin.com/in/davidaklee Specialties / Focus Areas / Passions: Performance Tuning & Troubleshooting Virtualization Cloud Enablement Infrastructure Architecture Health

More information

Voltaire. Fast I/O for XEN using RDMA Technologies. The Grid Interconnect Company. April 2005 Yaron Haviv, Voltaire, CTO

Voltaire. Fast I/O for XEN using RDMA Technologies. The Grid Interconnect Company. April 2005 Yaron Haviv, Voltaire, CTO Voltaire The Grid Interconnect Company Fast I/O for XEN using RDMA Technologies April 2005 Yaron Haviv, Voltaire, CTO yaronh@voltaire.com The Enterprise Grid Model and ization VMs need to interact efficiently

More information

HLD For SMP node affinity

HLD For SMP node affinity HLD For SMP node affinity Introduction Current versions of Lustre rely on a single active metadata server. Metadata throughput may be a bottleneck for large sites with many thousands of nodes. System architects

More information

Oncilla - a Managed GAS Runtime for Accelerating Data Warehousing Queries

Oncilla - a Managed GAS Runtime for Accelerating Data Warehousing Queries Oncilla - a Managed GAS Runtime for Accelerating Data Warehousing Queries Jeffrey Young, Alex Merritt, Se Hoon Shon Advisor: Sudhakar Yalamanchili 4/16/13 Sponsors: Intel, NVIDIA, NSF 2 The Problem Big

More information

Accelerating Hadoop Applications with the MapR Distribution Using Flash Storage and High-Speed Ethernet

Accelerating Hadoop Applications with the MapR Distribution Using Flash Storage and High-Speed Ethernet WHITE PAPER Accelerating Hadoop Applications with the MapR Distribution Using Flash Storage and High-Speed Ethernet Contents Background... 2 The MapR Distribution... 2 Mellanox Ethernet Solution... 3 Test

More information

Private Cloud Database Consolidation Name, Title

Private Cloud Database Consolidation Name, Title Private Cloud Database Consolidation Name, Title Agenda Cloud Introduction Business Drivers Cloud Architectures Enabling Technologies Service Level Expectations Customer Case Studies Conclusions

More information

IsoStack Highly Efficient Network Processing on Dedicated Cores

IsoStack Highly Efficient Network Processing on Dedicated Cores IsoStack Highly Efficient Network Processing on Dedicated Cores Leah Shalev Eran Borovik, Julian Satran, Muli Ben-Yehuda Outline Motivation IsoStack architecture Prototype TCP/IP over 10GE on a single

More information

RxNetty vs Tomcat Performance Results

RxNetty vs Tomcat Performance Results RxNetty vs Tomcat Performance Results Brendan Gregg; Performance and Reliability Engineering Nitesh Kant, Ben Christensen; Edge Engineering updated: Apr 2015 Results based on The Hello Netflix benchmark

More information

Real Time Linux patches: history and usage

Real Time Linux patches: history and usage Real Time Linux patches: history and usage Presentation first given at: FOSDEM 2006 Embedded Development Room See www.fosdem.org Klaas van Gend Field Application Engineer for Europe Why Linux in Real-Time

More information

Background: I/O Concurrency

Background: I/O Concurrency Background: I/O Concurrency Brad Karp UCL Computer Science CS GZ03 / M030 5 th October 2011 Outline Worse Is Better and Distributed Systems Problem: Naïve single-process server leaves system resources

More information

Moneta: A High-performance Storage Array Architecture for Nextgeneration, Micro 2010

Moneta: A High-performance Storage Array Architecture for Nextgeneration, Micro 2010 Moneta: A High-performance Storage Array Architecture for Nextgeneration, Non-volatile Memories Micro 2010 NVM-based SSD NVMs are replacing spinning-disks Performance of disks has lagged NAND flash showed

More information

MiAMI: Multi-Core Aware Processor Affinity for TCP/IP over Multiple Network Interfaces

MiAMI: Multi-Core Aware Processor Affinity for TCP/IP over Multiple Network Interfaces MiAMI: Multi-Core Aware Processor Affinity for TCP/IP over Multiple Network Interfaces Hye-Churn Jang Hyun-Wook (Jin) Jin Department of Computer Science and Engineering Konkuk University Seoul, Korea {comfact,

More information

Performance Tuning Guidelines for Low Latency Response on AMD EPYC -Based Servers Application Note

Performance Tuning Guidelines for Low Latency Response on AMD EPYC -Based Servers Application Note Performance Tuning Guidelines for Low Latency Response on AMD EPYC -Based Servers Publication # 56263 Revision: 3.00 Issue Date: January 2018 Advanced Micro Devices 2018 Advanced Micro Devices, Inc. All

More information

Optimizing Fusion iomemory on Red Hat Enterprise Linux 6 for Database Performance Acceleration. Sanjay Rao, Principal Software Engineer

Optimizing Fusion iomemory on Red Hat Enterprise Linux 6 for Database Performance Acceleration. Sanjay Rao, Principal Software Engineer Optimizing Fusion iomemory on Red Hat Enterprise Linux 6 for Database Performance Acceleration Sanjay Rao, Principal Software Engineer Version 1.0 August 2011 1801 Varsity Drive Raleigh NC 27606-2072 USA

More information

Can Parallel Replication Benefit Hadoop Distributed File System for High Performance Interconnects?

Can Parallel Replication Benefit Hadoop Distributed File System for High Performance Interconnects? Can Parallel Replication Benefit Hadoop Distributed File System for High Performance Interconnects? N. S. Islam, X. Lu, M. W. Rahman, and D. K. Panda Network- Based Compu2ng Laboratory Department of Computer

More information

Operating Systems. Overview Virtual memory part 2. Page replacement algorithms. Lecture 7 Memory management 3: Virtual memory

Operating Systems. Overview Virtual memory part 2. Page replacement algorithms. Lecture 7 Memory management 3: Virtual memory Operating Systems Lecture 7 Memory management : Virtual memory Overview Virtual memory part Page replacement algorithms Frame allocation Thrashing Other considerations Memory over-allocation Efficient

More information

Informatix Solutions INFINIBAND OVERVIEW. - Informatix Solutions, Page 1 Version 1.0

Informatix Solutions INFINIBAND OVERVIEW. - Informatix Solutions, Page 1 Version 1.0 INFINIBAND OVERVIEW -, 2010 Page 1 Version 1.0 Why InfiniBand? Open and comprehensive standard with broad vendor support Standard defined by the InfiniBand Trade Association (Sun was a founder member,

More information

Motivation of Threads. Preview. Motivation of Threads. Motivation of Threads. Motivation of Threads. Motivation of Threads 9/12/2018.

Motivation of Threads. Preview. Motivation of Threads. Motivation of Threads. Motivation of Threads. Motivation of Threads 9/12/2018. Preview Motivation of Thread Thread Implementation User s space Kernel s space Inter-Process Communication Race Condition Mutual Exclusion Solutions with Busy Waiting Disabling Interrupt Lock Variable

More information

Oracle Database 11g Direct NFS Client Oracle Open World - November 2007

Oracle Database 11g Direct NFS Client Oracle Open World - November 2007 Oracle Database 11g Client Oracle Open World - November 2007 Bill Hodak Sr. Product Manager Oracle Corporation Kevin Closson Performance Architect Oracle Corporation Introduction

More information

Light & NOS. Dan Li Tsinghua University

Light & NOS. Dan Li Tsinghua University Light & NOS Dan Li Tsinghua University Performance gain The Power of DPDK As claimed: 80 CPU cycles per packet Significant gain compared with Kernel! What we care more How to leverage the performance gain

More information

EXPERIENCES WITH NVME OVER FABRICS

EXPERIENCES WITH NVME OVER FABRICS 13th ANNUAL WORKSHOP 2017 EXPERIENCES WITH NVME OVER FABRICS Parav Pandit, Oren Duer, Max Gurtovoy Mellanox Technologies [ 31 March, 2017 ] BACKGROUND: NVME TECHNOLOGY Optimized for flash and next-gen

More information

Was ist dran an einer spezialisierten Data Warehousing platform?

Was ist dran an einer spezialisierten Data Warehousing platform? Was ist dran an einer spezialisierten Data Warehousing platform? Hermann Bär Oracle USA Redwood Shores, CA Schlüsselworte Data warehousing, Exadata, specialized hardware proprietary hardware Introduction

More information

OneCore Storage Performance Tuning

OneCore Storage Performance Tuning OneCore Storage Performance Tuning Overview To improve Emulex adapter performance while using the OneCore Storage Linux drivers in a multi-core CPU environment, multiple performance tuning features can

More information

Exadata. Presented by: Kerry Osborne. February 23, 2012

Exadata. Presented by: Kerry Osborne. February 23, 2012 Exadata Presented by: Kerry Osborne February 23, 2012 whoami Worked with Oracle Since 1982 (V2) Working with Exadata since early 2010 Work for Enkitec (www.enkitec.com) (Enkitec owns a Half Rack V2/X2)

More information

Exploring System Challenges of Ultra-Low Latency Solid State Drives

Exploring System Challenges of Ultra-Low Latency Solid State Drives Exploring System Challenges of Ultra-Low Latency Solid State Drives Sungjoon Koh Changrim Lee, Miryeong Kwon, and Myoungsoo Jung Computer Architecture and Memory systems Lab Executive Summary Motivation.

More information

Review. Preview. Three Level Scheduler. Scheduler. Process behavior. Effective CPU Scheduler is essential. Process Scheduling

Review. Preview. Three Level Scheduler. Scheduler. Process behavior. Effective CPU Scheduler is essential. Process Scheduling Review Preview Mutual Exclusion Solutions with Busy Waiting Test and Set Lock Priority Inversion problem with busy waiting Mutual Exclusion with Sleep and Wakeup The Producer-Consumer Problem Race Condition

More information

Linux Kernel Architecture

Linux Kernel Architecture Professional Linux Kernel Architecture Wolf gang Mauerer WILEY Wiley Publishing, Inc. Introduction xxvii Chapter 1: Introduction and Overview 1 Tasks of the Kernel v -- 2 Implementation Strategies 3 Elements

More information

A Transport Friendly NIC for Multicore/Multiprocessor Systems

A Transport Friendly NIC for Multicore/Multiprocessor Systems ATransport FriendlyNICforMulticore/MultiprocessorSystems WenjiWu,MattCrawford,PhilDeMar Fermilab,P.O.Box500,Batavia,IL60510 Abstract Receive side scaling (RSS) is a network interface card (NIC) technology.
