Mark Falco Oracle Coherence Development


Achieving the performance benefits of InfiniBand in Java
Mark Falco, Oracle Coherence Development
Copyright 2011, Oracle and/or its affiliates. All rights reserved.

The following is intended to outline general product use and direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.

Exalogic / Exabus

Exalogic - Hardware
- 24 cores
- 96 GB RAM
- 30 compute nodes in a full rack
- QDR InfiniBand

InfiniBand
- High throughput (~32 Gb/s in QDR)
- Low latency (~1 µs)
- Super jumbo frames (MTU 64 KB)
- Supports standard IP stack (UDP/TCP)
- Verbs-based API
- Remote Direct Memory Access (RDMA)
  - pre-registered memory accessible to remote machines
  - operates without involving host CPU
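To put the throughput and latency figures above in proportion, here is a rough back-of-the-envelope sketch. It assumes a deliberately simple model (one-way time = fixed latency + payload bits / bandwidth) and treats the slide's approximate numbers as exact; the class and method names are illustrative only.

```java
public class WireTime {
    // Approximate figures quoted on the slide, treated here as round numbers.
    static final double BANDWIDTH_BITS_PER_SEC = 32e9; // ~32 Gb/s (QDR)
    static final double LATENCY_MICROS = 1.0;          // ~1 microsecond

    /** Estimated one-way time in microseconds under the simple model above. */
    public static double transferMicros(long payloadBytes) {
        double serializationMicros = payloadBytes * 8 / BANDWIDTH_BITS_PER_SEC * 1e6;
        return LATENCY_MICROS + serializationMicros;
    }

    public static void main(String[] args) {
        // 1 KB message, one 64 KB frame, and a 1 MB object.
        for (long size : new long[] {1024, 64 * 1024, 1024 * 1024}) {
            System.out.printf("%8d bytes -> %.3f us%n", size, transferMicros(size));
        }
    }
}
```

Under this model a 1 KB payload costs about 1.256 µs and is dominated by the fixed latency, while a full 64 KB frame costs about 17.384 µs and is dominated by bandwidth. Real transfers add software overhead that the model ignores.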

Exabus - Exalogic I/O and Network Design
Eliminates cloud, cluster and network virtualization I/O bottlenecks
[Diagram of the Exalogic X2-2 network. Labels: Ethernet Gateway Switches; IB Spine Switch; Data Center Service Network (10GbE); Standard Oracle Database; Data Center Mgmt Network (GbE); Management Switch (GbE); Exabus (InfiniBand I/O Backplane); Compute Nodes; Storage; Exadata; Exalogic; SPARC SuperCluster; Management Network (GbE); ZFS Storage.]

Exabus - Optimizations
- Direct Memory I/O for Java
- New Java APIs and Exalogic Elastic Cloud Software
  - Low-latency Java support for InfiniBand
  - Optimized implementation for Exalogic InfiniBand
- Surfacing low-level advanced networking capabilities

InfiniBand - Sockets Direct Protocol (SDP)
- Streaming sockets API, i.e. SOCK_STREAM
- Easily integrated into TCP-based applications
- Zero-copy or kernel-bypass
- Java availability
  - Proprietary in JDK 6
  - Standard in JDK 7
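The point that SDP integrates easily into TCP-based applications can be illustrated with ordinary socket code. The sketch below is a plain `java.net` echo round trip over loopback; nothing in it is SDP-specific. In JDK 7, SDP is selected outside the code, through a configuration file named by the `com.sun.sdp.conf` system property. The addresses in the comment are hypothetical, and the loopback run shown here uses regular TCP.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical sdp.conf for JDK 7 (addresses are examples only):
//   bind 192.168.1.1 *
//   connect 192.168.1.0/24 1024-*
// Launch: java -Dcom.sun.sdp.conf=sdp.conf -Djava.net.preferIPv4Stack=true SdpEcho
public class SdpEcho {
    /** Starts a one-shot echo server on an ephemeral port and returns the echoed line. */
    public static String roundTrip(String msg) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread echo = new Thread(() -> {
                try (Socket s = server.accept();
                     BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream(), "UTF-8"));
                     PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
                    out.println(in.readLine()); // echo one line back
                } catch (IOException e) {
                    throw new IllegalStateException(e);
                }
            });
            echo.start();
            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
                 PrintWriter out = new PrintWriter(client.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(new InputStreamReader(client.getInputStream(), "UTF-8"))) {
                out.println(msg);
                String reply = in.readLine();
                echo.join();
                return reply;
            }
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("ping"));
    }
}
```

Because the transport is chosen by configuration, the same class can run over TCP on a development machine and over SDP on hosts whose addresses match the rules in the file.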

InfiniBand - Coherence Integration
- Initially attempted over standard UDP
- Experimented with TCP/SDP
- Required many co-located nodes to utilize bandwidth
  - Dozens in order to max out HCA
- Latencies
  - Large objects: benefit from InfiniBand without protocol change
  - Small objects: on par with standard Ethernet (300-600 µs)

MessageBus
- Binary low-level message transport
- Multi-point addressing
- Reliable ordered delivery
- Asynchronous event-based programming model
- Pluggable provider-based framework
  - SocketBus (TCP/SDP)
  - Native RDMA Exabus

Exabus - Next-generation of Exalogic performance optimization
[Diagram of the software stack. Labels: Coherence; WebLogic; IB Transport APIs; SDP; Tuxedo; Any Linux or Solaris App.; Native RDMA; EoIB; TCP/IP; IPoIB; InfiniBand Core Hardware and Firmware. Legend: New for Exalogic V1.1; Exalogic V1.0.]

MessageBus - API

    public interface MessageBus {
        void setEventCollector(Collector<Event> collector);
        void open();
        void close();
        void connect(EndPoint peer);
        void disconnect(EndPoint peer);
        void release(EndPoint peer);
        void flush();
        void send(EndPoint peer, BufferSequence buf, Object receipt);
    }

MessageBus - Events

Event              Indicates
OPEN               Start of bus event stream
CLOSE              End of bus event stream
CONNECT            Start of per-connection event stream
DISCONNECT         End of confirmed delivery per-connection event stream
RELEASE            End of per-connection event stream
MESSAGE            Local message delivery
RECEIPT            Message delivery confirmation
BACKLOG_EXCESSIVE  Start of backlog condition
BACKLOG_NORMAL     End of backlog condition
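The asynchronous, event-driven model behind this API and event list can be sketched with a toy in-process bus. Everything below is a hypothetical stand-in, not Oracle's implementation: `ToyBus`, its string addresses, and the use of `java.util.function.Consumer` as the collector are assumptions made for illustration. It also assumes that sends are queued until `flush()` and that only the sender sees connection events, which the slides do not specify.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class ToyBusDemo {
    enum Type { OPEN, CLOSE, CONNECT, DISCONNECT, RELEASE, MESSAGE, RECEIPT }

    static final class Event {
        final Type type; final String peer; final Object content;
        Event(Type type, String peer, Object content) {
            this.type = type; this.peer = peer; this.content = content;
        }
    }

    /** Hypothetical in-process bus: all peers share one registry instead of a network. */
    static final class ToyBus {
        private static final Map<String, ToyBus> REGISTRY = new HashMap<>();
        private final String address;
        private final Deque<Runnable> pending = new ArrayDeque<>();
        private Consumer<Event> collector;

        ToyBus(String address) { this.address = address; }
        void setEventCollector(Consumer<Event> c) { collector = c; }
        void open() { REGISTRY.put(address, this); emit(Type.OPEN, null, null); }
        void close() { REGISTRY.remove(address); emit(Type.CLOSE, null, null); }
        void connect(String peer) { emit(Type.CONNECT, peer, null); }
        void disconnect(String peer) { emit(Type.DISCONNECT, peer, null); }
        void release(String peer) { emit(Type.RELEASE, peer, null); }

        /** Queues the message; the receipt object comes back in a RECEIPT event. */
        void send(String peer, byte[] msg, Object receipt) {
            pending.add(() -> {
                REGISTRY.get(peer).emit(Type.MESSAGE, address, msg); // receiver side
                emit(Type.RECEIPT, peer, receipt);                   // sender side
            });
        }
        void flush() { Runnable r; while ((r = pending.poll()) != null) r.run(); }

        private void emit(Type t, String peer, Object content) {
            collector.accept(new Event(t, peer, content));
        }
    }

    /** Runs one send from bus "a" to bus "b" and returns the observed event order. */
    public static List<String> run() {
        List<String> log = new ArrayList<>();
        ToyBus a = new ToyBus("a"), b = new ToyBus("b");
        a.setEventCollector(e -> log.add("a:" + e.type));
        b.setEventCollector(e -> log.add("b:" + e.type));
        a.open(); b.open();
        a.connect("b");
        a.send("b", "hello".getBytes(StandardCharsets.UTF_8), "receipt-1");
        a.flush();
        a.disconnect("b"); a.release("b");
        a.close(); b.close();
        return log;
    }

    public static void main(String[] args) { System.out.println(run()); }
}
```

The caller never blocks waiting for a reply: delivery shows up as a MESSAGE event at the receiver's collector and confirmation as a RECEIPT event at the sender's, which is the shape the event table describes.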

MessageBus - Native RDMA
- Zero-copy and kernel-bypass
- Optimized for sender latency
- Predictive notifications avoid costly interrupts
- Asynchronous task-based system manages protocol
- Custom DirectByteBuffer
  - allows for zero-copy
  - reduces GC pressure
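The slides do not describe the custom DirectByteBuffer, so the sketch below uses the standard `ByteBuffer.allocateDirect` as a stand-in to show the general idea: allocate one off-heap slab up front, carve it into fixed-size slices, and recycle the slices instead of creating garbage per message. The `DirectBufferPool` class and its sizes are illustrative assumptions.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

public class DirectBufferPool {
    private final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();

    /** Allocates one off-heap slab and slices it into count buffers of size bytes each. */
    public DirectBufferPool(int count, int size) {
        ByteBuffer slab = ByteBuffer.allocateDirect(count * size);
        for (int i = 0; i < count; i++) {
            slab.limit((i + 1) * size);
            slab.position(i * size);
            free.push(slab.slice()); // a view into the slab, no copy
        }
    }

    /** Hands out a recycled slice, reset to its full capacity. */
    public ByteBuffer acquire() {
        ByteBuffer b = free.poll();
        if (b == null) throw new IllegalStateException("pool exhausted");
        b.clear();
        return b;
    }

    /** Returns a slice to the pool for reuse. */
    public void release(ByteBuffer b) {
        free.push(b);
    }

    public static void main(String[] args) {
        DirectBufferPool pool = new DirectBufferPool(4, 64 * 1024);
        ByteBuffer buf = pool.acquire();
        buf.putInt(42);
        buf.flip();
        System.out.println("direct=" + buf.isDirect() + " remaining=" + buf.remaining());
        pool.release(buf);
    }
}
```

A fixed slab also fits the RDMA requirement for pre-registered memory, since the region can be registered once rather than per message. Registration itself needs native code and is outside this sketch.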

Message Transfer - Native RDMA
[Diagram of a message exchange between Sender and Receiver. Labels: RDMA Write Header; Ring Buffer Allocation; Message; RDMA Read Body; Message Ring Buffer; RDMA Write Receipt; Delivery to the Collector on each side.]

MessageBus - Coherence Integration
- Pluggable message transport per service
  - Legacy system utilized a single transport for entire JVM
- Increased parallel processing
  - Network I/O
  - Message deserialization
- Message delivery
  - Java context switches: 1 vs. 3
  - Potential for zero context switches

MessageBus - Coherence Integration
[Diagram: Member 1, Member 2 and Member 3 each run the same three services over Exabus RDMA:
- PartitionedCache Service (Cache: A, B, C)
- PartitionedCache Service (Cache: D, E, F)
- InvocationService
Endpoint addresses shown in the diagram:
tmb://192.168.1.1:8000.1  tmb://192.168.1.2:8000.2  tmb://192.168.1.2:8000.3
tmb://192.168.1.1:8001.1  tmb://192.168.1.2:8001.2  tmb://192.168.1.2:8001.3
tmb://192.168.1.1:8002.1  tmb://192.168.1.2:8002.2  tmb://192.168.1.2:8002.3]

MessageBus - Coherence Integration
- The network is no longer the bottleneck
- Measured improvements
  - small number of nodes can max out HCA
  - latencies reduced to ~100 µs with the RDMA bus, ~200 µs with SocketBus
- Future direction
  - more MessageBuses per service
  - prototyped solution drops latency down to 70 µs
  - designs to drop latency to 40 µs

Q&A