Coherence An Introduction. Shaun Smith Principal Product Manager

Similar documents
<Insert Picture Here>

XTP, Scalability and Data Grids An Introduction to Coherence

<Insert Picture Here> Oracle Application Cache Solution: Coherence

<Insert Picture Here>

TopLink Grid: Scaling JPA applications with Coherence

Coherence & WebLogic Server integration with Coherence (Active Cache)

Pimp My Data Grid. Brian Oliver Senior Principal Solutions Architect <Insert Picture Here>

Advanced HTTP session management with Oracle Coherence

<Insert Picture Here> Getting Coherence: Introduction to Data Grids Jfokus Conference, 28 January 2009

<Insert Picture Here> Oracle Coherence & Extreme Transaction Processing (XTP)

1 Markus Eisele, Insurance - Strategic IT-Architecture

Datacenter replication solution with quasardb

<Insert Picture Here> Oracle Cloud Strategy: Oracle s Vision for Next- Generation Application Grid, Virtualization and Cloud

Caching patterns and extending mobile applications with elastic caching (With Demonstration)

Craig Blitz Oracle Coherence Product Management

Cloud Programming on Java EE Platforms. mgr inż. Piotr Nowak

Gemeinsam mehr erreichen.

<Insert Picture Here> MySQL Web Reference Architectures Building Massively Scalable Web Infrastructure

VERITAS Volume Replicator. Successful Replication and Disaster Recovery

<Insert Picture Here> Oracle NoSQL Database A Distributed Key-Value Store

<Insert Picture Here>

Inside GigaSpaces XAP Technical Overview and Value Proposition

DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN. Chapter 1. Introduction

Scott Meder Senior Regional Sales Manager

<Insert Picture Here> A Brief Introduction to Live Object Pattern

Overview SENTINET 3.1

CPS 512 midterm exam #1, 10/7/2016

Pragmatic Clustering. Mike Cannon-Brookes CEO, Atlassian Software Systems

Data Management in Application Servers. Dean Jacobs BEA Systems

About Terracotta Ehcache. Version 10.1

Maximum Availability Architecture: Overview. An Oracle White Paper July 2002

A memcached implementation in Java. Bela Ban JBoss 2340

ECLIPSE PERSISTENCE PLATFORM (ECLIPSELINK) FAQ

Building a Scalable Architecture for Web Apps - Part I (Lessons Directi)

Introducing EclipseLink: The Eclipse Persistence Services Project

Postgres Plus and JBoss

TECHED USER CONFERENCE MAY 3-4, 2016

WebLogic & Oracle RAC Active GridLink for RAC


Distributed Caching: Essential Lessons. Philadelphia Java User Group April 17, Cameron Purdy President Tangosol, Inc.

Aurora, RDS, or On-Prem, Which is right for you

Scaling Out Tier Based Applications

Hive Metadata Caching Proposal

Creating Ultra-fast Realtime Apps and Microservices with Java. Markus Kett, CEO Jetstream Technologies

High Availability/ Clustering with Zend Platform

Transactum Business Process Manager with High-Performance Elastic Scaling. November 2011 Ivan Klianev

Developing Applications with Java EE 6 on WebLogic Server 12c

<Insert Picture Here> Value of TimesTen Oracle TimesTen Product Overview

VERITAS Volume Replicator Successful Replication and Disaster Recovery

The Role of Database Aware Flash Technologies in Accelerating Mission- Critical Databases

White Paper. Major Performance Tuning Considerations for Weblogic Server

Data Modeling and Databases Ch 14: Data Replication. Gustavo Alonso, Ce Zhang Systems Group Department of Computer Science ETH Zürich

Veritas Storage Foundation for Windows by Symantec

Overview p. 1 Server-side Component Architectures p. 3 The Need for a Server-Side Component Architecture p. 4 Server-Side Component Architecture

Database Server. 2. Allow client request to the database server (using SQL requests) over the network.

ORACLE DATA SHEET KEY FEATURES AND BENEFITS ORACLE WEBLOGIC SUITE

MySQL High Availability. Michael Messina Senior Managing Consultant, Rolta-AdvizeX /

~ Ian Hunneybell: CBSD Revision Notes (07/06/2006) ~

Achieving Horizontal Scalability. Alain Houf Sales Engineer

What every DBA needs to know about JDBC connection pools Bridging the language barrier between DBA and Middleware Administrators

Oracle NoSQL Database Overview Marie-Anne Neimat, VP Development

Principal Solutions Architect. Architecting in the Cloud

Polyglot Persistence. EclipseLink JPA for NoSQL, Relational, and Beyond. Shaun Smith Gunnar Wagenknecht

DQpowersuite. Superior Architecture. A Complete Data Integration Package

Advanced Topics in Operating Systems

HYBRID TRANSACTION/ANALYTICAL PROCESSING COLIN MACNAUGHTON

It also performs many parallelization operations like, data loading and query processing.

Announcements. me your survey: See the Announcements page. Today. Reading. Take a break around 10:15am. Ack: Some figures are from Coulouris

Chapter 20: Database System Architectures

Bipul Sinha, Amit Ganesh, Lilian Hobbs, Oracle Corp. Dingbo Zhou, Basavaraj Hubli, Manohar Malayanur, Fannie Mae

Java Enterprise Edition

{brandname} 9.4 Glossary. The {brandname} community

Lecture 23 Database System Architectures

Chapter 18: Database System Architectures.! Centralized Systems! Client--Server Systems! Parallel Systems! Distributed Systems!

The Google File System (GFS)

02 - Distributed Systems

MapReduce. U of Toronto, 2014

Map-Reduce. Marco Mura 2010 March, 31th

BigData and Map Reduce VITMAC03

02 - Distributed Systems

extreme Scale vs. TayzGrid

Synergetics-Standard-SQL Server 2012-DBA-7 day Contents

TIBCO BusinessEvents Extreme. System Sizing Guide. Software Release Published May 27, 2012

<Insert Picture Here> QCon: London 2009 Data Grid Design Patterns

IBM Spectrum NAS. Easy-to-manage software-defined file storage for the enterprise. Overview. Highlights

Distributed systems. Lecture 6: distributed transactions, elections, consensus and replication. Malte Schwarzkopf

Fast Track to EJB 3.0 and the JPA Using JBoss

A Guide to Architecting the Active/Active Data Center

Distributed Computation Models

MSMQ-MQSeries Bridge Configuration Guide White Paper

SQL Azure. Abhay Parekh Microsoft Corporation

ZooKeeper & Curator. CS 475, Spring 2018 Concurrent & Distributed Systems

Intuitive distributed algorithms. with F#

A Distributed System Case Study: Apache Kafka. High throughput messaging for diverse consumers

Spring & Hibernate. Knowledge of database. And basic Knowledge of web application development. Module 1: Spring Basics

How we build TiDB. Max Liu PingCAP Amsterdam, Netherlands October 5, 2016

Exploiting peer group concept for adaptive and highly available services

Developing Microsoft Azure Solutions: Course Agenda

ebay Marketplace Architecture

Rediffmail Enterprise High Availability Architecture

Transcription:

Coherence An Introduction Shaun Smith Principal Product Manager

About Me Product Manager for Oracle TopLink Involved with object-relational and object-xml mapping technology for over 10 years. Co-Lead of Eclipse Dali JPA Tools Project Ecosystem Development Lead on proposed Eclipse Java Persistence Platform Project (EclipseLink)

Agenda Motivation for Coherence What is Coherence? and what Coherence isn t! How does Coherence work? the underlying design objectives at the network layer how data is managed Where to use Coherence? features architectural integration patterns Why Coherence? Questions

Motivation. Scalability General Rules for Scalability: #1: Make Applications Stateless (statefulness is hard) #2: Use Master/Worker Pattern (coordinate tasks) #3: Use Messaging (notify of state changes)

Motivation. General Observations: #1: Where does the data to process come from? #2: Imposes take a copy of the data with you #3: Imposes process it all in one place #4: Forced to introduce caching to keep data close #5: Introduces single points-of-failure & bottleneck #6: Applications may scale-out (inexpensive), but Data Sources have to scale-up (expensive) #7: Manual partitioning Data / Services increases complexity

Motivation. Coherence is designed to Resolve these challenges Extend the life of Data Sources & Architectures As a foundation for extreme processing systems Enterprise Applications Real Time Clients Web Services Application Tier Data Services Coherence Data Grid Data Sources Databases Mainframes Web Services

What is Coherence? and what it isn t! <Insert Picture Here>

What is Coherence? Cluster-based Data Management Solution for Applications

What is Coherence? Or Distributed Memory Data Management Solution (aka: Data Grid)

What is Coherence? It s Clustering! Coherence Clustering is special It means Collection of processes (members) that work together All members have equal responsibility* for the health of the cluster No static/tight coupling of responsibilities to hardware No masters and workers/slaves No centralized registries of data or services No single points of failure No single points of bottleneck *members may be asked to perform individual tasks eg: Grid Agents

What is Coherence? Data Data in Coherence Any serializable* Object No proprietary classes to extend** Fully native Java &.NET interoperability (soon C++) No byte-code instruction manipulation or weaving Not forced to use Relational Models, Object- Relational-Mapping, SQL etc Just real POJOs and PONOs (soon POCOs) Domain objects are fine! *serialization = writing to binary form ** implementing specialized interfaces improves performance!

What is Coherence? Data Data in Coherence Different topologies for your Data Simple API for all Data, regardless of Topology Things you can do Distributed Objects, Maps & Caching Real-Time Events, Listeners, Parallel Queries & Indexing Data Processing and Service Agents (Grid features), Continuous Views, Aggregation, Persistence, Sessions

What is Coherence? Management Solution Management Solution Responsible for Clustering, Data and Service management, including partitioning Ideally engineers should not have to design, specify and code how partitioning occurs in a solution handle Remote Exceptions manage the Cluster, either manually or in code shutdown the system to add new resources or repartition use consoles to recover or scale a system. These are impediments to scaling cost effectively As #members, management cost should be C Ideally log(#members) Clustering technology should invisible in your solution!

What is Coherence? For Applications It s for Applications! Coherence doesn t require a container / server A single library* No external / open source dependencies! Won t cause JAR / classloader hell (cf: DLL hell) Can be embedded or run standalone Runs where Java SE / EE,.NET runs Won t impose architectural patterns** * May require addition libraries depending on features required. EG: spring, web, hibernate, jta integration are bundled as separate libraries. ** Though we do make some suggestions

Clustered Hello World public static void main(string[] args) throws IOException { NamedCache nc = CacheFactory.getCache( test ); nc.put( message, Hello World ); System.out.println(nc.get( message )); } System.in.read(); //may throw exception Joins / Establishes a cluster Places an Entry (key, value) into the Cache test (notice no configuration) Retrieves the Entry from the Cache. Displays it. read at the end to keep the application (and Cluster) from terminating.

Clustered Hello World public static void main(string[] args) throws IOException { NamedCache nc = CacheFactory.getCache( test ); System.out.println(nc.get( message )); } Joins / Establishes a cluster Retrieves the Entry from the Cache. Displays it Start as many applications as you like they all cluster the are able to share the values in the cache

What Coherence isn t! It s not an in-memory-database Though it s often used for transactional state and as a transient system-of-record Used for extreme Transaction Processing You can however: do queries in a parallel but not restricted to relationalstyle (it s not a RDBMS) use SQL-like queries perform indexing (like a DB) do things like Stored Procedures establish real-time views (like Materialized Views).

What Coherence isn t! It s not a messaging system You can use Events and Listeners for data inserts, updates, deletes You can use Agents to handle data changes You can use Filters to filter events It s not just a Cache! Caches expire data! Customers actually turn expiry off! Why do they take that risk? Why do we let them? Data management is based on reliable clustering technology. It s stable, reliable and highly available.

What is Coherence? Dependable resilient scalable cluster technology that enables developers to effortlessly cluster stateful applications so they can dynamically and reliably share data, provide services and respond to events to scale-out solutions to meet business demand.

How does Coherence work? <Insert Picture Here>

Clustering is about Consensus! Coherence Clustering is very different! Goal: Maintain Cluster Membership Consensus all times Do it as fast as physically possible Do it without a single point of failure or registry of members Ensure all members have the same responsibility and work together maintain consensus Ensure that no voting occurs to determine membership

Clustering is about Consensus! Why: If all members are always known We can partition / load balance Data & Services We don t need to hold TCP/IP connections open (resource intensive) Any member can talk directly with any other member (peer-to-peer) The cluster can dynamically (while running) scale to any size

Partitioned Topology : Data Access Data Access Topologies Coherence provides many Topologies for Data Management Local, Near, Replicated, Overview, Disk, Off-Heap, Extend (WAN), Extend (Clients) Partitioned Topology Data spread and backed up across Members Transparent to developer Members have access to all Data All Data locations are known no lookup & no registry!

Partitioned Topology : Data Update Partitioned Topology Synchronous Update Avoids potential Data Loss & Corruption Predictable Performance Backup Partitions are partitioned away from Primaries for resilience No engineering requirement to setup Primaries or Backups Automatically and Dynamically Managed

Partitioned Topology : Recovery Partitioned Topology Membership changes (new members added or members leaving) Other members, in parallel, recover / repartition Noin-flight operations lost Some latencies (due to higher priority of recovery) Reliability, Availability, Scalability, Performance are the priority Degrade performance of some requests

Partitioned Topology Deterministic latencies for data access and update Linearly scalable by design Including Capacity All operations point-to-point (without multi-cast) No TCP/IP connections to create / maintain No loss of in-flight operations while repartitioning No requirement to shutdown cluster to recover from member failure add new members add named caches No network exceptions to catch during repartitioning Dynamic repartitioning means scale-out on demand

Partitioned Topology : Local Storage Partitioned Topology Some members are temporary in a cluster They should not cause repartitioning Repartitioning means work for the other members (and network traffic) So turn off storage!

Topology Composition : Near Topology Coherence allows Topologies to the Composed Base Topologies Local, Replicated, Partitioned / Distributed, Extend Composition Topologies Near, Overflow Near Topology Compose a Front and a Back Topology Permit L1 and L2 caching Both Front and Back may be completely different Both may have different Expiry Policies Expiry Policies LRU, LFU, Hybrid, Seppuku, Custom

Topology Composition : Near Topology

Replicated Topology : Data Access

Replicated Topology : Data Update

Using Coherence <Insert Picture Here>

Features : Traditional Implements Map interface Drop in replacement. Full concurrency control. Multithreaded. Scalable and resilient! get, put, putall, size, clear, lock, unlock Implements JCache interface Extensive support for a multitude of expiration policies, including none! More than just a Cache. More than just a Map

Features : Observable Interface Real-time filterable (bean) events for entry insert, update, delete Filters applied in parallel (in the Grid) Filters completely extensible A large range of filters out-of-the-box: All, Always, And, Any, Array, Between, Class, Comparison, ContainsAll, ContainsAny, Contains, Equals, GreaterEquals, Greater, In, InKeySet, IsNotNull, IsNull, LessEquals, Less, Like, Limit, Never, NotEquals, Not, Or, Present, Xor Events may be synchronous trades.addmaplistener( new StockEventFilter( ORCL ), new MyMapListener( ));

Features : Observable Interface

Features : QueryMap Interface Find Keys or Values that satisfy a Filter. entryset( ), keyset( ) Define indexes (on-the-fly) to extract and index any part of an Entry Executed in Parallel Create Continuous View of entries based on a Filter with real-time events dispatch Perfect for client applications watching data

Features : QueryMap Interface

Features : InvocableMap Interface Execute processors against an Entry, a Collection or a Filter Executions occur in parallel (aka: Grid-style) No workers to manage! Processors may return any value trades.invokeall( new EqualsFilter( getsecurity, ORCL ), new StockSplit(2.0)); Aggregate Entries based on a Filter positions.aggregate( new EqualsFilter( getsecurity, ORCL ), new SumFilter( amount ));

Features : InvocableMap Interface

Architectural Integration Possibilities! Data Source Integration (customizable per cache) Read Through: Read from CacheLoader when data not in grid Write Through: Write to CacheStore when data inserted, updated, removed in grid Write Behind: Asynchronous and coalesced updates to CacheStore when data inserted, updated, deleted in grid Session Management Coherence Web = drop-in replacement to reliably cluster and scale out session management (Java and.net) across a grid

Architectural Integration Possibilities! Spring Integration: Data Grid For Spring: Data Grid Spring Beans managed by Coherence. Beans become shared singletons and offer both Data and Services to Spring Applications Spring Applications naturally become clustered, including cluster events to determine cluster membership changes Expose Coherence Caches as Beans in Spring Configuration

Architectural Integration Possibilities! Provide: Push / Pull data model based on subscription and event notification Client / Data Grid model where clients connect to Coherence for data and services

Why Coherence? <Insert Picture Here>

Why Coherence? Scale-out stateful applications If you need business agility! Save resources! Avoid managing clusters Avoid designing systems around specialized cluster masters Avoid manually coding in data and service partitions If you want predictable scale-out costs, without recoding or reconfiguring! If you want truly native language support! No wrappers or embedded third-party libraries.

Why Coherence? Reliable Scalable Universal Data Built for continuous operation Data Fault Tolerance Self-Diagnosis and Healing Once and Only Once Processing Dynamically Expandable No data loss at any volume No interruption of service Leverage Commodity Hardware Single view of data Single management view Simple programming model Any Application Any Data Source Data Caching Analytics Transaction Processing Event Processing Cost Effective