Low Latency Data Grids in Finance

Low Latency Data Grids in Finance
Jags Ramnarayan, Chief Architect, GemStone Systems
jags.ramnarayan@gemstone.com
Copyright 2006, GemStone Systems Inc. All Rights Reserved.

Background on GemStone Systems
- Known for its object database technology since 1982
- Now specializes in memory-oriented distributed data management
- Over 200 installed customers in the Global 2000
- Grid focus driven by demand for very high performance with predictable throughput, latency and availability:
  - Capital markets
  - Large e-commerce portals (real-time fraud)
  - Federal intelligence

Use of Grid computing in finance
- Two primary areas in tier 1 investment banks:
  - Risk analytics
  - Pricing

State of affairs: Risk Analytics
- Deluge of data (market data, trade data, etc.)
- Overnight batch job doesn't cut it
  - Want intra-day risk metrics
  - In some cases, real-time risk
- Explosion in simulation scenarios
  - More accurate risk exposure
  - Compliance
- Increasing number of smaller calculations

State of affairs: Pricing (derivatives)
- Too many products
- Increasing complexity in products
- Too many underliers
- Many relationships
- Hunger for latency reduction
  - Calculating the new price with the lowest possible latency
  - Pushing the prices to distributed applications

Where is the problem?
[Diagram labels: Grid scheduler, Compute farm, Data warehouses, Relational databases, File system]
- Database/file access contention
  - Too many concurrent connections
  - Large database server bottlenecks on the network
  - Query results are large, causing CPU bottlenecks
  - Even a parallel file system is throttled by disk speeds
- Too much data transfer
  - Between tasks and jobs
  - Between the Grid and file systems or databases
- Data consistency issues
- A CPU-bound job turns into an I/O-bound job

Data Fabric for Risk Analytics
- When data is stored, it is transparently replicated and/or partitioned
- Redundant storage, in memory and/or on disk, ensures continuous availability
- Keep reference data replicated on many nodes; partition trade data
- Pool memory (and disk) across the cluster; parallelize data access and computation to achieve very high aggregate throughput
- Machine nodes can be added dynamically to expand storage capacity or to handle increased client load
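The replicate-versus-partition split on this slide can be illustrated with a small sketch. It is a toy, in-process model and not the GemFire API: the class name DataFabricSketch, its method names, and the hash-based placement rule are all assumptions made for illustration. Reference data is copied to every node, while each trade lands on one primary node plus a configurable number of redundant copies.

```python
import hashlib


class DataFabricSketch:
    """Toy model of a cluster (hypothetical, not GemFire): each node is a dict."""

    def __init__(self, node_names, redundant_copies=1):
        self.nodes = {name: {} for name in node_names}
        self.names = sorted(node_names)
        self.redundant_copies = redundant_copies

    def _owners(self, key):
        # Stable hash, so the same key always maps to the same primary node.
        digest = hashlib.sha256(str(key).encode("utf-8")).digest()
        start = int.from_bytes(digest[:8], "big") % len(self.names)
        count = min(1 + self.redundant_copies, len(self.names))
        return [self.names[(start + i) % len(self.names)] for i in range(count)]

    def put_replicated(self, key, value):
        # Reference data: every node keeps a full copy for local reads.
        for store in self.nodes.values():
            store[("ref", key)] = value

    def get_reference(self, node_name, key):
        # Any node can answer a reference-data read from its own memory.
        return self.nodes[node_name][("ref", key)]

    def put_partitioned(self, key, value):
        # Trade data: stored only on the primary and its redundant copies.
        for name in self._owners(key):
            self.nodes[name][("trade", key)] = value

    def get_partitioned(self, key):
        # Try the primary first, then fall back to the redundant copies.
        for name in self._owners(key):
            store = self.nodes[name]
            if ("trade", key) in store:
                return store[("trade", key)]
        return None


fabric = DataFabricSketch(["n1", "n2", "n3", "n4"], redundant_copies=1)
fabric.put_replicated("USD/EUR", 0.79)          # visible on all four nodes
fabric.put_partitioned("T-1001", {"notional": 5_000_000})  # held by two nodes
```

Adding a node in this sketch would change the placement of existing keys; a real fabric would rebalance partitions, which is left out here.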

Data Fabric for Risk Analytics
- TaskFlow: as results are generated, push events to compute nodes to initiate subsequent computation
  - Avoid bulk data transfer across tasks or jobs
- Thousands of compute nodes can maintain a local cache of the most frequently used data
  - Optionally use local disk for overflow
  - Move reference data to the local cache
- Synchronous read-through and write-through, or asynchronous write-behind, to other data sources and sinks
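The three caching modes named on this slide can be sketched as follows. This is a minimal, single-process illustration under stated assumptions: CachingRegion and its methods are hypothetical names, a plain dict stands in for the external data source, and the write-behind queue is drained by an explicit flush() call rather than by a background thread.

```python
from collections import deque


class CachingRegion:
    """Hypothetical local cache in front of a slower backing store."""

    def __init__(self, backing_store, write_behind=False):
        self.backing_store = backing_store  # dict standing in for a database
        self.cache = {}
        self.write_behind = write_behind
        self.pending = deque()              # queued writes (write-behind only)

    def get(self, key):
        # Read-through: on a miss, load synchronously from the backing store.
        if key not in self.cache:
            if key not in self.backing_store:
                raise KeyError(key)
            self.cache[key] = self.backing_store[key]
        return self.cache[key]

    def put(self, key, value):
        self.cache[key] = value
        if self.write_behind:
            # Write-behind: acknowledge now, persist later.
            self.pending.append((key, value))
        else:
            # Write-through: persist before returning to the caller.
            self.backing_store[key] = value

    def flush(self):
        # Drain queued writes in arrival order; returns how many were applied.
        applied = 0
        while self.pending:
            key, value = self.pending.popleft()
            self.backing_store[key] = value
            applied += 1
        return applied


database = {"IBM": 100.0}
region = CachingRegion(database, write_behind=True)
region.get("IBM")          # first read loads from the database
region.put("IBM", 101.5)   # database still holds the old price
region.flush()             # database now holds the new price
```

Write-behind trades durability for latency: anything still in the queue is lost if the node fails before the flush, which is why the slide pairs it with redundant storage.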

Move business logic to data
- Principle: move the task to the computational resource holding most of the relevant data before considering other nodes where data transfer becomes necessary
- Parallel function execution service ("Map Reduce")
- Data dependency hints
  - Routing key, collection of keys, where clause(s)
- Serial or parallel execution
[Diagram: Submit (f1) -> AggregateHighValueTrades(<input data>, where trades.month = 'Sept'). Functions f1, f2, ..., fn wait in a FIFO queue and execute on the data fabric resources; f1 runs where the Sept trades are held.]
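The routing idea on this slide can be sketched as follows. The code is an illustration only, not the GemFire function execution service: the helper names, the choice of the trade month as the routing key, and the CRC-based placement are assumptions. With a routing-key hint the function visits only the node that owns the key; without a hint it runs on every node in parallel and the partial results are reduced.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def owner_index(routing_key, node_count):
    # Deterministic placement: the same key always maps to the same node.
    return zlib.crc32(routing_key.encode("utf-8")) % node_count


def load_trades(trades, node_count):
    # Partition trades by month so that one month lives on one node.
    nodes = [[] for _ in range(node_count)]
    for trade in trades:
        nodes[owner_index(trade["month"], node_count)].append(trade)
    return nodes


def high_value_total(local_trades, month, threshold):
    # The "map" step: runs next to the data and returns only a small result.
    return sum(
        t["notional"]
        for t in local_trades
        if t["month"] == month and t["notional"] >= threshold
    )


def execute_on_owner(nodes, month, threshold):
    # Routing-key hint given: only the node that owns the month is visited.
    local = nodes[owner_index(month, len(nodes))]
    return high_value_total(local, month, threshold)


def execute_everywhere(nodes, month, threshold):
    # No hint: run on all nodes in parallel, then reduce the partial sums.
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = pool.map(
            lambda local: high_value_total(local, month, threshold), nodes
        )
        return sum(partials)


trades = [
    {"id": 1, "month": "Sept", "notional": 5_000_000},
    {"id": 2, "month": "Sept", "notional": 200_000},
    {"id": 3, "month": "Oct", "notional": 9_000_000},
    {"id": 4, "month": "Sept", "notional": 2_000_000},
]
nodes = load_trades(trades, node_count=3)
total = execute_on_owner(nodes, "Sept", threshold=1_000_000)
```

In both paths only the aggregate crosses the (simulated) network, never the trades themselves, which is the point of moving the task to the data.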

Key lessons
- Apps should think about capitalizing on memory across the Grid (it is abundant)
- Keep I/O cycles to a minimum through main-memory caching of operational data sets
- Scavenge Grid memory and avoid data source access
- Achieve linear scaling for your Grid apps by horizontally partitioning your data and behavior
- Read Pat Helland's "Life beyond Distributed Transactions" (http://www-db.cs.wisc.edu/cidr/cidr2007/papers/cidr07p15.pdf)
- Get more info on the GemFire data fabric: http://www.gemstone.com/gemfire