MarkLogic Technology Briefing

1 MarkLogic Technology Briefing Edd Patterson CTO/VP Systems Engineering, Americas Slide 1

2 Agenda Introductions About MarkLogic MarkLogic Server Deep Dive Slide 2

3 MarkLogic Overview Company Highlights Headquartered in Silicon Valley Founded in % CAGR revenue growth Privately held Patented, award-winning technology Slide 3

4 Select Customers Financial Services and Other Customers Government Customers Media Customers Slide 4

5 Inside MarkLogic Server Slide 5

6 MarkLogic Powers the World's Big Data Applications. Use all your data to make your organization smarter: analyze a wide variety of structured, unstructured and semi-structured data together to gain actionable insights; bring these insights into operational business processes via real-time Big Data Applications. A unified Big Data platform for both analytics and applications. Any data, any volume, any structure, real-time. E.g. derivatives contracts, customer records, social media, medical records, intelligence assets, journal articles, etc. Slide 6 Copyright MarkLogic Corporation. All rights reserved.

7 Elements of a Big Data Platform Tools / APIs Visualization Data Mining / Analytics Event Processing Metadata Search Analytic DB Operational DB Unstructured Content Ingest / Batch Analytics / Enrichment Archive / Warm Long Tail Data Store Slide 7

8 How is this usually implemented? Stitched together from multiple technologies: BI Tools; Applications; Stats (SPSS, SAS, R, ...); Stream / Event Processing; Search; Search Index; Analytic DB; Operational DB; Metadata; Unstructured Content Store; Batch Analytics (Hadoop MR); Archive (HDFS). Each line is an opportunity for latency, ETL bugs, etc. Each component is managed, scaled, supported, etc. separately. Applications interface with many technologies, sometimes managed by different groups. Bottom Line: Data governance is compromised. Impossible to react in real time. Agility is lost.

9 MarkLogic Unified Platform for Big Data. MarkLogic Server is: an operational DBMS; an analytic DBMS; an unstructured DBMS; a search engine; an event processor; all in one technology. Diagram labels: BI Tools; Applications; Stats (SPSS, SAS, R, ...); Analytic DB; Stream / Event Processing; Operational DB; Metadata; Search; Search Index; Unstructured Content Store; Batch Analytics (Hadoop MR); Archive (HDFS).

10 How MarkLogic Server Works Schema-Agnostic Design Slide 10

11 Data Model. MarkLogic Server is a document-centric database. Supports any-structured data via a hierarchical (XML) data model. Diagram labels: Document, Title, Author, First, Last, Metadata, Section; fpml, Trade, Trade ID, Product, Cashflow, Amount, TradeLeg; Event. Slide 11

12 MarkLogic is Schema Agnostic XML is self-describing

<article>
  <title>MarkLogic Server:...</title>
  <author>
    <first-name>Dale</first-name>
    <last-name>Kim</last-name>
  </author>
  <abstract>.... <company>Mark Logic</company> </abstract>
  <body>
    <section>
      <section>...</section>
    </section>
    <section>... index... </section>
  </body>
  <copyright>Copyright... </copyright>
</article>

Slide 12

13 MarkLogic is Schema Agnostic XML is self-describing

<article>
  <title>MarkLogic Server:....</title>
  <author>
    <first-name>Dale</first-name>
    <last-name>Kim</last-name>
  </author>
  <abstract>.... <company>MarkLogic</company> </abstract>
  <body>
    <section>
      <section>...</section>
    </section>
    <section>.... index.... </section>
  </body>
  <copyright>Copyright... </copyright>
</article>

Tree view of the same document: <article> contains <title> "MarkLogic Server:...", <author> (with <first-name> "Dale" and <last-name> "Kim"), <abstract> "..." (with <company> "MarkLogic"), <body> (with <section> nodes, one containing "... index ...") and <copyright> "...". No Schema Needed! Slide 13
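To illustrate what "self-describing" means in practice, the sketch below walks an XML document and discovers every element path from the document alone, with no schema supplied. It uses Python's standard xml.etree.ElementTree rather than MarkLogic itself; the sample document and the walk helper are hypothetical, modeled on the <article> example on the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, modeled on the slide's <article> example.
DOC = """<article>
  <title>MarkLogic Server</title>
  <author><first-name>Dale</first-name><last-name>Kim</last-name></author>
  <body><section>index</section></body>
</article>"""


def walk(elem, prefix=""):
    """Yield (path, text) for every element, discovered from the document itself."""
    path = f"{prefix}/{elem.tag}"
    text = (elem.text or "").strip()
    yield path, text
    for child in elem:
        yield from walk(child, path)


# No schema is consulted: the structure comes entirely from the markup.
paths = list(walk(ET.fromstring(DOC)))
for path, text in paths:
    print(path, repr(text))
```

A different document with different elements would simply yield different paths, which is the property a schema-agnostic index relies on.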

14 How MarkLogic Server Works Indexing and Query Slide 14

15 Universal Index. Diagram: each Term maps to a Term List of document IDs. Terms shown: data; base; "data base"; STEM be; <article>; <article>/<abstract>; <section>/<product>; <product>ims</product>; <title> contains "data"; Collection(Red); Role:Editor + Action:Read. Term lists shown: 123, 127, 129, 152, 344, ...; 125, 126, 129, 130, ...; 126, 130, 142, 143, ...; 130, 131, 135, 162, ...; 130, 167, 212, 219, ... Document References: 126, 130, 167, ... MarkLogic indexes: Words; Phrases; Stemming; Structure; Values; Collections; Security Permissions. Slide 15
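The term-list idea on this slide can be sketched as a small inverted index: each term (a word, a phrase, a collection) maps to a set of document IDs, and a query is answered by intersecting term lists. This is only an illustration of the concept, not MarkLogic's implementation; the documents, IDs, and the search helper are invented for the example.

```python
from collections import defaultdict

# Hypothetical documents keyed by document ID; IDs and text are invented.
docs = {
    126: {"title": "data base systems", "collection": "Red"},
    130: {"title": "data base design", "collection": "Red"},
    142: {"title": "base camp", "collection": "Blue"},
    167: {"title": "big data", "collection": "Red"},
}

# term -> set of document IDs (a "term list")
index = defaultdict(set)
for doc_id, doc in docs.items():
    words = doc["title"].split()
    for w in words:
        index[("word", w)].add(doc_id)
    # Two-word phrases are indexed as terms in their own right.
    for a, b in zip(words, words[1:]):
        index[("phrase", f"{a} {b}")].add(doc_id)
    index[("collection", doc["collection"])].add(doc_id)


def search(*terms):
    """Intersect the term lists of all given terms."""
    lists = [index.get(t, set()) for t in terms]
    return sorted(set.intersection(*lists)) if lists else []


result = search(("word", "data"), ("word", "base"), ("collection", "Red"))
print(result)
```

Because words, phrases, and collections all live in the same term-to-IDs structure, constraints of different kinds compose through the same intersection step.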

16 Collections and Security. Directories: exclusive, hierarchical, analogous to a file system, based on URI. Collections: set-based, N:N relationship. Security: invisible to your app. Slide 16

17 Scalars. How many of the articles that contain "data base" were written in each of the last 5 decades? Diagram: the universal index term lists (data; base; "data base"; STEM be; <article>; <article>/<abstract>; <section>/<product>; <product>ims</product>; <title> contains "data"; Collection(Red); Role:Editor + Action:Read) yield document references 126, 130, 167, ..., alongside range indexes labeled YEAR and Volume. Slide 17

18 Range Indexes. Maps document ids to values, and values to document ids, in a compact memory representation. Diagram: DOC ID to VALUE; VALUE to DOC ID. Slide 18
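A minimal sketch of the two directions of a range index, assuming plain Python structures in place of MarkLogic's compact in-memory representation. The doc-id-to-value direction answers "which decade was this matching document written in?" (the question from slide 17), and the value-to-doc-id direction answers range lookups. All IDs, years, and function names here are hypothetical.

```python
from bisect import bisect_left, bisect_right
from collections import Counter

# Hypothetical doc id -> value mapping (publication year); values are invented.
year_of = {126: 1972, 130: 1985, 142: 1991, 167: 1999, 212: 2004}

# value -> doc id direction: pairs kept sorted by value.
by_value = sorted((year, doc_id) for doc_id, year in year_of.items())
values = [v for v, _ in by_value]


def docs_in_range(lo, hi):
    """Doc ids whose value v satisfies lo <= v <= hi, via binary search."""
    i, j = bisect_left(values, lo), bisect_right(values, hi)
    return [doc_id for _, doc_id in by_value[i:j]]


def count_by_decade(doc_ids):
    """Bucket the given doc ids by decade using the doc id -> value direction."""
    return Counter(year_of[d] // 10 * 10 for d in doc_ids if d in year_of)


# Suppose the universal index returned these matches for "data base".
matches = [126, 130, 167]
print(count_by_decade(matches))
print(docs_in_range(1980, 1999))
```

Neither call touches the documents themselves; both are answered from the index structures alone.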

19 Geospatial Index: A 2-Dimensional Range Index. Built-in support for: Point; Box; Circle; Polygon; Complex Polygon; Polygon Intersection; Polygon Containment. Fully composable with all other indexes! Slide 19

20 How MarkLogic Server Works Event Processing Slide 20

21 Reverse Indexes (Alerting). 1. Load serialized queries as query documents. 2. For a given data document, find all queries that match. Can provide real-time alerts during loads, with no significant performance impact! Can let documents store values as "ranges": documents about cities self-defining their geo boundaries; person documents defining birthdays as ranges, sequences. Can power classifiers and "matchmaker" queries. Slide 21
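The two steps above can be sketched as follows: stored queries are indexed by the terms they mention, and an incoming document is matched by looking up only the queries that share a term with it. This is a simplified illustration in which a query is just a set of required words; the query ids, terms, and matching_queries helper are hypothetical and do not reflect MarkLogic's query serialization.

```python
from collections import defaultdict

# Hypothetical stored queries: each matches when all of its terms are present.
queries = {
    "q-oil": {"oil", "price"},
    "q-swap": {"swap"},
    "q-fx": {"currency", "swap"},
}

# Reverse index: term -> ids of the stored queries that mention it.
reverse = defaultdict(set)
for qid, terms in queries.items():
    for t in terms:
        reverse[t].add(qid)


def matching_queries(text):
    """Return ids of stored queries satisfied by the incoming document text."""
    words = set(text.lower().split())
    # Only queries sharing at least one term with the document are candidates.
    candidates = set()
    for w in words:
        candidates |= reverse.get(w, set())
    return sorted(q for q in candidates if queries[q] <= words)


print(matching_queries("Currency swap executed"))
```

The work per document depends on the document's own terms rather than on the total number of stored queries, which is the intuition behind alerting during loads.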

22 How MarkLogic Server Works Scale-out Slide 22

23 Databases Scale Out Database of documents Stored in partitions Database Partition 1 Partition 2 Partition 3 Slide 23

24 Shared-Nothing Architecture E-Node E-Node E-Node D-Node 1 D-Node 2 D-Node 3 D-Node k Forest 1 Forest 2 Forest 3 Forest 4 Forest m Slide 24

25 How MarkLogic Server Works Analytics Slide 25

26 Range Indexes: A Built-In In-Memory Column Store. Maps document ids to values, and values to document ids, in a compact memory representation. Diagram: DOC ID to VALUE; VALUE to DOC ID. Range indexes are equivalent to a built-in in-memory column store. Slide 26

27 Scalar Queries and Aggregation Slide 27

28 In-Database MapReduce E-Node start encode decode reduce finish decode map reduce encode D-Node 1 D-Node 2 D-Node 3 D-Node k Forest 1 Forest 2 Forest 3 Forest 4 Forest m Slide 28
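A rough sketch of the scatter-gather shape in this diagram: a map step runs against each forest and returns a small partial result, and a reduce step combines the partials before a finish step produces the answer. The encode/decode stages shown on the slide (serializing partials between hosts) are omitted. The forest contents and function names are assumptions made for illustration only.

```python
from functools import reduce

# Hypothetical forests: each holds one partition of (doc_id, amount) values.
forests = [
    [(1, 10.0), (2, 30.0)],
    [(3, 20.0)],
    [(4, 40.0), (5, 50.0), (6, 0.0)],
]


def map_forest(forest):
    """Runs next to the data: return a small partial result (sum, count)."""
    amounts = [amount for _, amount in forest]
    return (sum(amounts), len(amounts))


def combine(a, b):
    """Reduce step: merge two partial results."""
    return (a[0] + b[0], a[1] + b[1])


def finish(partial):
    """Finish step: turn the combined partial into the final answer."""
    total, count = partial
    return total / count if count else None


partials = [map_forest(f) for f in forests]  # one partial per forest
mean = finish(reduce(combine, partials, (0.0, 0)))
print(mean)
```

Only the (sum, count) pairs travel between nodes, not the underlying values, which is why the aggregate can be computed close to the data.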

29 Hadoop MapReduce via Bi-Directional Hadoop Connector. Diagram labels: Raw Data; Hadoop; MarkLogic + Connector for Hadoop; Operational Applications; 1 Bulk Loading; 2 Progressive Enhancement; 3 Intermediate Intelligence. Slide 29

30 Co-Occurrence Slide 30

31 SQL and BI Tools ODBC SQL Range Indexes Slide 31

32 How MarkLogic Server Works Transactions Slide 32

33 MVCC. Diagram: two versions of the document /articles/codd.xml (Title, Author with First and Last, Metadata, Year, Sections), each carrying a Creation Timestamp (c) and a Deleted Timestamp (d). Timestamps can be: increasing integers (before MarkLogic 5); increasing wall time (starting with MarkLogic 5). Slide 33
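The creation/deleted timestamp scheme can be sketched with a small versioned store: an update never overwrites a version, it stamps the old version as deleted and appends a new one, so a reader at timestamp t sees the version with created <= t < deleted. This sketch uses increasing integers as timestamps; the VersionedStore class and its methods are hypothetical, not a MarkLogic API.

```python
import itertools


class VersionedStore:
    """Toy MVCC store: every version has a creation and a deletion timestamp."""

    def __init__(self):
        self._clock = itertools.count(1)  # increasing integer timestamps
        self._versions = {}               # uri -> list of [created, deleted, content]
        self.now = 0

    def write(self, uri, content):
        ts = next(self._clock)
        versions = self._versions.setdefault(uri, [])
        if versions and versions[-1][1] is None:
            versions[-1][1] = ts          # stamp the old version; do not overwrite it
        versions.append([ts, None, content])
        self.now = ts
        return ts

    def read(self, uri, at=None):
        """Return the version visible at timestamp `at` (default: latest)."""
        at = self.now if at is None else at
        for created, deleted, content in self._versions.get(uri, []):
            if created <= at and (deleted is None or at < deleted):
                return content
        return None


store = VersionedStore()
t1 = store.write("/articles/codd.xml", "first version")
t2 = store.write("/articles/codd.xml", "second version")
print(store.read("/articles/codd.xml", at=t1))
print(store.read("/articles/codd.xml"))
```

A reader pinned to t1 keeps seeing the first version even after the update, without taking a lock, which is the basis for the point-in-time reads described on the following slides.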

34 MVCC Benefits. Very High Throughput: read queries don't require locks; queries and updates do not conflict. ACID Transactions: internal 2-phase commit between hosts (forest partitions). Zero-latency between ingestion and indexing. Diagram: the document /articles/codd.xml (Title, Author with First and Last, Metadata, Year, Sections). Slide 34

35 The Four Forest Operations. 1. Create a new document: into the in-memory stand buffer. 2. Mark a document as expired: a memory-mapped timestamps document per stand. 3. Write buffer out to disk (checkpoint): our buffers are 100s of megabytes; for performance, double buffer. 4. Merge: a background process; optimization: reduces number of stands in forest. Slide 35
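The four operations can be mimicked with a toy forest that buffers inserts in an in-memory stand, flushes it to an immutable "on-disk" stand at a checkpoint, records expirations separately from the data, and merges stands in one pass. This is a heavily simplified sketch: the Forest class, its tiny buffer limit, and the use of a single expired set (instead of per-stand timestamp files, double buffering, or a background thread) are all assumptions made for illustration.

```python
class Forest:
    """Toy model of the four forest operations; dicts stand in for stands."""

    def __init__(self, buffer_limit=2):
        self.buffer_limit = buffer_limit
        self.memory_stand = {}   # uri -> content, the in-memory buffer
        self.disk_stands = []    # flushed stands, oldest first, never modified
        self.expired = set()     # uris marked expired (data is not touched)

    def insert(self, uri, content):
        """1. Create a new document in the in-memory stand."""
        self.memory_stand[uri] = content
        if len(self.memory_stand) >= self.buffer_limit:
            self.checkpoint()

    def expire(self, uri):
        """2. Mark a document as expired without rewriting any stand."""
        self.expired.add(uri)

    def checkpoint(self):
        """3. Write the in-memory buffer out as a new stand."""
        if self.memory_stand:
            self.disk_stands.append(dict(self.memory_stand))
            self.memory_stand = {}

    def merge(self):
        """4. Combine stands into one, dropping expired documents."""
        merged = {}
        for stand in self.disk_stands:
            merged.update(stand)
        for uri in self.expired:
            merged.pop(uri, None)
        self.disk_stands = [merged] if merged else []
        self.expired = {u for u in self.expired if u in self.memory_stand}

    def get(self, uri):
        if uri in self.expired:
            return None
        if uri in self.memory_stand:
            return self.memory_stand[uri]
        for stand in reversed(self.disk_stands):  # newest stand wins
            if uri in stand:
                return stand[uri]
        return None


forest = Forest(buffer_limit=2)
for name in ["a", "b", "c", "d"]:
    forest.insert(f"/docs/{name}.xml", f"content {name}")
forest.expire("/docs/a.xml")
stands_before = len(forest.disk_stands)
forest.merge()
print(stands_before, len(forest.disk_stands), forest.get("/docs/c.xml"))
```

The point of the sketch is that inserts and expirations are cheap appends or marks, while the cost of physically removing data is deferred to the merge.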

36 Consistency And Throughput. 2-phase commit: transactions span forests. Recovery: forest journals. Lock-free read queries: query at a point-in-time; repeatable reads; increased throughput; time travel (and near-instant DB rollback). Slide 36

37 HA/DR Features of MarkLogic

Feature | Function | Benefit | Use Case
Database backup/restore | Make a backup of your database, then restore it | Recover from complete data loss | Disaster Recovery
Journal Archiving/Point-In-Time Recovery | Make a continuous backup; restore to a point in time, or to the point of failure | Recover from complete data loss; recover all your data, or to just before a Bad Event | Disaster Recovery
Snapshot backup | Very fast backup using mirrored disk | Recover from complete data loss; take a backup in seconds | Disaster Recovery, High Availability
Database rollback | Roll back to a point in time before a Bad Thing happened | Recover in seconds from human error or a rogue application | Disaster Recovery, High Availability
Automatic Failover Using Shared-Disk or Local-Disk | If a node fails, automatically fail over to another node | Recover from failure of a data node in a cluster | High Availability
Flexible Replication (part of Replication option) | Maintain a hot copy of (part of) a database in another data center | Move parts of a database, parts of documents, closer to users for improved performance | Information Sharing
Database Replication (part of Replication option) | Maintain a hot copy of a database in another data center | Recover from loss of a Data Center | Disaster Recovery, High Availability
Distributed Transactions | XA support for transactions that go across MarkLogic and other repositories | Keep an exact (synchronous) copy of your data in more than one place | Disaster Recovery, High Availability, Information Sharing

Slide 37

38 OTC: Derivatives and Exotics. Repository for derivatives and other exotic products (trades, options, swaps, etc.). Key Requirements: - Native JSON support - Real-time queries on semi-structured data - 7 year retention - Replication - BAR. Customers in Production: JP Morgan Chase. Derivatives Core Processing Platform enables risk management for $78 trillion in derivatives. Relevant Features: Native JSON Support; Value-based Lookups; Tiered Storage; Fine-Grained Partition Management; Enterprise-Class Backup and Restore Features; Replication; Clustering. Slide 38

39 Equity Risk Systems. Currently using traditional RDBMS servers and file systems to store intraday and time series data. Requirements: - Scale out on commodity servers - Ease of data modeling for BLOBs (unstructured data) - Handle complex data. Customers in Production: Where MarkLogic is Providing Similar Solutions. Relevant Features: Schema Agnostic w/ Optional Validation; Bi-temporal API; User Defined Functions; Linear Scalability on commodity hardware; Clustering; Binary Support; Fast native XPath; Result and Data Caching. Slide 39

40 Document Modeling. Enterprise Data Group (EDG) is revamping the workflow for the creation and management of the negotiated documents. Instead of capturing the end image and some metadata, we are modeling the creation of the document through templates, XML or similar documents. The business groups need to dynamically change the agreements, and constantly add new information to meet the day-to-day needs. Customers in Production: Morgan Stanley; Citi; Docgenix; JetBlue; LexisNexis. Relevant Features: Native support for XML, JSON and Binary; Schema Agnostic w/ Optional Validation; ACID Compliant CRUD; Value-based lookups; Document Library Services; Search Indexing; LDAP and Kerberos Integration; Clustering; Range Indexes. Slide 40

41 Log Analysis. Enable analysis of system and user logs to evaluate user behavior and provide BI. Automatically capture and analyze logs from many sources (web, DB, DW). Perform correlation of events and performance within time-slices. Normalize and enhance data with metadata from applications. Analytics pipeline to compute aggregates and statistical data. Dashboards for each source. High Data Volumes: 1TB/day for 45 days; ~45TB of total data; 20M requests/day. Customers in Production: Bank of America. Enabled Bank of America to map their internal reference data architecture through log analysis. Relevant Features: Schema Agnostic Data Model; Hadoop Integration; BI Tool Integration; Range Indexes; User Defined Functions; Transformation Capabilities; Processing Framework; Visualization Framework; Application Server; Linear Scalability. Slide 41

42 In Conclusion Slide 42

43 MarkLogic Server is: an operational DBMS with an MVCC-based transaction model and high throughput; an analytic DBMS with an in-memory column store and in-database MapReduce; an unstructured DBMS with an XML data model and ad-hoc schema; a high-performance search engine with a transactional universal index; an event processor with serialized queries and alerting. A unified Big Data platform. Slide 43

44 Questions?? Slide 44

CONSOLIDATING RISK MANAGEMENT AND REGULATORY COMPLIANCE APPLICATIONS USING A UNIFIED DATA PLATFORM

CONSOLIDATING RISK MANAGEMENT AND REGULATORY COMPLIANCE APPLICATIONS USING A UNIFIED DATA PLATFORM CONSOLIDATING RISK MANAGEMENT AND REGULATORY COMPLIANCE APPLICATIONS USING A UNIFIED PLATFORM Executive Summary Financial institutions have implemented and continue to implement many disparate applications

More information

MarkLogic 8 Overview of Key Features COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.

MarkLogic 8 Overview of Key Features COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. MarkLogic 8 Overview of Key Features Enterprise NoSQL Database Platform Flexible Data Model Store and manage JSON, XML, RDF, and Geospatial data with a documentcentric, schemaagnostic database Search and

More information

VOLTDB + HP VERTICA. page

VOLTDB + HP VERTICA. page VOLTDB + HP VERTICA ARCHITECTURE FOR FAST AND BIG DATA ARCHITECTURE FOR FAST + BIG DATA FAST DATA Fast Serve Analytics BIG DATA BI Reporting Fast Operational Database Streaming Analytics Columnar Analytics

More information

Study Guide. MarkLogic Professional Certification. Taking a Written Exam. General Preparation. Developer Written Exam Guide

Study Guide. MarkLogic Professional Certification. Taking a Written Exam. General Preparation. Developer Written Exam Guide Study Guide MarkLogic Professional Certification Taking a Written Exam General Preparation Developer Written Exam Guide Administrator Written Exam Guide Example Written Exam Questions Hands-On Exam Overview

More information

5 Fundamental Strategies for Building a Data-centered Data Center

5 Fundamental Strategies for Building a Data-centered Data Center 5 Fundamental Strategies for Building a Data-centered Data Center June 3, 2014 Ken Krupa, Chief Field Architect Gary Vidal, Solutions Specialist Last generation Reference Data Unstructured OLTP Warehouse

More information

Building a Data Strategy for a Digital World

Building a Data Strategy for a Digital World Building a Data Strategy for a Digital World Jason Hunter, CTO, APAC Data Challenge: Pushing the Limits of What's Possible The Art of the Possible Multiple Government Agencies Data Hub 100 s of Service

More information

BEYOND THE RDBMS: WORKING WITH RELATIONAL DATA IN MARKLOGIC

BEYOND THE RDBMS: WORKING WITH RELATIONAL DATA IN MARKLOGIC BEYOND THE RDBMS: WORKING WITH RELATIONAL DATA IN MARKLOGIC Rob Rudin, Solutions Specialist, MarkLogic Agenda Introduction The problem getting relational data into MarkLogic Demo how to do this SLIDE:

More information

Big Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara

Big Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara Big Data Technology Ecosystem Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara Agenda End-to-End Data Delivery Platform Ecosystem of Data Technologies Mapping an End-to-End Solution Case

More information

How Insurers are Realising the Promise of Big Data

How Insurers are Realising the Promise of Big Data How Insurers are Realising the Promise of Big Data Jason Hunter, CTO Asia-Pacific, MarkLogic A Big Data Challenge: Pushing the Limits of What's Possible The Art of the Possible Multiple Government Agencies

More information

Scott Meder Senior Regional Sales Manager

Scott Meder Senior Regional Sales Manager www.raima.com Scott Meder Senior Regional Sales Manager scott.meder@raima.com Short Introduction to Raima What is Data Management What are your requirements? How do I make the right decision? - Architecture

More information

A Single Source of Truth

A Single Source of Truth A Single Source of Truth is it the mythical creature of data management? In the world of data management, a single source of truth is a fully trusted data source the ultimate authority for the particular

More information

Bring Context To Your Machine Data With Hadoop, RDBMS & Splunk

Bring Context To Your Machine Data With Hadoop, RDBMS & Splunk Bring Context To Your Machine Data With Hadoop, RDBMS & Splunk Raanan Dagan and Rohit Pujari September 25, 2017 Washington, DC Forward-Looking Statements During the course of this presentation, we may

More information

2014 年 3 月 13 日星期四. From Big Data to Big Value Infrastructure Needs and Huawei Best Practice

2014 年 3 月 13 日星期四. From Big Data to Big Value Infrastructure Needs and Huawei Best Practice 2014 年 3 月 13 日星期四 From Big Data to Big Value Infrastructure Needs and Huawei Best Practice Data-driven insight Making better, more informed decisions, faster Raw Data Capture Store Process Insight 1 Data

More information

IBM Data Replication for Big Data

IBM Data Replication for Big Data IBM Data Replication for Big Data Highlights Stream changes in realtime in Hadoop or Kafka data lakes or hubs Provide agility to data in data warehouses and data lakes Achieve minimum impact on source

More information

MAPR TECHNOLOGIES, INC. TECHNICAL BRIEF APRIL 2017 MAPR SNAPSHOTS

MAPR TECHNOLOGIES, INC. TECHNICAL BRIEF APRIL 2017 MAPR SNAPSHOTS MAPR TECHNOLOGIES, INC. TECHNICAL BRIEF APRIL 2017 MAPR SNAPSHOTS INTRODUCTION The ability to create and manage snapshots is an essential feature expected from enterprise-grade storage systems. This capability

More information

From Data Challenge to Data Opportunity

From Data Challenge to Data Opportunity From Data Challenge to Data Opportunity Jason Hunter, CTO Asia-Pacific, MarkLogic A Big Data Challenge: Pushing the Limits of What's Possible The Art of the Possible Multiple Government Agencies Data Hub

More information

The Technology of the Business Data Lake. Appendix

The Technology of the Business Data Lake. Appendix The Technology of the Business Data Lake Appendix Pivotal data products Term Greenplum Database GemFire Pivotal HD Spring XD Pivotal Data Dispatch Pivotal Analytics Description A massively parallel platform

More information

Modern Data Warehouse The New Approach to Azure BI

Modern Data Warehouse The New Approach to Azure BI Modern Data Warehouse The New Approach to Azure BI History On-Premise SQL Server Big Data Solutions Technical Barriers Modern Analytics Platform On-Premise SQL Server Big Data Solutions Modern Analytics

More information

From Single Purpose to Multi Purpose Data Lakes. Thomas Niewel Technical Sales Director DACH Denodo Technologies March, 2019

From Single Purpose to Multi Purpose Data Lakes. Thomas Niewel Technical Sales Director DACH Denodo Technologies March, 2019 From Single Purpose to Multi Purpose Data Lakes Thomas Niewel Technical Sales Director DACH Denodo Technologies March, 2019 Agenda Data Lakes Multiple Purpose Data Lakes Customer Example Demo Takeaways

More information

How Apache Hadoop Complements Existing BI Systems. Dr. Amr Awadallah Founder, CTO Cloudera,

How Apache Hadoop Complements Existing BI Systems. Dr. Amr Awadallah Founder, CTO Cloudera, How Apache Hadoop Complements Existing BI Systems Dr. Amr Awadallah Founder, CTO Cloudera, Inc. Twitter: @awadallah, @cloudera 2 The Problems with Current Data Systems BI Reports + Interactive Apps RDBMS

More information

Data Analytics at Logitech Snowflake + Tableau = #Winning

Data Analytics at Logitech Snowflake + Tableau = #Winning Welcome # T C 1 8 Data Analytics at Logitech Snowflake + Tableau = #Winning Avinash Deshpande I am a futurist, scientist, engineer, designer, data evangelist at heart Find me at Avinash Deshpande Chief

More information

<Insert Picture Here> Introduction to Big Data Technology

<Insert Picture Here> Introduction to Big Data Technology Introduction to Big Data Technology The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into

More information

Lambda Architecture for Batch and Stream Processing. October 2018

Lambda Architecture for Batch and Stream Processing. October 2018 Lambda Architecture for Batch and Stream Processing October 2018 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Notices This document is provided for informational purposes only.

More information

5/24/ MVP SQL Server: Architecture since 2010 MCT since 2001 Consultant and trainer since 1992

5/24/ MVP SQL Server: Architecture since 2010 MCT since 2001 Consultant and trainer since 1992 2014-05-20 MVP SQL Server: Architecture since 2010 MCT since 2001 Consultant and trainer since 1992 @SoQooL http://blog.mssqlserver.se Mattias.Lind@Sogeti.se 1 The evolution of the Microsoft data platform

More information

Implementing a Big Data Strategy PRASA Passenger Rail Agency of South Africa

Implementing a Big Data Strategy PRASA Passenger Rail Agency of South Africa Implementing a Big Data Strategy PRASA Passenger Rail Agency of South Africa MarkLogic World 2016 San Francisco AGENDA Agenda Introduction About the customer Project Goals Challenges The Solution Demo

More information

Modern Stream Processing with Apache Flink

Modern Stream Processing with Apache Flink 1 Modern Stream Processing with Apache Flink Till Rohrmann GOTO Berlin 2017 2 Original creators of Apache Flink da Platform 2 Open Source Apache Flink + da Application Manager 3 What changes faster? Data

More information

Microsoft SQL Server

Microsoft SQL Server Microsoft SQL Server Abstract This white paper outlines the best practices for Microsoft SQL Server Failover Cluster Instance data protection with Cohesity DataPlatform. December 2017 Table of Contents

More information

RA-GRS, 130 replication support, ZRS, 130

RA-GRS, 130 replication support, ZRS, 130 Index A, B Agile approach advantages, 168 continuous software delivery, 167 definition, 167 disadvantages, 169 sprints, 167 168 Amazon Web Services (AWS) failure, 88 CloudTrail Service, 21 CloudWatch Service,

More information

Acquiring Big Data to Realize Business Value

Acquiring Big Data to Realize Business Value Acquiring Big Data to Realize Business Value Agenda What is Big Data? Common Big Data technologies Use Case Examples Oracle Products in the Big Data space In Summary: Big Data Takeaways

More information

Challenges for Data Driven Systems

Challenges for Data Driven Systems Challenges for Data Driven Systems Eiko Yoneki University of Cambridge Computer Laboratory Data Centric Systems and Networking Emergence of Big Data Shift of Communication Paradigm From end-to-end to data

More information

WHITEPAPER. MemSQL Enterprise Feature List

WHITEPAPER. MemSQL Enterprise Feature List WHITEPAPER MemSQL Enterprise Feature List 2017 MemSQL Enterprise Feature List DEPLOYMENT Provision and deploy MemSQL anywhere according to your desired cluster configuration. On-Premises: Maximize infrastructure

More information

Introduction to Big-Data

Introduction to Big-Data Introduction to Big-Data Ms.N.D.Sonwane 1, Mr.S.P.Taley 2 1 Assistant Professor, Computer Science & Engineering, DBACER, Maharashtra, India 2 Assistant Professor, Information Technology, DBACER, Maharashtra,

More information

The Power of Snapshots Stateful Stream Processing with Apache Flink

The Power of Snapshots Stateful Stream Processing with Apache Flink The Power of Snapshots Stateful Stream Processing with Apache Flink Stephan Ewen QCon San Francisco, 2017 1 Original creators of Apache Flink da Platform 2 Open Source Apache Flink + da Application Manager

More information

SQL Server SQL Server 2008 and 2008 R2. SQL Server SQL Server 2014 Currently supporting all versions July 9, 2019 July 9, 2024

SQL Server SQL Server 2008 and 2008 R2. SQL Server SQL Server 2014 Currently supporting all versions July 9, 2019 July 9, 2024 Current support level End Mainstream End Extended SQL Server 2005 SQL Server 2008 and 2008 R2 SQL Server 2012 SQL Server 2005 SP4 is in extended support, which ends on April 12, 2016 SQL Server 2008 and

More information

BUSINESS DATA LAKE FADI FAKHOURI, SR. SYSTEMS ENGINEER, ISILON SPECIALIST. Copyright 2016 EMC Corporation. All rights reserved.

BUSINESS DATA LAKE FADI FAKHOURI, SR. SYSTEMS ENGINEER, ISILON SPECIALIST. Copyright 2016 EMC Corporation. All rights reserved. BUSINESS DATA LAKE FADI FAKHOURI, SR. SYSTEMS ENGINEER, ISILON SPECIALIST 1 UNSTRUCTURED DATA GROWTH 75% 78% 80% 2015 71 EB 2016 106 EB 2017 133 EB Total Capacity Shipped, Worldwide % of Unstructured Data

More information

Sub Meter Data Import & Storage Platform RFP Questions/Answers

Sub Meter Data Import & Storage Platform RFP Questions/Answers Sub Meter Data Import & Storage Platform RFP Questions/Answers ADDED 10/12/2015 Q: The latter sections of the RFP indicate that you are looking for dashboarding features. Will VEIC accept a proposal that

More information

REGULATORY REPORTING FOR FINANCIAL SERVICES

REGULATORY REPORTING FOR FINANCIAL SERVICES REGULATORY REPORTING FOR FINANCIAL SERVICES Gordon Hughes, Global Sales Director, Intel Corporation Sinan Baskan, Solutions Director, Financial Services, MarkLogic Corporation Many regulators and regulations

More information

1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda

1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda Agenda Oracle9i Warehouse Review Dulcian, Inc. Oracle9i Server OLAP Server Analytical SQL Mining ETL Infrastructure 9i Warehouse Builder Oracle 9i Server Overview E-Business Intelligence Platform 9i Server:

More information

Schema-Agnostic Indexing with Azure Document DB

Schema-Agnostic Indexing with Azure Document DB Schema-Agnostic Indexing with Azure Document DB Introduction Azure DocumentDB is Microsoft s multi-tenant distributed database service for managing JSON documents at Internet scale Multi-tenancy is an

More information

Security and Performance advances with Oracle Big Data SQL

Security and Performance advances with Oracle Big Data SQL Security and Performance advances with Oracle Big Data SQL Jean-Pierre Dijcks Oracle Redwood Shores, CA, USA Key Words SQL, Oracle, Database, Analytics, Object Store, Files, Big Data, Big Data SQL, Hadoop,

More information

Convergence and Collaboration: Transforming Business Process and Workflows

Convergence and Collaboration: Transforming Business Process and Workflows Convergence and Collaboration: Transforming Business Process and Workflows Steven Hagan, Vice President, Server Technologies 1 Copyright 2011, Oracle and/or its affiliates. All rights Convergence & Collaboration:

More information

UNLEASHING THE VALUE OF THE TERADATA UNIFIED DATA ARCHITECTURE WITH ALTERYX

UNLEASHING THE VALUE OF THE TERADATA UNIFIED DATA ARCHITECTURE WITH ALTERYX UNLEASHING THE VALUE OF THE TERADATA UNIFIED DATA ARCHITECTURE WITH ALTERYX 1 Successful companies know that analytics are key to winning customer loyalty, optimizing business processes and beating their

More information

EMC Documentum xdb. High-performance native XML database optimized for storing and querying large volumes of XML content

EMC Documentum xdb. High-performance native XML database optimized for storing and querying large volumes of XML content DATA SHEET EMC Documentum xdb High-performance native XML database optimized for storing and querying large volumes of XML content The Big Picture Ideal for content-oriented applications like dynamic publishing

More information

Oracle Database 18c and Autonomous Database

Oracle Database 18c and Autonomous Database Oracle Database 18c and Autonomous Database Maria Colgan Oracle Database Product Management March 2018 @SQLMaria Safe Harbor Statement The following is intended to outline our general product direction.

More information

Combine Native SQL Flexibility with SAP HANA Platform Performance and Tools

Combine Native SQL Flexibility with SAP HANA Platform Performance and Tools SAP Technical Brief Data Warehousing SAP HANA Data Warehousing Combine Native SQL Flexibility with SAP HANA Platform Performance and Tools A data warehouse for the modern age Data warehouses have been

More information

Distributed File Systems II

Distributed File Systems II Distributed File Systems II To do q Very-large scale: Google FS, Hadoop FS, BigTable q Next time: Naming things GFS A radically new environment NFS, etc. Independence Small Scale Variety of workloads Cooperation

More information

HAWQ: A Massively Parallel Processing SQL Engine in Hadoop

HAWQ: A Massively Parallel Processing SQL Engine in Hadoop HAWQ: A Massively Parallel Processing SQL Engine in Hadoop Lei Chang, Zhanwei Wang, Tao Ma, Lirong Jian, Lili Ma, Alon Goldshuv Luke Lonergan, Jeffrey Cohen, Caleb Welton, Gavin Sherry, Milind Bhandarkar

More information

April Copyright 2013 Cloudera Inc. All rights reserved.

April Copyright 2013 Cloudera Inc. All rights reserved. Hadoop Beyond Batch: Real-time Workloads, SQL-on- Hadoop, and the Virtual EDW Headline Goes Here Marcel Kornacker marcel@cloudera.com Speaker Name or Subhead Goes Here April 2014 Analytic Workloads on

More information

Technical Sheet NITRODB Time-Series Database

Technical Sheet NITRODB Time-Series Database Technical Sheet NITRODB Time-Series Database 10X Performance, 1/10th the Cost INTRODUCTION "#$#!%&''$!! NITRODB is an Apache Spark Based Time Series Database built to store and analyze 100s of terabytes

More information

Safe Harbor Statement

Safe Harbor Statement Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment

More information

Oracle NoSQL Database Overview Marie-Anne Neimat, VP Development

Oracle NoSQL Database Overview Marie-Anne Neimat, VP Development Oracle NoSQL Database Overview Marie-Anne Neimat, VP Development June14, 2012 1 Copyright 2012, Oracle and/or its affiliates. All rights Agenda Big Data Overview Oracle NoSQL Database Architecture Technical

More information

In-Memory Data Management

In-Memory Data Management In-Memory Data Management Martin Faust Research Assistant Research Group of Prof. Hasso Plattner Hasso Plattner Institute for Software Engineering University of Potsdam Agenda 2 1. Changed Hardware 2.

More information

Cisco Tetration Analytics Platform: A Dive into Blazing Fast Deep Storage

Cisco Tetration Analytics Platform: A Dive into Blazing Fast Deep Storage White Paper Cisco Tetration Analytics Platform: A Dive into Blazing Fast Deep Storage What You Will Learn A Cisco Tetration Analytics appliance bundles computing, networking, and storage resources in one

More information

Hortonworks DataFlow Sam Lachterman Solutions Engineer

Hortonworks DataFlow Sam Lachterman Solutions Engineer Hortonworks DataFlow Sam Lachterman Solutions Engineer 1 Hortonworks Inc. 2011 2017. All Rights Reserved Disclaimer This document may contain product features and technology directions that are under development,

More information

MarkLogic Server. Database Replication Guide. MarkLogic 9, May 2017. Copyright 2017 MarkLogic Corporation. All rights reserved.

Last Revised: 9.0-3, September, 2017. Table of Contents: Database Replication

Microsoft Azure Databricks for data engineering. Building production data pipelines with Apache Spark in the cloud

Azure Databricks. As companies continue to set their sights on making data-driven decisions

HYBRID TRANSACTION/ANALYTICAL PROCESSING COLIN MACNAUGHTON

WHO IS NEEVE RESEARCH? Headquartered in Silicon Valley Creators of the X Platform - Memory Oriented Application Platform Passionate about high

<Insert Picture Here> Oracle NoSQL Database A Distributed Key-Value Store

<Insert Picture Here> Oracle NoSQL Database A Distributed Key-Value Store Oracle NoSQL Database A Distributed Key-Value Store Charles Lamb The following is intended to outline our general product direction. It is intended for information purposes only,

Flash Storage Complementing a Data Lake for Real-Time Insight

Dr. Sanhita Sarkar, Global Director, Analytics Software Development. August 7, 2018. Agenda: Delivering insight along the entire spectrum

IBM Spectrum Protect Version 8.1.2 Introduction to Data Protection Solutions IBM

Note: Before you use this information

Best practices for building a Hadoop Data Lake Solution CHARLOTTE HADOOP USER GROUP

07.29.2015 LANDING STAGING DW Let's start with something basic: Is Data Lake a new concept? What is the closest we can

Achieving Horizontal Scalability. Alain Houf Sales Engineer

Scale Matters. InterSystems IRIS Database Platform lets you: scale up and scale out; scale users and scale data; mix and match a variety of approaches

NPP & Blockchain Have you thought about the data? Ken Krupa, CTO, MarkLogic

Hello SLIDE: 2 14 COPYRIGHT November 2017 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. A QUICK LOOK New Payments Platform Open

Oracle Big Data Connectors

Oracle Big Data Connectors is a software suite that integrates processing in Apache Hadoop distributions with operations in Oracle Database. It enables the use of Hadoop to process

DocAve 6 Software Platform Service Pack 1

Release Notes for Microsoft SharePoint. Release Date: September 25, 2012. 1 New Features and Improvements. General: The new Plan Groups feature helps organize and

Rickard Linck Client Technical Professional Core Database and Lifecycle Management Common Analytic Engine Cloud Data Servers On-Premise Data Servers

Watson Data Platform Reference Architecture Business

Hadoop Beyond Batch: Real-time Workloads, SQL-on-Hadoop, and the Virtual EDW

Marcel Kornacker marcel@cloudera.com 2013-11-12 Copyright 2013 Cloudera

MODERN BIG DATA DESIGN PATTERNS CASE DRIVEN DESIGNS

SUJEE MANIYAM FOUNDER / PRINCIPAL @ ELEPHANT SCALE www.elephantscale.com sujee@elephantscale.com HI, I'M SUJEE MANIYAM Founder / Principal @ ElephantScale

Making the Most of Hadoop with Optimized Data Compression (and Boost Performance) Mark Cusack. Chief Architect RainStor

Mark Cusack Chief Architect RainStor Agenda Importance of Hadoop + data compression Data compression techniques Compression,

Microsoft Big Data and Hadoop

Lara Rubbelke @sqlgal Cindy Gross @sqlcindy 2 The world of data is changing The 4Vs of Big Data http://nosql.mypopescu.com/post/9621746531/a-definition-of-big-data 3 Common

Craig Blitz Oracle Coherence Product Management

Software Architecture for Highly Available, Scalable Trading Apps: Meeting Low-Latency Requirements Intentionally. Copyright 2011, Oracle and/or its affiliates.

MarkLogic Server. Database Replication Guide. MarkLogic 6, September 2012. Copyright 2012 MarkLogic Corporation. All rights reserved.

Last Revised: 6.0-1, September, 2012. 1.0 Database Replication

SAP IQ - Business Intelligence and vertical data processing with 8 GB RAM or less

Dipl.-Inform. Volker Stöffler Volker.Stoeffler@DB-TecKnowledgy.info Public Agenda Introduction: What is SAP IQ - in a

THE EMC ISILON STORY. Big Data In The Enterprise. Deya Bassiouni Isilon Regional Sales Manager Emerging Africa, Egypt & Lebanon.

August, 2012. Big Data In The Enterprise Isilon Overview Isilon Technology

AnyMiner 3.0, Real-time Big Data Analysis Solution for Everything Data Analysis. Mar 25, 2015. TmaxSoft Co., Ltd. All Rights Reserved.

Ⅰ Ⅱ Ⅲ Platform for Net IT AnyMiner, Real-time Big Data Analysis Solution AnyMiner

Cloud Analytics and Business Intelligence on AWS

Enterprise Applications Virtual Desktops Sharing & Collaboration Platform Services Analytics Hadoop Real-time Streaming Data Machine Learning Data Warehouse

Delivering a 360° View in Healthcare and Life Sciences With Agile Data

Imran Chaudhri, @imrantech, Solutions Director, Healthcare & Life Sciences Mark Ferneau, @ferneau, Practice Manager, Healthcare &

What is MarkLogic Server? An overview

By Jason Hunter, October 2010. Table of Contents: What is MarkLogic Server?; Document Centric; Transactional; Search-Centric; Structure Aware; Schema Agnostic; XQuery and XSLT Driven

Part 1: Indexes for Big Data

JethroData Making Interactive BI for Big Data a Reality Technical White Paper This white paper explains how JethroData can help you achieve a truly interactive response time for BI on big data,

Data Movement & Tiering with DMF 7

Kirill Malkin, Director of Engineering, April 2019. Why Move or Tier Data? We wish we could keep everything in DRAM, but: It's volatile. It's expensive. Data in Memory 2 Why

Map-Reduce. Marco Mura 2010 March, 31th

Marco Mura (mura@di.unipi.it), 2010 March, 31st. This paper is a note from the 2009-2010 course Strumenti di programmazione per sistemi paralleli e distribuiti and it's based on the lessons of

SQL Server 2016 New innovations. Ivan Kosyakov. Technical Architect, Ph.D., Microsoft Technology Center, New York

http://biz-excellence.com The explosion of data sources... 25B 1.3B 4.0B There's an opportunity to drive

Storage for HPC, HPDA and Machine Learning (ML)

Frank Kraemer, IBM Systems Architect mailto:kraemerf@de.ibm.com IBM Data Management for Autonomous Driving (AD) significantly increase development efficiency by

<Insert Picture Here> Value of TimesTen Oracle TimesTen Product Overview

<Insert Picture Here> Value of TimesTen Oracle TimesTen Product Overview Value of TimesTen Oracle TimesTen Product Overview Shig Hiura Sales Consultant, Oracle Embedded Global Business Unit When You Think Database SQL RDBMS Results RDBMS + client/server

MarkLogic Server. Monitoring MarkLogic Guide. MarkLogic 9, May 2017. Copyright 2017 MarkLogic Corporation. All rights reserved.

Last Revised: 9.0-2, July, 2017. Table of Contents: Monitoring MarkLogic Guide

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material,

IBM Tivoli Storage Manager Version 7.1.6 Introduction to Data Protection Solutions IBM

Note: Before you use this

A Distributed System Case Study: Apache Kafka. High throughput messaging for diverse consumers

As always, this is not a tutorial. Some of the concepts may no longer be part of the current system or implemented

Microsoft SQL Server Database Administration

Address: #403, 4th Floor, Manjeera Square, Beside Prime Hospital, Ameerpet, Hyderabad 500038. Contact: 040/66777220, 9177166122. Course Overview: This is 100%

Data Management Glossary

A. Access path: The route through a system by which data is found, accessed and retrieved. Agile methodology: An approach to software development which takes incremental, iterative

Lambda Architecture for Batch and Real-Time Processing on AWS with Spark Streaming and Spark SQL. May 2015

2015, Amazon Web Services, Inc. or its affiliates. All rights reserved. Notices This document

RealTime. RealTime. Real risks. Data recovery now possible in minutes, not hours or days. A Vyant Technologies Product. Situation Analysis

RealTime Vyant Technologies: data recovery in minutes. Situation Analysis: It is no longer acceptable

Automating Information Lifecycle Management with

Automating Information Lifecycle Management with Oracle Database 2c The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated

MySQL Cluster for Real Time, HA Services

Bill Papp (bill.papp@oracle.com) Principal MySQL Sales Consultant Oracle Agenda Overview of MySQL Cluster Design Goals, Evolution, Workloads,

Approaching the Petabyte Analytic Database: What I learned

Disclaimer This document is for informational purposes only and is subject to change at any time without notice. The information in this document is proprietary to Actian and no part of this document may

MarkLogic Server. Administrator's Guide. MarkLogic 9, May 2017. Copyright 2017 MarkLogic Corporation. All rights reserved.

Last Revised: 9.0-3, September, 2017. Table of Contents: Administrator's Guide 1.0

Microsoft SQL Server 6.3.1 Fix Pack 15. Reference IBM

Note: Before using this information and the product it supports, read the information in Notices

LazyBase: Trading freshness and performance in a scalable database

(EuroSys 2012) Jim Cipar, Greg Ganger, *Kimberly Keeton, *Craig A. N. Soules, *Brad Morrey, *Alistair Veitch PARALLEL DATA LABORATORY

Solution Brief. Bridging the Infrastructure Gap for Unstructured Data with Object Storage. 89 Fifth Avenue, 7th Floor. New York, NY 10003

www.theedison.com @EdisonGroupInc 212.367.7400 Printed in the United

Increase Value from Big Data with Real-Time Data Integration and Streaming Analytics

Cy Erbay Senior Director Striim Executive Summary Striim is Uniquely Qualified to Solve the Challenges of Real-Time