MarkLogic Technology Briefing
- Lionel Foster
- 5 years ago
Transcription
1 MarkLogic Technology Briefing Edd Patterson CTO/VP Systems Engineering, Americas Slide 1
2 Agenda Introductions About MarkLogic MarkLogic Server Deep Dive Slide 2
3 MarkLogic Overview Company Highlights: headquartered in Silicon Valley; founded in [year missing in transcription]; [figure missing]% CAGR revenue growth; privately held; patented, award-winning technology. Slide 3
4 Select Customers Financial Services and Other Customers Government Customers Media Customers Slide 4
5 Inside MarkLogic Server Slide 5
6 MarkLogic Powers the World's Big Data Applications. Use all your data to make your organization smarter. Analyze a wide variety of structured, unstructured and semi-structured data together to gain actionable insights. Bring these insights into operational business processes via real-time Big Data Applications. A unified Big Data platform for both analytics and applications: any data, any volume, any structure, real-time. E.g. derivatives contracts, customer records, social media, medical records, intelligence assets, journal articles, etc. Slide 6 Copyright MarkLogic Corporation. All rights reserved.
7 Elements of a Big Data Platform [Diagram components: Tools / APIs; Visualization; Data Mining / Analytics; Event Processing; Metadata; Search; Analytic DB; Operational DB; Unstructured Content; Ingest / Batch Analytics / Enrichment; Archive / Warm Long Tail Data Store] Slide 7
8 How is this usually implemented? Stitched together from multiple technologies. [Diagram components: BI Tools; Applications; Stats (SPSS, SAS, R, ...); Stream / Event Processing; Search; Search Index; Analytic DB; Operational DB; Metadata; Unstructured Content Store; Batch Analytics (Hadoop MR); Archive (HDFS)] Each line is an opportunity for latency, ETL bugs, etc. Each component is managed, scaled, supported, etc. separately. Applications interface with many technologies, sometimes managed by different groups. Bottom line: data governance is compromised; impossible to react in real time; agility is lost.
9 MarkLogic Unified Platform for Big Data [Diagram components: BI Tools; Applications; Stats (SPSS, SAS, R, ...); Analytic DB; Stream / Event Processing; Operational DB; Metadata; Search; Search Index; Unstructured Content Store; Batch Analytics (Hadoop MR); Archive (HDFS)] MarkLogic Server is an operational DBMS, an analytic DBMS, an unstructured DBMS, a search engine, and an event processor, all in one technology.
10 How MarkLogic Server Works Schema-Agnostic Design Slide 10
11 Data Model: MarkLogic Server is a document-centric database. Supports any-structured data via a hierarchical (XML) data model. [Diagram: example document trees with nodes such as Document, fpml, Title, Author (First, Last), Section, Metadata, Trade, Product, Cashflow, Amount, ID, TradeLeg, Event] Slide 11
12 MarkLogic is Schema Agnostic: XML is self-describing <article> <title>MarkLogic Server:...</title> <author> <first-name>Dale</first-name> <last-name>Kim</last-name> </author> <abstract>.... <company>MarkLogic</company> </abstract> <body> <section> <section>...</section> </section> <section>... index... </section> </body> <copyright>Copyright... </copyright> </article> Slide 12
13 MarkLogic is Schema Agnostic: XML is self-describing [Diagram: the article XML from slide 12 shown next to its tree view. The article element has the children title ("MarkLogic Server:..."), author (first-name "Dale", last-name "Kim"), abstract ("..." with company "MarkLogic"), body (nested section elements, one containing "... index ...") and copyright] No Schema Needed! Slide 13
14 How MarkLogic Server Works Indexing and Query Slide 14
15 Universal Index [Diagram: the universal index maps each term to a term list of document references. Word, phrase and stem terms shown include data, base, "data base" and STEM be. Example term lists: 123, 127, 129, 152, 344, ...; ..., 125, 126, 129, 130, ...; ..., 126, 130, 142, 143, ...; ..., 130, 131, 135, 162, ...; ..., 130, 167, 212, 219, ...; 126, 130, 167, ... Other indexed terms: <article>; <article>/<abstract>; <section>/<product>; <product>ims</product>; <title> contains "data"; Collection(Red); Role:Editor + Action:Read] MarkLogic indexes: words, phrases, stemming, structure, values, collections, security permissions. Slide 15
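The term-list idea on this slide can be sketched in a few lines of Python. This is only an illustration of an inverted index whose terms include both words and structural paths; the function names, the `word:` and `path:` prefixes, and the sample documents are hypothetical and are not MarkLogic's API or storage format.

```python
from collections import defaultdict

def build_index(docs):
    """Map each term (a word or a structural path) to a sorted term list of doc ids."""
    index = defaultdict(set)
    for doc_id, (paths, text) in docs.items():
        for word in text.lower().split():
            index["word:" + word].add(doc_id)
        for path in paths:
            index["path:" + path].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

def search(index, *terms):
    """Resolve a query by intersecting the term lists of all requested terms."""
    lists = [set(index.get(term, [])) for term in terms]
    return sorted(set.intersection(*lists)) if lists else []

# Hypothetical sample data: doc id -> (structural paths, text content)
docs = {
    126: (["article/abstract"], "data base systems"),
    130: (["article/abstract", "section/product"], "the data base"),
    167: (["section/product"], "data models"),
}
index = build_index(docs)
result = search(index, "word:data", "word:base", "path:article/abstract")
```

Because words and structure live in the same index, a query that mixes text and structure is a single intersection of term lists rather than a join across separate systems.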
16 Collections and Security. Directories: exclusive, hierarchical, analogous to a file system, based on URI. Collections: set-based, N:N relationship. Security: invisible to your app. Slide 16
17 Scalars: How many of the articles that contain "data base" were written in each of the last 5 decades? [Diagram: the universal index from slide 15, with additional scalar entries such as YEAR and Volume] Slide 17
18 Range Indexes: Map document ids to values, and values to document ids, in a compact memory representation. [Diagram: two tables, DOC ID to VALUE and VALUE to DOC ID] Slide 18
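A minimal sketch of the two mappings, assuming integer year values. The `RangeIndex` class and its methods are hypothetical names chosen for illustration; the sketch says nothing about how MarkLogic lays the index out in memory.

```python
import bisect
from collections import Counter

class RangeIndex:
    """Toy range index: doc id -> value, plus (value, doc id) pairs kept sorted by value."""

    def __init__(self, pairs):
        self.by_doc = dict(pairs)
        self.by_value = sorted((value, doc) for doc, value in self.by_doc.items())

    def docs_in_range(self, lo, hi):
        """Doc ids whose value v satisfies lo <= v < hi, found by binary search."""
        start = bisect.bisect_left(self.by_value, (lo,))
        end = bisect.bisect_left(self.by_value, (hi,))
        return [doc for _, doc in self.by_value[start:end]]

    def count_by_decade(self, doc_ids):
        """Group a set of matching documents by the decade of their value."""
        return Counter(self.by_doc[d] // 10 * 10 for d in doc_ids if d in self.by_doc)

# Hypothetical publication years for four documents
year_index = RangeIndex([(126, 1978), (130, 1985), (167, 1989), (212, 2003)])
matches = [126, 130, 167]          # e.g. doc ids returned by a word query
decades = year_index.count_by_decade(matches)
in_eighties = year_index.docs_in_range(1980, 1990)
```

The value-to-doc direction answers range predicates, while the doc-to-value direction answers the "per decade" style question from the Scalars slide once a word query has produced its matching doc ids.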
19 Geospatial Index: A 2-Dimensional Range Index. Built-in support for: point, box, circle, polygon, complex polygon, polygon intersection, polygon containment. Fully composable with all other indexes! Slide 19
20 How MarkLogic Server Works Event Processing Slide 20
21 Reverse Indexes (Alerting) 1. Load serialized queries as query documents. 2. For a given data document, find all queries that match. Can provide real-time alerts during loads, with no significant performance impact! Can let documents store values as "ranges": documents about cities self-defining their geo boundaries; person documents defining birthdays as ranges, sequences. Can power classifiers and "matchmaker" queries. Slide 21
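The two steps above can be sketched as follows. This is a simplified illustration in which a stored query is just a set of required words; the `ReverseIndex` class, its method names and the sample alerts are hypothetical, and real serialized queries would be far richer.

```python
from collections import defaultdict

class ReverseIndex:
    """Toy reverse index: stored queries are indexed so a new document can find them."""

    def __init__(self):
        self.queries = {}                  # query id -> set of required words
        self.by_term = defaultdict(set)    # word -> ids of queries that mention it

    def add_query(self, query_id, words):
        """Step 1: store a query and index it by each word it requires."""
        required = {w.lower() for w in words}
        self.queries[query_id] = required
        for word in required:
            self.by_term[word].add(query_id)

    def matching_queries(self, text):
        """Step 2: for one incoming document, return every stored query it satisfies."""
        doc_words = set(text.lower().split())
        candidates = set()
        for word in doc_words:
            candidates |= self.by_term.get(word, set())
        return sorted(q for q in candidates if self.queries[q] <= doc_words)

alerts = ReverseIndex()
alerts.add_query("swaps-alert", ["interest", "swap"])
alerts.add_query("fx-alert", ["currency"])
hits = alerts.matching_queries("New interest rate swap booked")
```

Only queries that share at least one word with the document are examined, which is why matching can stay cheap as the number of stored queries grows.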
22 How MarkLogic Server Works Scale-out Slide 22
23 Databases Scale Out: a database of documents, stored in partitions. [Diagram: Database split into Partition 1, Partition 2, Partition 3] Slide 23
24 Shared-Nothing Architecture [Diagram: a tier of E-Nodes above D-Node 1 ... D-Node k, which host Forest 1 ... Forest m] Slide 24
25 How MarkLogic Server Works Analytics Slide 25
26 Range Indexes: A Built-In In-Memory Column Store. Map document ids to values, and values to document ids, in a compact memory representation. [Diagram: two tables, DOC ID to VALUE and VALUE to DOC ID] Range indexes are equivalent to a built-in in-memory column store. Slide 26
27 Scalar Queries and Aggregation Slide 27
28 In-Database MapReduce [Diagram: the E-Node runs start, encode, decode, reduce, finish; each D-Node runs decode, map, reduce, encode over its forests (D-Node 1 ... D-Node k, Forest 1 ... Forest m)] Slide 28
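A rough sketch of the scatter-gather pattern in the diagram, assuming each forest is a plain list of documents and that partial results are merged on the evaluator side. The function names and sample data are hypothetical; the encode and decode steps, which would serialize partial results between hosts, are omitted.

```python
from collections import Counter
from functools import reduce

def map_forest(forest):
    """Runs where the data lives: produce a partial count per product for one forest."""
    return Counter(doc["product"] for doc in forest)

def reduce_partials(left, right):
    """Combine two partial results; Counter addition sums counts per key."""
    return left + right

# Hypothetical forests, each holding a slice of the database
forests = [
    [{"product": "swap"}, {"product": "option"}],
    [{"product": "swap"}],
    [{"product": "option"}, {"product": "swap"}],
]

partials = [map_forest(forest) for forest in forests]     # map step, one per forest
totals = reduce(reduce_partials, partials, Counter())     # final reduce step
```

Only the small per-forest summaries travel to the reducing node, not the documents themselves.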
29 Hadoop MapReduce via Bi-Directional Hadoop Connector [Diagram components: Raw Data; Hadoop; Intermediate Intelligence; Operational Applications; MarkLogic + Connector for Hadoop. Numbered flows (1 to 3) include Bulk Loading and Progressive Enhancement] Slide 29
30 Co-Occurrence Slide 30
31 SQL and BI Tools ODBC SQL Range Indexes Slide 31
32 How MarkLogic Server Works Transactions Slide 32
33 MVCC [Diagram: two versions of /articles/codd.xml, each a Document tree with Title, Author (First, Last), Metadata and Sections; the second version adds Year. Each version carries a creation timestamp (c) and a deleted timestamp (d)] Timestamps can be: increasing integers (before MarkLogic 5); increasing wall time (starting with MarkLogic 5). Slide 33
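The creation and deletion timestamps can be illustrated with a small sketch. It assumes integer timestamps (the pre-MarkLogic 5 style mentioned on the slide); the `Version` and `Store` classes are hypothetical and only show how a reader picks the version visible at a given timestamp.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Version:
    uri: str
    content: str
    created: int                    # creation timestamp
    deleted: Optional[int] = None   # deleted timestamp, None while the version is live

class Store:
    def __init__(self):
        self.versions = []
        self.clock = 0

    def write(self, uri, content):
        """An update never overwrites: it expires the live version and appends a new one."""
        self.clock += 1
        for v in self.versions:
            if v.uri == uri and v.deleted is None:
                v.deleted = self.clock
        self.versions.append(Version(uri, content, self.clock))
        return self.clock

    def read(self, uri, at=None):
        """Return the version visible at timestamp `at` (default: latest), without locks."""
        ts = self.clock if at is None else at
        for v in self.versions:
            if v.uri == uri and v.created <= ts and (v.deleted is None or ts < v.deleted):
                return v.content
        return None

store = Store()
t1 = store.write("/articles/codd.xml", "v1")
t2 = store.write("/articles/codd.xml", "v2")
```

A reader holding timestamp `t1` keeps seeing "v1" even after the second write, which is the basis for lock-free, repeatable reads and point-in-time queries.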
34 MVCC Benefits. Very high throughput: read queries don't require locks; queries and updates do not conflict. ACID transactions: internal 2-phase commit between hosts (forest partitions). Zero latency between ingestion and indexing. [Diagram: /articles/codd.xml document tree with Title, Author (First, Last), Metadata, Year and Sections, labelled 628] Slide 34
35 The Four Forest Operations. Create a new document: into the in-memory stand buffer. Mark a document as expired: a memory-mapped timestamps document per stand. Write buffer out to disk (checkpoint): our buffers are 100s of megabytes; for performance, double buffer. Merge: a background process; optimization: reduces number of stands in forest. Slide 35
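A toy model of the four operations, assuming a stand is just a dictionary of documents plus a set of expired URIs. The `Stand` and `Forest` classes, the tiny `buffer_limit`, and the synchronous merge are all simplifications made up for illustration; the slide describes buffers of hundreds of megabytes and a background merge process.

```python
class Stand:
    def __init__(self, docs=None):
        self.docs = dict(docs or {})   # uri -> content
        self.expired = set()           # uris marked as expired in this stand

class Forest:
    def __init__(self, buffer_limit=2):
        self.buffer_limit = buffer_limit
        self.memory = Stand()          # in-memory stand (write buffer)
        self.disk = []                 # on-disk stands, oldest first

    def insert(self, uri, content):
        """Operation 1: create a new document in the in-memory stand."""
        self.memory.docs[uri] = content
        if len(self.memory.docs) >= self.buffer_limit:
            self.checkpoint()

    def expire(self, uri):
        """Operation 2: mark a document as expired; nothing is rewritten."""
        for stand in [self.memory, *self.disk]:
            if uri in stand.docs:
                stand.expired.add(uri)

    def checkpoint(self):
        """Operation 3: write the buffer out as a new on-disk stand."""
        if self.memory.docs:
            self.disk.append(self.memory)
            self.memory = Stand()

    def merge(self):
        """Operation 4: fold on-disk stands into one, dropping expired documents."""
        live = {}
        for stand in self.disk:
            for uri, content in stand.docs.items():
                if uri not in stand.expired:
                    live[uri] = content
        self.disk = [Stand(live)] if live else []

forest = Forest(buffer_limit=2)
for uri in ["a", "b", "c", "d"]:
    forest.insert(uri, "content of " + uri)
stands_before_merge = len(forest.disk)
forest.expire("b")
forest.merge()
```

Writes and deletes only append or flag, so the expensive cleanup is deferred to the merge step.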
36 Consistency And Throughput. 2-phase commit: transactions span forests. Recovery: forest journals. Lock-free read queries: query at a point in time; repeatable reads; increased throughput; time travel (and near-instant DB rollback). Slide 36
37 HA/DR Features of MarkLogic
Feature | Function | Benefit | Use Case
Database backup/restore | Make a backup of your database, then restore it | Recover from complete data loss | Disaster Recovery
Journal Archiving / Point-In-Time Recovery | Make a continuous backup; restore to a point in time, or to the point of failure | Recover from complete data loss; recover all your data, or to just before a Bad Event | Disaster Recovery
Snapshot backup | Very fast backup using mirrored disk | Recover from complete data loss; take a backup in seconds | Disaster Recovery, High Availability
Database rollback | Roll back to a point in time before a Bad Thing happened | Recover in seconds from human error or a rogue application | Disaster Recovery, High Availability
Automatic Failover (using Shared-Disk or Local-Disk) | If a node fails, automatically fail over to another node | Recover from failure of a data node in a cluster | High Availability
Flexible Replication (part of Replication option) | Maintain a hot copy of (part of) a database in another data center | Move parts of a database, parts of documents, closer to users for improved performance | Information Sharing
Database Replication (part of Replication option) | Maintain a hot copy of a database in another data center | Recover from loss of a data center | Disaster Recovery, High Availability
Distributed Transactions | XA support for transactions that go across MarkLogic and other repositories | Keep an exact (synchronous) copy of your data in more than one place | Disaster Recovery, High Availability, Information Sharing
Slide 37
38 OTC: Derivatives and Exotics. Repository for derivatives and other exotic products (trades, options, swaps, etc.). Key requirements: native JSON support; real-time queries on semi-structured data; 7-year retention; replication; BAR. Customers in production: JP Morgan Chase, Derivatives Core Processing Platform enables risk management for $78 trillion in derivatives. Relevant features: native JSON support; value-based lookups; tiered storage; fine-grained partition management; enterprise-class backup and restore features; replication; clustering. Slide 38
39 Equity Risk Systems. Currently using traditional RDBMS servers and file systems to store intraday and time series data. Requirements: scale out on commodity servers; ease of data modeling for BLOBs (unstructured data); handle complex data. Customers in Production: Where MarkLogic is Providing Similar Solutions. Relevant features: schema agnostic with optional validation; bi-temporal API; user-defined functions; linear scalability on commodity hardware; clustering; binary support; fast native XPath; result and data caching. Slide 39
40 Document Modeling. Enterprise Data Group (EDG) is revamping the workflow for the creation and management of the negotiated documents. Instead of capturing the end image and some metadata, we are modeling the creation of the document through templates, XML or similar documents. The business groups need to dynamically change the agreements, and constantly add new information to meet the day-to-day needs. Customers in production: Morgan Stanley, Citi, Docgenix, JetBlue, LexisNexis. Relevant features: native support for XML, JSON and binary; schema agnostic with optional validation; ACID-compliant CRUD; value-based lookups; Document Library Services; search indexing; LDAP and Kerberos integration; clustering; range indexes. Slide 40
41 Log Analysis. Enable analysis of system and user logs to evaluate user behavior and provide BI. Automatically capture and analyze logs from many sources (web, DB, DW). Perform correlation of events and performance within time slices. Normalize and enhance data with metadata from applications. Analytics pipeline to compute aggregates and statistical data. Dashboards for each source. High data volumes: 1 TB/day for 45 days, ~45 TB of total data; 20M requests/day. Customers in production: Bank of America (enabled Bank of America to map their internal reference data architecture through log analysis). Relevant features: schema-agnostic data model; Hadoop integration; BI tool integration; range indexes; user-defined functions; transformation capabilities; processing framework; visualization framework; application server; linear scalability. Slide 41
42 In Conclusion Slide 42
43 MarkLogic Server is: an operational DBMS with an MVCC-based transaction model and high throughput; an analytic DBMS with an in-memory column store and in-database MapReduce; an unstructured DBMS with an XML data model and ad-hoc schema; a high-performance search engine with a transactional universal index; an event processor with serialized queries and alerting; a unified Big Data platform. Slide 43
44 Questions?? Slide 44
CONSOLIDATING RISK MANAGEMENT AND REGULATORY COMPLIANCE APPLICATIONS USING A UNIFIED DATA PLATFORM
CONSOLIDATING RISK MANAGEMENT AND REGULATORY COMPLIANCE APPLICATIONS USING A UNIFIED PLATFORM Executive Summary Financial institutions have implemented and continue to implement many disparate applications
More informationMarkLogic 8 Overview of Key Features COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.
MarkLogic 8 Overview of Key Features Enterprise NoSQL Database Platform Flexible Data Model Store and manage JSON, XML, RDF, and Geospatial data with a documentcentric, schemaagnostic database Search and
More informationVOLTDB + HP VERTICA. page
VOLTDB + HP VERTICA ARCHITECTURE FOR FAST AND BIG DATA ARCHITECTURE FOR FAST + BIG DATA FAST DATA Fast Serve Analytics BIG DATA BI Reporting Fast Operational Database Streaming Analytics Columnar Analytics
More informationStudy Guide. MarkLogic Professional Certification. Taking a Written Exam. General Preparation. Developer Written Exam Guide
Study Guide MarkLogic Professional Certification Taking a Written Exam General Preparation Developer Written Exam Guide Administrator Written Exam Guide Example Written Exam Questions Hands-On Exam Overview
More information5 Fundamental Strategies for Building a Data-centered Data Center
5 Fundamental Strategies for Building a Data-centered Data Center June 3, 2014 Ken Krupa, Chief Field Architect Gary Vidal, Solutions Specialist Last generation Reference Data Unstructured OLTP Warehouse
More informationBuilding a Data Strategy for a Digital World
Building a Data Strategy for a Digital World Jason Hunter, CTO, APAC Data Challenge: Pushing the Limits of What's Possible The Art of the Possible Multiple Government Agencies Data Hub 100 s of Service
More informationBEYOND THE RDBMS: WORKING WITH RELATIONAL DATA IN MARKLOGIC
BEYOND THE RDBMS: WORKING WITH RELATIONAL DATA IN MARKLOGIC Rob Rudin, Solutions Specialist, MarkLogic Agenda Introduction The problem getting relational data into MarkLogic Demo how to do this SLIDE:
More informationBig Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara
Big Data Technology Ecosystem Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara Agenda End-to-End Data Delivery Platform Ecosystem of Data Technologies Mapping an End-to-End Solution Case
More informationHow Insurers are Realising the Promise of Big Data
How Insurers are Realising the Promise of Big Data Jason Hunter, CTO Asia-Pacific, MarkLogic A Big Data Challenge: Pushing the Limits of What's Possible The Art of the Possible Multiple Government Agencies
More informationScott Meder Senior Regional Sales Manager
www.raima.com Scott Meder Senior Regional Sales Manager scott.meder@raima.com Short Introduction to Raima What is Data Management What are your requirements? How do I make the right decision? - Architecture
More informationA Single Source of Truth
A Single Source of Truth is it the mythical creature of data management? In the world of data management, a single source of truth is a fully trusted data source the ultimate authority for the particular
More informationBring Context To Your Machine Data With Hadoop, RDBMS & Splunk
Bring Context To Your Machine Data With Hadoop, RDBMS & Splunk Raanan Dagan and Rohit Pujari September 25, 2017 Washington, DC Forward-Looking Statements During the course of this presentation, we may
More information2014 年 3 月 13 日星期四. From Big Data to Big Value Infrastructure Needs and Huawei Best Practice
2014 年 3 月 13 日星期四 From Big Data to Big Value Infrastructure Needs and Huawei Best Practice Data-driven insight Making better, more informed decisions, faster Raw Data Capture Store Process Insight 1 Data
More informationIBM Data Replication for Big Data
IBM Data Replication for Big Data Highlights Stream changes in realtime in Hadoop or Kafka data lakes or hubs Provide agility to data in data warehouses and data lakes Achieve minimum impact on source
More informationMAPR TECHNOLOGIES, INC. TECHNICAL BRIEF APRIL 2017 MAPR SNAPSHOTS
MAPR TECHNOLOGIES, INC. TECHNICAL BRIEF APRIL 2017 MAPR SNAPSHOTS INTRODUCTION The ability to create and manage snapshots is an essential feature expected from enterprise-grade storage systems. This capability
More informationFrom Data Challenge to Data Opportunity
From Data Challenge to Data Opportunity Jason Hunter, CTO Asia-Pacific, MarkLogic A Big Data Challenge: Pushing the Limits of What's Possible The Art of the Possible Multiple Government Agencies Data Hub
More informationThe Technology of the Business Data Lake. Appendix
The Technology of the Business Data Lake Appendix Pivotal data products Term Greenplum Database GemFire Pivotal HD Spring XD Pivotal Data Dispatch Pivotal Analytics Description A massively parallel platform
More informationModern Data Warehouse The New Approach to Azure BI
Modern Data Warehouse The New Approach to Azure BI History On-Premise SQL Server Big Data Solutions Technical Barriers Modern Analytics Platform On-Premise SQL Server Big Data Solutions Modern Analytics
More informationFrom Single Purpose to Multi Purpose Data Lakes. Thomas Niewel Technical Sales Director DACH Denodo Technologies March, 2019
From Single Purpose to Multi Purpose Data Lakes Thomas Niewel Technical Sales Director DACH Denodo Technologies March, 2019 Agenda Data Lakes Multiple Purpose Data Lakes Customer Example Demo Takeaways
More informationHow Apache Hadoop Complements Existing BI Systems. Dr. Amr Awadallah Founder, CTO Cloudera,
How Apache Hadoop Complements Existing BI Systems Dr. Amr Awadallah Founder, CTO Cloudera, Inc. Twitter: @awadallah, @cloudera 2 The Problems with Current Data Systems BI Reports + Interactive Apps RDBMS
More informationData Analytics at Logitech Snowflake + Tableau = #Winning
Welcome # T C 1 8 Data Analytics at Logitech Snowflake + Tableau = #Winning Avinash Deshpande I am a futurist, scientist, engineer, designer, data evangelist at heart Find me at Avinash Deshpande Chief
More information<Insert Picture Here> Introduction to Big Data Technology
Introduction to Big Data Technology The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into
More informationLambda Architecture for Batch and Stream Processing. October 2018
Lambda Architecture for Batch and Stream Processing October 2018 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Notices This document is provided for informational purposes only.
More information5/24/ MVP SQL Server: Architecture since 2010 MCT since 2001 Consultant and trainer since 1992
2014-05-20 MVP SQL Server: Architecture since 2010 MCT since 2001 Consultant and trainer since 1992 @SoQooL http://blog.mssqlserver.se Mattias.Lind@Sogeti.se 1 The evolution of the Microsoft data platform
More informationImplementing a Big Data Strategy PRASA Passenger Rail Agency of South Africa
Implementing a Big Data Strategy PRASA Passenger Rail Agency of South Africa MarkLogic World 2016 San Francisco AGENDA Agenda Introduction About the customer Project Goals Challenges The Solution Demo
More informationModern Stream Processing with Apache Flink
1 Modern Stream Processing with Apache Flink Till Rohrmann GOTO Berlin 2017 2 Original creators of Apache Flink da Platform 2 Open Source Apache Flink + da Application Manager 3 What changes faster? Data
More informationMicrosoft SQL Server
Microsoft SQL Server Abstract This white paper outlines the best practices for Microsoft SQL Server Failover Cluster Instance data protection with Cohesity DataPlatform. December 2017 Table of Contents
More informationRA-GRS, 130 replication support, ZRS, 130
Index A, B Agile approach advantages, 168 continuous software delivery, 167 definition, 167 disadvantages, 169 sprints, 167 168 Amazon Web Services (AWS) failure, 88 CloudTrail Service, 21 CloudWatch Service,
More informationAcquiring Big Data to Realize Business Value
Acquiring Big Data to Realize Business Value Agenda What is Big Data? Common Big Data technologies Use Case Examples Oracle Products in the Big Data space In Summary: Big Data Takeaways
More informationChallenges for Data Driven Systems
Challenges for Data Driven Systems Eiko Yoneki University of Cambridge Computer Laboratory Data Centric Systems and Networking Emergence of Big Data Shift of Communication Paradigm From end-to-end to data
More informationWHITEPAPER. MemSQL Enterprise Feature List
WHITEPAPER MemSQL Enterprise Feature List 2017 MemSQL Enterprise Feature List DEPLOYMENT Provision and deploy MemSQL anywhere according to your desired cluster configuration. On-Premises: Maximize infrastructure
More informationIntroduction to Big-Data
Introduction to Big-Data Ms.N.D.Sonwane 1, Mr.S.P.Taley 2 1 Assistant Professor, Computer Science & Engineering, DBACER, Maharashtra, India 2 Assistant Professor, Information Technology, DBACER, Maharashtra,
More informationThe Power of Snapshots Stateful Stream Processing with Apache Flink
The Power of Snapshots Stateful Stream Processing with Apache Flink Stephan Ewen QCon San Francisco, 2017 1 Original creators of Apache Flink da Platform 2 Open Source Apache Flink + da Application Manager
More informationSQL Server SQL Server 2008 and 2008 R2. SQL Server SQL Server 2014 Currently supporting all versions July 9, 2019 July 9, 2024
Current support level End Mainstream End Extended SQL Server 2005 SQL Server 2008 and 2008 R2 SQL Server 2012 SQL Server 2005 SP4 is in extended support, which ends on April 12, 2016 SQL Server 2008 and
More informationBUSINESS DATA LAKE FADI FAKHOURI, SR. SYSTEMS ENGINEER, ISILON SPECIALIST. Copyright 2016 EMC Corporation. All rights reserved.
BUSINESS DATA LAKE FADI FAKHOURI, SR. SYSTEMS ENGINEER, ISILON SPECIALIST 1 UNSTRUCTURED DATA GROWTH 75% 78% 80% 2015 71 EB 2016 106 EB 2017 133 EB Total Capacity Shipped, Worldwide % of Unstructured Data
More informationSub Meter Data Import & Storage Platform RFP Questions/Answers
Sub Meter Data Import & Storage Platform RFP Questions/Answers ADDED 10/12/2015 Q: The latter sections of the RFP indicate that you are looking for dashboarding features. Will VEIC accept a proposal that
More informationREGULATORY REPORTING FOR FINANCIAL SERVICES
REGULATORY REPORTING FOR FINANCIAL SERVICES Gordon Hughes, Global Sales Director, Intel Corporation Sinan Baskan, Solutions Director, Financial Services, MarkLogic Corporation Many regulators and regulations
More information1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda
Agenda Oracle9i Warehouse Review Dulcian, Inc. Oracle9i Server OLAP Server Analytical SQL Mining ETL Infrastructure 9i Warehouse Builder Oracle 9i Server Overview E-Business Intelligence Platform 9i Server:
More informationSchema-Agnostic Indexing with Azure Document DB
Schema-Agnostic Indexing with Azure Document DB Introduction Azure DocumentDB is Microsoft s multi-tenant distributed database service for managing JSON documents at Internet scale Multi-tenancy is an
More informationSecurity and Performance advances with Oracle Big Data SQL
Security and Performance advances with Oracle Big Data SQL Jean-Pierre Dijcks Oracle Redwood Shores, CA, USA Key Words SQL, Oracle, Database, Analytics, Object Store, Files, Big Data, Big Data SQL, Hadoop,
More informationConvergence and Collaboration: Transforming Business Process and Workflows
Convergence and Collaboration: Transforming Business Process and Workflows Steven Hagan, Vice President, Server Technologies 1 Copyright 2011, Oracle and/or its affiliates. All rights Convergence & Collaboration:
More informationUNLEASHING THE VALUE OF THE TERADATA UNIFIED DATA ARCHITECTURE WITH ALTERYX
UNLEASHING THE VALUE OF THE TERADATA UNIFIED DATA ARCHITECTURE WITH ALTERYX 1 Successful companies know that analytics are key to winning customer loyalty, optimizing business processes and beating their
More informationEMC Documentum xdb. High-performance native XML database optimized for storing and querying large volumes of XML content
DATA SHEET EMC Documentum xdb High-performance native XML database optimized for storing and querying large volumes of XML content The Big Picture Ideal for content-oriented applications like dynamic publishing
More informationOracle Database 18c and Autonomous Database
Oracle Database 18c and Autonomous Database Maria Colgan Oracle Database Product Management March 2018 @SQLMaria Safe Harbor Statement The following is intended to outline our general product direction.
More informationCombine Native SQL Flexibility with SAP HANA Platform Performance and Tools
SAP Technical Brief Data Warehousing SAP HANA Data Warehousing Combine Native SQL Flexibility with SAP HANA Platform Performance and Tools A data warehouse for the modern age Data warehouses have been
More informationDistributed File Systems II
Distributed File Systems II To do q Very-large scale: Google FS, Hadoop FS, BigTable q Next time: Naming things GFS A radically new environment NFS, etc. Independence Small Scale Variety of workloads Cooperation
More informationHAWQ: A Massively Parallel Processing SQL Engine in Hadoop
HAWQ: A Massively Parallel Processing SQL Engine in Hadoop Lei Chang, Zhanwei Wang, Tao Ma, Lirong Jian, Lili Ma, Alon Goldshuv Luke Lonergan, Jeffrey Cohen, Caleb Welton, Gavin Sherry, Milind Bhandarkar
More informationApril Copyright 2013 Cloudera Inc. All rights reserved.
Hadoop Beyond Batch: Real-time Workloads, SQL-on- Hadoop, and the Virtual EDW Headline Goes Here Marcel Kornacker marcel@cloudera.com Speaker Name or Subhead Goes Here April 2014 Analytic Workloads on
More informationTechnical Sheet NITRODB Time-Series Database
Technical Sheet NITRODB Time-Series Database 10X Performance, 1/10th the Cost INTRODUCTION "#$#!%&''$!! NITRODB is an Apache Spark Based Time Series Database built to store and analyze 100s of terabytes
More informationSafe Harbor Statement
Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment
More informationOracle NoSQL Database Overview Marie-Anne Neimat, VP Development
Oracle NoSQL Database Overview Marie-Anne Neimat, VP Development June14, 2012 1 Copyright 2012, Oracle and/or its affiliates. All rights Agenda Big Data Overview Oracle NoSQL Database Architecture Technical
More informationIn-Memory Data Management
In-Memory Data Management Martin Faust Research Assistant Research Group of Prof. Hasso Plattner Hasso Plattner Institute for Software Engineering University of Potsdam Agenda 2 1. Changed Hardware 2.
More informationCisco Tetration Analytics Platform: A Dive into Blazing Fast Deep Storage
White Paper Cisco Tetration Analytics Platform: A Dive into Blazing Fast Deep Storage What You Will Learn A Cisco Tetration Analytics appliance bundles computing, networking, and storage resources in one
More informationHortonworks DataFlow Sam Lachterman Solutions Engineer
Hortonworks DataFlow Sam Lachterman Solutions Engineer 1 Hortonworks Inc. 2011 2017. All Rights Reserved Disclaimer This document may contain product features and technology directions that are under development,
More informationMarkLogic Server. Database Replication Guide. MarkLogic 9 May, Copyright 2017 MarkLogic Corporation. All rights reserved.
Database Replication Guide 1 MarkLogic 9 May, 2017 Last Revised: 9.0-3, September, 2017 Copyright 2017 MarkLogic Corporation. All rights reserved. Table of Contents Table of Contents Database Replication
More informationMicrosoft Azure Databricks for data engineering. Building production data pipelines with Apache Spark in the cloud
Microsoft Azure Databricks for data engineering Building production data pipelines with Apache Spark in the cloud Azure Databricks As companies continue to set their sights on making data-driven decisions
More informationHYBRID TRANSACTION/ANALYTICAL PROCESSING COLIN MACNAUGHTON
HYBRID TRANSACTION/ANALYTICAL PROCESSING COLIN MACNAUGHTON WHO IS NEEVE RESEARCH? Headquartered in Silicon Valley Creators of the X Platform - Memory Oriented Application Platform Passionate about high
More information<Insert Picture Here> Oracle NoSQL Database A Distributed Key-Value Store
Oracle NoSQL Database A Distributed Key-Value Store Charles Lamb The following is intended to outline our general product direction. It is intended for information purposes only,
More informationFlash Storage Complementing a Data Lake for Real-Time Insight
Flash Storage Complementing a Data Lake for Real-Time Insight Dr. Sanhita Sarkar Global Director, Analytics Software Development August 7, 2018 Agenda 1 2 3 4 5 Delivering insight along the entire spectrum
More informationIBM Spectrum Protect Version Introduction to Data Protection Solutions IBM
IBM Spectrum Protect Version 8.1.2 Introduction to Data Protection Solutions IBM IBM Spectrum Protect Version 8.1.2 Introduction to Data Protection Solutions IBM Note: Before you use this information
More informationBest practices for building a Hadoop Data Lake Solution CHARLOTTE HADOOP USER GROUP
Best practices for building a Hadoop Data Lake Solution CHARLOTTE HADOOP USER GROUP 07.29.2015 LANDING STAGING DW Let s start with something basic Is Data Lake a new concept? What is the closest we can
More informationAchieving Horizontal Scalability. Alain Houf Sales Engineer
Achieving Horizontal Scalability Alain Houf Sales Engineer Scale Matters InterSystems IRIS Database Platform lets you: Scale up and scale out Scale users and scale data Mix and match a variety of approaches
More informationNPP & Blockchain Have you thought about the data? Ken Krupa, CTO, MarkLogic
NPP & Blockchain Have you thought about the data? Ken Krupa, CTO, MarkLogic Hello SLIDE: 2 14 COPYRIGHT November 2017 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. A QUICK LOOK New Payments Platform Open
More informationOracle Big Data Connectors
Oracle Big Data Connectors Oracle Big Data Connectors is a software suite that integrates processing in Apache Hadoop distributions with operations in Oracle Database. It enables the use of Hadoop to process
More informationDocAve 6 Software Platform Service Pack 1
DocAve 6 Software Platform Service Pack 1 Release Notes For Microsoft SharePoint Release Date: September 25, 2012 1 New Features and Improvements General The new Plan Groups feature helps organize and
More informationRickard Linck Client Technical Professional Core Database and Lifecycle Management Common Analytic Engine Cloud Data Servers On-Premise Data Servers
Rickard Linck Client Technical Professional Core Database and Lifecycle Management Common Analytic Engine Cloud Data Servers On-Premise Data Servers Watson Data Platform Reference Architecture Business
More informationHadoop Beyond Batch: Real-time Workloads, SQL-on- Hadoop, and thevirtual EDW Headline Goes Here
Hadoop Beyond Batch: Real-time Workloads, SQL-on- Hadoop, and thevirtual EDW Headline Goes Here Marcel Kornacker marcel@cloudera.com Speaker Name or Subhead Goes Here 2013-11-12 Copyright 2013 Cloudera
More informationMODERN BIG DATA DESIGN PATTERNS CASE DRIVEN DESIGNS
MODERN BIG DATA DESIGN PATTERNS CASE DRIVEN DESIGNS SUJEE MANIYAM FOUNDER / PRINCIPAL @ ELEPHANT SCALE www.elephantscale.com sujee@elephantscale.com HI, I'M SUJEE MANIYAM Founder / Principal @ ElephantScale
More informationMaking the Most of Hadoop with Optimized Data Compression (and Boost Performance) Mark Cusack. Chief Architect RainStor
Making the Most of Hadoop with Optimized Data Compression (and Boost Performance) Mark Cusack Chief Architect RainStor Agenda Importance of Hadoop + data compression Data compression techniques Compression,
More informationMicrosoft Big Data and Hadoop
Microsoft Big Data and Hadoop Lara Rubbelke @sqlgal Cindy Gross @sqlcindy 2 The world of data is changing The 4Vs of Big Data http://nosql.mypopescu.com/post/9621746531/a-definition-of-big-data 3 Common
More informationCraig Blitz Oracle Coherence Product Management
Software Architecture for Highly Available, Scalable Trading Apps: Meeting Low-Latency Requirements Intentionally Craig Blitz Oracle Coherence Product Management 1 Copyright 2011, Oracle and/or its affiliates.
More informationMarkLogic Server. Database Replication Guide. MarkLogic 6 September, Copyright 2012 MarkLogic Corporation. All rights reserved.
Database Replication Guide 1 MarkLogic 6 September, 2012 Last Revised: 6.0-1, September, 2012 Copyright 2012 MarkLogic Corporation. All rights reserved. Database Replication Guide 1.0 Database Replication
More informationSAP IQ - Business Intelligence and vertical data processing with 8 GB RAM or less
SAP IQ - Business Intelligence and vertical data processing with 8 GB RAM or less Dipl.- Inform. Volker Stöffler Volker.Stoeffler@DB-TecKnowledgy.info Public Agenda Introduction: What is SAP IQ - in a
More informationTHE EMC ISILON STORY. Big Data In The Enterprise. Deya Bassiouni Isilon Regional Sales Manager Emerging Africa, Egypt & Lebanon.
THE EMC ISILON STORY Big Data In The Enterprise Deya Bassiouni Isilon Regional Sales Manager Emerging Africa, Egypt & Lebanon August, 2012 1 Big Data In The Enterprise Isilon Overview Isilon Technology
More informationAnyMiner 3.0, Real-time Big Data Analysis Solution for Everything Data Analysis. Mar 25, TmaxSoft Co., Ltd. All Rights Reserved.
AnyMiner 3.0, Real-time Big Data Analysis Solution for Everything Data Analysis Mar 25, 2015 2015 TmaxSoft Co., Ltd. All Rights Reserved. Ⅰ Ⅱ Ⅲ Platform for Net IT AnyMiner, Real-time Big Data Analysis Solution AnyMiner
More informationCloud Analytics and Business Intelligence on AWS
Cloud Analytics and Business Intelligence on AWS Enterprise Applications Virtual Desktops Sharing & Collaboration Platform Services Analytics Hadoop Real-time Streaming Data Machine Learning Data Warehouse
More informationDelivering a 360° View in Healthcare and Life Sciences With Agile Data
Delivering a 360° View in Healthcare and Life Sciences With Agile Data Imran Chaudhri, @imrantech, Solutions Director, Healthcare & Life Sciences Mark Ferneau, @ferneau, Practice Manager, Healthcare &
More informationWhat is MarkLogic Server? An overview
An overview By Jason Hunter October 2010 Table of Contents 3 What is MarkLogic Server? 3 Document Centric 3 Transactional 4 Search-Centric 4 Structure Aware 5 Schema Agnostic 5 XQuery and XSLT Driven 6
More informationPart 1: Indexes for Big Data
JethroData Making Interactive BI for Big Data a Reality Technical White Paper This white paper explains how JethroData can help you achieve a truly interactive response time for BI on big data,
More informationData Movement & Tiering with DMF 7
Data Movement & Tiering with DMF 7 Kirill Malkin Director of Engineering April 2019 Why Move or Tier Data? We wish we could keep everything in DRAM, but It's volatile It's expensive Data in Memory 2 Why
More informationMap-Reduce. Marco Mura 2010 March, 31th
Map-Reduce Marco Mura (mura@di.unipi.it) 2010 March, 31th This paper is a note from the 2009-2010 course Strumenti di programmazione per sistemi paralleli e distribuiti and it's based on the lessons of
More informationSQL Server New innovations. Ivan Kosyakov. Technical Architect, Ph.D., Microsoft Technology Center, New York
2016 New innovations Ivan Kosyakov Technical Architect, Ph.D., http://biz-excellence.com Microsoft Technology Center, New York The explosion of data sources... 25B 1.3B 4.0B There's an opportunity to drive
More informationStorage for HPC, HPDA and Machine Learning (ML)
for HPC, HPDA and Machine Learning (ML) Frank Kraemer, IBM Systems Architect mailto:kraemerf@de.ibm.com IBM Data Management for Autonomous Driving (AD) significantly increase development efficiency by
More informationValue of TimesTen Oracle TimesTen Product Overview
Value of TimesTen Oracle TimesTen Product Overview Shig Hiura Sales Consultant, Oracle Embedded Global Business Unit When You Think Database SQL RDBMS Results RDBMS + client/server
More informationMarkLogic Server. Monitoring MarkLogic Guide. MarkLogic 9 May, Copyright 2017 MarkLogic Corporation. All rights reserved.
Monitoring MarkLogic Guide 1 MarkLogic 9 May, 2017 Last Revised: 9.0-2, July, 2017 Copyright 2017 MarkLogic Corporation. All rights reserved. Table of Contents Monitoring MarkLogic Guide
More informationThe following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material,
More informationIBM Tivoli Storage Manager Version Introduction to Data Protection Solutions IBM
IBM Tivoli Storage Manager Version 7.1.6 Introduction to Data Protection Solutions IBM Note: Before you use this
More informationA Distributed System Case Study: Apache Kafka. High throughput messaging for diverse consumers
A Distributed System Case Study: Apache Kafka High throughput messaging for diverse consumers As always, this is not a tutorial Some of the concepts may no longer be part of the current system or implemented
More informationMicrosoft SQL Server Database Administration
Address:- #403, 4 th Floor, Manjeera Square, Beside Prime Hospital, Ameerpet, Hyderabad 500038 Contact: - 040/66777220, 9177166122 Microsoft SQL Server Database Administration Course Overview This is 100%
More informationData Management Glossary
Data Management Glossary A Access path: The route through a system by which data is found, accessed and retrieved Agile methodology: An approach to software development which takes incremental, iterative
More informationLambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL. May 2015
Lambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL May 2015 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved. Notices This document
More informationRealTime. RealTime. Real risks. Data recovery now possible in minutes, not hours or days. A Vyant Technologies Product. Situation Analysis
RealTime A Vyant Technologies Product Real risks Data recovery now possible in minutes, not hours or days RealTime Vyant Technologies: data recovery in minutes Situation Analysis It is no longer acceptable
More informationAutomating Information Lifecycle Management with
Automating Information Lifecycle Management with Oracle Database 2c The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated
More informationMySQL Cluster for Real Time, HA Services
MySQL Cluster for Real Time, HA Services Bill Papp (bill.papp@oracle.com) Principal MySQL Sales Consultant Oracle Agenda Overview of MySQL Cluster Design Goals, Evolution, Workloads,
More informationApproaching the Petabyte Analytic Database: What I learned
Disclaimer This document is for informational purposes only and is subject to change at any time without notice. The information in this document is proprietary to Actian and no part of this document may
More informationMarkLogic Server. Administrator s Guide. MarkLogic 9 May, Copyright 2017 MarkLogic Corporation. All rights reserved.
Administrator's Guide 1 MarkLogic 9 May, 2017 Last Revised: 9.0-3, September, 2017 Copyright 2017 MarkLogic Corporation. All rights reserved. Table of Contents Administrator's Guide 1.0
More informationMicrosoft SQL Server Fix Pack 15. Reference IBM
Microsoft SQL Server 6.3.1 Fix Pack 15 Reference IBM Note Before using this information and the product it supports, read the information in Notices
More informationLazyBase: Trading freshness and performance in a scalable database
LazyBase: Trading freshness and performance in a scalable database (EuroSys 2012) Jim Cipar, Greg Ganger, *Kimberly Keeton, *Craig A. N. Soules, *Brad Morrey, *Alistair Veitch PARALLEL DATA LABORATORY
More informationSolution Brief. Bridging the Infrastructure Gap for Unstructured Data with Object Storage. 89 Fifth Avenue, 7th Floor. New York, NY 10003
89 Fifth Avenue, 7th Floor New York, NY 10003 www.theedison.com @EdisonGroupInc 212.367.7400 Solution Brief Bridging the Infrastructure Gap for Unstructured Data with Object Storage Printed in the United
More informationIncrease Value from Big Data with Real-Time Data Integration and Streaming Analytics
Increase Value from Big Data with Real-Time Data Integration and Streaming Analytics Cy Erbay Senior Director Striim Executive Summary Striim is Uniquely Qualified to Solve the Challenges of Real-Time
More information