Sub-Second Response Times with the New In-Memory Analytics in MicroStrategy 10
Onur Kahraman
High Performance Is No Longer a Nice-to-Have in Analytical Applications
- Users expect Google-like performance from analytic applications, especially on mobile devices
- Exploding data volumes and variety require in-memory consolidation and aggregation
- Modern analytical applications contain hundreds of visualizations, distributed to thousands of users daily
Driver of high performance: the drastic drop in the cost of memory, combined with parallel processing, delivers cost-effective performance
MicroStrategy's New In-Memory Architecture Combines 3 Breakthroughs
- In-Memory Data Store
- Massively Parallel Processing on Commodity Hardware
- Look-Ahead Analytics: Integrated Data and Visualization Layers
Together these enable interactive exploration of terabyte datasets by 100,000s of users
Parallel Partitioned In-Memory Cubes
Parallel relational in-memory engine: linear scalability to 1000s of CPUs; flexible schema and partitioned data; 3x to 10x faster; 7x to 20x more users; tightly coupled interactive exploration
- Parallel data connections for a higher fetch rate
- Parallel rendering of visualizations from in-memory cubes
- Much more flexible cube schema, with no unnecessary pre-joins
- Better memory management through a greatly improved in-memory layer
- Support for more than 2B rows per cube by spreading data into multiple 2B-row chunks
- In-memory engine tightly coupled with the visualization engine for fast application response times
MicroStrategy Parallel Partitioned Cubes Coexist with Existing Databases
- Parallel partitioned cubes do not replace databases; they function as a hot data layer for applications requiring high performance
- Drill through to the databases for detail
- Load from databases, or directly from files and Hadoop
Massively Parallel Processing on Commodity Hardware
- Traditional BI: query engines are the bottleneck; execution runs against a single shared memory
- Parallel execution with MSTR 10: parallel query execution and loading over data and memory distributed across inexpensive commodity hardware
Look-Ahead Analytics: Tightly Integrated Data and Visualization Layers
- Traditional BI: the visualization and data layers are loosely coupled; the data layer has no knowledge of the analytics layer's design, so connections are optimized for the lowest common denominator
- Look-ahead analytics: tightly integrated layers enable optimization; the analytics layer globally optimizes the queries sent to the data layer based on its data structures, and the data layer looks ahead and plans its structures based on knowledge of the dashboard
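The look-ahead idea can be illustrated with a minimal sketch. All names here are hypothetical, not MicroStrategy APIs: the point is only that a data layer which can see the dashboard definition up front can pre-aggregate along exactly the groupings its visualizations will request, instead of answering each query from raw rows.

```python
from collections import defaultdict

# Hypothetical dashboard definition: each visualization declares the
# grouping column and metric it will query.
DASHBOARD = [
    {"viz": "revenue_by_region", "group_by": "region", "metric": "revenue"},
    {"viz": "revenue_by_year",   "group_by": "year",   "metric": "revenue"},
]

# Toy in-memory row store standing in for the cube's data.
ROWS = [
    {"region": "East", "year": 2014, "revenue": 100},
    {"region": "West", "year": 2014, "revenue": 200},
    {"region": "East", "year": 2015, "revenue": 300},
]

def look_ahead_aggregates(dashboard, rows):
    """Scan the dashboard definition up front and build one aggregate
    table per visualization, so each viz is answered from a precomputed
    structure rather than a fresh pass over the raw rows."""
    aggregates = {}
    for viz in dashboard:
        totals = defaultdict(float)
        for row in rows:
            totals[row[viz["group_by"]]] += row[viz["metric"]]
        aggregates[viz["viz"]] = dict(totals)
    return aggregates

aggs = look_ahead_aggregates(DASHBOARD, ROWS)
print(aggs["revenue_by_region"])  # {'East': 400.0, 'West': 200.0}
```

In a loosely coupled stack, each visualization would instead issue an opaque query and the data layer would have no chance to plan these structures in advance.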
Faster Data Loads with Parallel Partitioned In-Memory Cubes
- Bottleneck: publishing intelligent cubes takes a very long time because the single-threaded ODBC data fetch is too slow
- MSTR 10: data can now be loaded in parallel, leading to faster cube publishing; tune the number of connections through the VLDB setting (default: 2)
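A minimal sketch of the parallel-load idea, with illustrative names only (the real engine fetches over multiple ODBC connections; here each "connection" is a thread generating its slice locally):

```python
from concurrent.futures import ThreadPoolExecutor

NUM_CONNECTIONS = 4  # analogous to the VLDB parallel-connections setting

def fetch_partition(part_id, total_parts, n_rows=1000):
    """Stand-in for one connection fetching its slice of the table;
    a real fetch would run a range or modulo query against the source."""
    return [r for r in range(n_rows) if r % total_parts == part_id]

def parallel_load(total_parts=NUM_CONNECTIONS):
    # Each partition is fetched on its own thread/connection, then the
    # slices are combined into the in-memory cube's row store.
    with ThreadPoolExecutor(max_workers=total_parts) as pool:
        slices = pool.map(fetch_partition, range(total_parts),
                          [total_parts] * total_parts)
    return sorted(r for s in slices for r in s)

rows = parallel_load()  # all 1000 rows, loaded over 4 concurrent fetches
```

With a single-threaded fetch the load time is bounded by one connection's throughput; spreading the fetch over N connections is what drives the faster cube publishing described above.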
Broader Analytical Capabilities with Parallel Partitioned In-Memory Cubes
- Bottleneck: subset reports were limited to single-pass analytics
- MSTR 10: parallel partitioning supports the generation of multi-pass CSI, covering the full range of analytics on par with the SQL Engine; multi-pass analytics includes support for metric qualifications, relationship filters, etc.
Larger Data Volumes with Parallel Partitioned In-Memory Cubes
- Bottleneck: a data limit of 2B rows per cube
- MSTR 10: overcome the 2B-row limit by partitioning data across the cores of a CPU, splitting the data within the cube into multiple 2B-row chunks and leveraging existing CPU cores more efficiently
[Diagram: a 9.4.1 OLAP cube holds a single 2B-row block on a 16-core CPU, while an MSTR 10 parallel partitioned cube spreads multiple 2B-row chunks across all 16 cores]
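The chunking itself is simple to sketch. A tiny capacity stands in for the real 2-billion-row per-structure limit; the cube as a whole can then exceed the limit that binds any single block:

```python
CHUNK_CAPACITY = 5  # stands in for the 2-billion-row per-structure limit

def split_into_chunks(rows, capacity=CHUNK_CAPACITY):
    """Split one logical cube's rows into multiple physical blocks,
    each at or under the capacity limit."""
    return [rows[i:i + capacity] for i in range(0, len(rows), capacity)]

chunks = split_into_chunks(list(range(12)))
# 12 rows with capacity 5 -> 3 chunks of sizes 5, 5, 2
```

Because each chunk is independent, the engine can also assign chunks to different cores, which is what ties the larger-volume story to the parallel-execution story.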
In-Memory OLAP Services vs. Parallel Partitioned In-Memory Cubes
Access the database with higher throughput:
- 9.4.1 OLAP Services: 5M rows, fetch rate 5,074 KB/sec
- MSTR 10, 8-thread parallel load: 5M rows, fetch rate 22,454 KB/sec
- Result: data uploads about 4 times faster
Create and publish the cube with higher data scalability:
- 9.4.1 OLAP Services: 2.35B rows, failed due to the 2-billion-row limit
- MSTR 10, 8-thread partitioning: 2.35B rows, publish time 5:14:23, cube size 265 GB
- Result: data scalability increases up to 80 times
Analyze the data with faster response time:
- 9.4.1 OLAP Services: 8M rows, response time 0:06:33
- MSTR 10, 8-thread parallel access: 8M rows, response time 0:04:25
- Result: roughly 50% faster data interactions
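The headline factors can be checked from the figures quoted on the slide (note that "50% faster" reads as interaction throughput: 6:33 down to 4:25 is about 1.48x more interactions per unit time, or a roughly one-third shorter response time):

```python
# Figures quoted on the slide
fetch_941, fetch_10 = 5074, 22454   # fetch rate, KB/sec
resp_941 = 6 * 60 + 33              # 0:06:33 -> 393 s
resp_10 = 4 * 60 + 25               # 0:04:25 -> 265 s

fetch_speedup = fetch_10 / fetch_941      # ~4.4x ("4 times faster")
interaction_speedup = resp_941 / resp_10  # ~1.48x ("~50% faster")
```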
In-Memory OLAP Services vs. Parallel Partitioned In-Memory Cubes
- A parallel non-partitioned cube has up to 20% higher capacity than OLAP Services
[Charts: average response time (sec) vs. power rating (kilocycles) for MSTR 10 cubes in 8-partition, 1-partition, and non-partitioned configurations, on 8-core Linux and 8-core Win64]
In-Memory OLAP Services vs. Parallel Partitioned In-Memory Cubes
[Charts: response time (sec) for reports R1-R25 (Customer A) and R1-R29 (Customer B), comparing the 9.4.1 cube, the MSTR 10 non-partitioned cube, and the MSTR 10 32-partition cube]
- The partitioned in-memory cube shows a significant performance gain in almost all cases for both Customer A and Customer B
- The non-partitioned and OLAP cubes have nearly identical response times