Oracle Database In- Memory Implementation Best Practices and Deep Dive [TRN4014] Andy Rivenes Database In-Memory Product Management Oracle Corporation
Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle s products remains at the sole discretion of Oracle.
What is Database In-Memory
Oracle Database In-Memory Real-Time Analytics 100X 100X Accelerate Mixed Workload Risk-Free Trivial to Implement Transactions Analytics Enable Real-Time Business Decisions Run analytics on Operational Systems Proven Scale-Out, Availability, Security No Application Changes Not Limited by Memory
Breakthrough: Dual Format Database Buffer Cache SALES Row Format SALES New In-Memory Column Store SALES Column Format BOTH row and column formats for same table Simultaneously active and transactionally consistent Analytics & reporting use new in-memory Column format OLTP uses proven row format
Oracle In-Memory: Simple to Implement 1. Configure Memory Capacity inmemory_size = XXX GB 2. Configure tables or partitions to be in memory alter table partition inmemory; 3. Later drop analytic indexes to speed up OLTP
Where Is It Available 8
Where Database In-Memory Is Available Database In-Memory is an option for Oracle Database Enterprise Edition Database In-Memory was included in the first patchset (12.1.0.2) for 12.1 and all subsequent Oracle Database releases Available: Exadata Express Cloud Service X1000IM up to 10GB Column Store Database Cloud Service: Extreme Performance Exadata Cloud Service Exadata Cloud at Customer On-premises 9
How Do I Install & Configure The In-Memory Column Store?
Installing : Oracle Database In-Memory Automatically installed as part of Oracle Database (12.1.0.2, 12.2, 18c) No additional steps required Note: Database In- Memory is not enabled by default
Installing: Apply the Latest Database Proactive Bundle Patch Database In-Memory fixes and enhancements are only distributed through Database Proactive Bundle Patches or Release Updates See MOS Notes: 2337415.1 Overview of Database Patch Delivery Methods for 12.2.0.1 and greater 1962125.1 Overview of Database Patch Delivery Methods for 12.1.0.2 and older Starting with the latest patches avoids reinventing the wheel discovering bugs that have already been fixed!
That s all you need to know, but wait 13
What Workloads Benefit From Database In-Memory
Where to use In-Memory OLTP System Reporting Data Warehouse Reporting In-Memory Column Store In-Memory Column Store ETL Staging Layer Foundation Layer 3 rd Normal Form Performance Layer STAR SCHEMA SALES ODS Pre-Cal KPIs Enables real-time reporting directly on OLTP data Speeds data extraction part of ETL process Removes need for separate ODS Speeds up mixed workload Star-schema and pre-calculated KPIs Improves performance of dash-boards All or a subset of Foundation Layer For time-sensitive analytics on 3rd normal form Staging/ETL/Temp not good candidates Write once, read once 15
What are Analytics? Source: Google Search Our definition: Using aggregation to find patterns and trends in the data 16
When Database In-Memory Helps For a non-trivial amount of rows and execution time, when a significant amount of time is spent accessing data is spent joining data is spent on scan and group by operations Majority of time spent joining data Majority of time spent on scan and group by operations Majority of time spent accessing data
Scanning & Filtering Query: Traditional data access Majority of time spent accessing data 18
Join Query Traditional hash join Majority of time spent joining data 19
Aggregation Traditional Group By Query bottlenecked on scan and group by operation 20
How to Compare Benefit Establish a baseline with no in-memory on the destination database If migrating or upgrading then ensure that the new environment runs the workload as well as the old environment Run with In-Memory enabled Ensure a steady state fully adapted plans, multiple executions Use time based comparison SQL Monitor active reports Prevent regressions SQL Plan Baselines Remember that Database In-Memory includes Hash Joins with Bloom filters and Vector Group By Verify that they re being used See the Database In-Memory Implementation and Usage Whitepaper
How does it work 22
Oracle In-Memory Columnar Technology Pure In-Memory Columnar SALES Pure in-memory columnar format Not persistent, and no logging Quick to change data: fast OLTP Enabled at table or partition Only active data in-memory 2x to 20x compression typical SALES Available on all hardware platforms 23
In-Memory A Store Not A Cache Buffer Cache A B C E A A C B F What is a cache? A pool of memory Data automatically brought into memory based on access Data automatically aged out Good example: Oracle Database Buffer Cache 24
In-Memory Area: Static Area within SGA System Global Area SGA Buffer Cache Large Pool Shared Pool Other Log Buffer In- Memory Area Contains data in the new In-Memory Column Format Controlled by INMEMORY_SIZE parameter Minimum size of 100MB SGA_TARGET must be large enough to accommodate this area 25
Composition of In-Memory Area V$INMEMORY_AREA: Current sizes of pools and pool statuses SQL> SELECT * from V$INMEMORY_AREA; POOL ALLOC_BYTES USED_BYTES POPULATE_STATUS ---------- ----------- ---------- --------------- 1MB POOL 1307574272 135266304 DONE 64KB POOL 318767104 524288 DONE V$IM_HEADER: List of IMCUs currently in the In-Memory column store SQL> SELECT OBJD, TSN, ALLOCATED_LEN, NUM_ROWS, NUM_COLS FROM V$IM_HEADER; OBJD TSN ALLOCATED_LEN NUM_ROWS NUM_COLS ---------- ---------- ------------- ---------- ---------- 92918 5 19922944 532246 60 92918 5 19922944 532480 60 92918 5 19922944 522496 60 92918 5 11534336 532480 60 92918 5 19922944 530176 60 92918 5 19922944 532480 60 92918 5 15728640 419562 60
Composition of In-Memory Area IMCU IMCU IMCU IMCU SMU SMU SMU SMU IMCU IMCU SMU SMU SMU SMU IMCU IMCU In Memory Area Column Format Data Metadata Contains two subpools: IMCU pool: Stores In Memory Compression Units (IMCUs) SMU pool: Stores Snapshot Metadata Units (SMUs) IMCUs contain column formatted data SMUs contain metadata and transactional information 27
Composition of In-Memory Compression Unit (IMCU) ROWID EMPID IMCU header Column CUs NAME DEPT SALARY Unit of column store allocation Columnar representation of a large number of rows from an object Rows from one or more table extents Actual size depends on size of rows, compression factor, etc. Extent #13 Blocks 20-120 Extent #14 Blocks 122-182 Extent #15 Blocks 201-301 Each column stored as a separate contiguous Column Compression Unit (column CU) Rowids also stored as a Column CU 28
Why not just cache the table in the row store 29
Compare Column-store to Row-store 30
Why Are Analytic Queries Faster In The In-Memory Column Store? 31
Database In-Memory Technology Scanning and filtering data more efficiently Columnar Format Compression Storage Indexes SIMD Vector Processing Min 1 Max 3 Min 4 Max 7 CPU Load multiple region values Vector Register CA CA CA CA Vector Compare all values an 1 cycle Min 8 Max 12 Access only the columns you need Scan & filter data in compressed format Prune out any unnecessary data from the column Process multiple column values in a single CPU instruction 32
Optimizer Leverages In-Memory Technology Improves key aspects of analytic queries Data Scans Joins In-Memory Aggregation CPU SALES STATE = CA Vector Register CA CA CA CA Table A HASH JOIN Table B Speed of memory Scan and Filter only the needed Columns Vector Instructions Convert Star Joins into 10X Faster Column Scans Search large table for values that match small table Create In-Memory Report Outline that is Populated during Fast Scan Runs Reports Instantly
New in 12.2 for Database In-Memory Real-Time Analytics Mixed Workload Massive Capacity Multi-model Automation Column { } JSON 2X Faster Joins 5X Faster Expressions Active Data Guard Support In-Memory on Exadata Flash Native support for JSON Data type Dynamic Data Movement Between Storage & Memory IM FastStart IM Column Store Re-sizing 34
New in 18c for Database In-Memory Further Performance Gains Optimized Arithmetic Dynamic Scans External Tables Automatic In-Memory 2X Query Performance Gains In-Memory Optimized Arithmetic In-Memory Dynamic Scans In-Memory External Tables Automatic Data Movement Between Storage & Memory 35
How Does The Optimizer Know
Optimizer : New In-Memory Statistics Automatically computed on the fly during hard parse Computed at the segment level table or partition/subpartition # IMCUs # IM Blocks IM Quotient Fraction of table populated in In-Memory column store Value between 0 and 1 # IM Rows # IM Transaction Journal Rows In-Memory statistics are RAC-aware (DUPLICATE and DISTRIBUTE) In 12.2 IM stats are available in the ALL_TAB_STATISTICS view
Optimizer : OPTIMIZER_ADAPTIVE_FEATURES OPTIMIZER_ADAPTIVE_FEATURES has been made obsolete in 12.2 Starting in 12.2: OPTIMIZER_ADAPTIVE_PLANS defaults to true OPTIMIZER_ADAPTIVE_STATISTICS defaults to false In 12.1.0.2: Patches for bugs 21171382 and 22652097 provide 12.2 behavior See MOS Note 2187449.1 Recommendations for Adaptive Features in Oracle Database 12c See the Optimizer blog for more information: blogs.oracle.com/optimizer
Optimizer : Ensure Plan Stability Unpredictable changes to execution plans can destroy performance Avoiding plan changes is the only method to avoid performance regression Solution Optimizer automatically manages execution plans Only known and verified plans are used SQL Plan Management is controlled plan performance
How does Database In-Memory Scale Out
RAC : Scale-Out In-Memory Database to Any Size Scale-Out across servers to grow memory and CPUs Shared nothing architecture IMCUs not shipped across interconnect cache fusion is not in play! In-Memory queries are parallelized across servers to access local column data
RAC : In-Memory and Distribution of Data ALTER TABLE sales INMEMORY; ALTER TABLE sales INMEMORY DISTRIBUTE BY PARTITION; ALTER TABLE sales INMEMORY DISTRIBUTE ROWID RANGE; Distribution allows in memory segments larger than individual instance memory Policy is automatic (Distribute AUTO) or user-specifiable Controlled by DISTRIBUTE subclause Distribute by rowid range Distribute by partition Distribute by subpartition Goal: Ensure Even Distribution
RAC : Database In-Memory Queries in a RAC Environment Shared nothing architecture means Parallel Query must be used to access data Must have a DOP greater than or equal to the number of column stores Query coordinator automatically starts parallel server processes on the correct nodes (Requires Auto DOP in 12.1.0.2)
In-Memory Analytics Extended into Exadata Flash Storage Exadata automatically transforms table data into In-Memory DB columnar formats in Exadata Flash Cache Enables fast vector processing for storage server queries Additional compression for OLTP compressed or uncompressed tables in flash new in 18.1 Enables dictionary lookup and avoids processing unnecessary rows Smart Scan results sent back to database in In-Memory Columnar format Reduces Database node CPU utilization Uniquely optimizes next generation Flash as memory In-Memory Columnar scans Up to 1.5 TB DRAM per Server In-Flash Columnar scans Up to 25.6 TB Flash per Server 44
Engineered Systems: Unique Fault Tolerance Similar to storage mirroring Duplicate in-memory columns on another node Enabled per table/partition Application transparent Performance preserved by using duplicate during a node failure Only Available on Engineered Systems Performance can be improved by performing joins within each node (partial partition wise joins)
Troubleshooting
Troubleshooting Additional Views Confirm all data is fully populated! v$inmemory_area is there enough memory? v$im_segments are all objects fully populated? v$imeu_header are IMEs populated? user_joingroups have join groups been created and populated? v$im_header are all rows populated per object? v$im_smu_head are changes affecting journal space?
Performance History Make sure AWR is available AWR is valuable to measure system level workload AWR is valuable to see how the database is set up Initialization parameters Memory allocation Resource utilization And don't forget the Advisors section Recommend increasing the default AWR retention (7 days) But... AWR is useless for fixing problem SQL
Troubleshooting Analyzing SQL Statements SQL Monitor active reports Shows how SQL was executed and where time was spent See blogs.oracle.com/in-memory for a technical brief on creating SQL Monitor active reports Extended SQL Trace Data Captures where all time was spent Use TKPROF or a third party profiler to identify performance problems Optimizer trace (10053) Shows why the execution plan was chosen See blogs.oracle.com/in-memory for a technical brief on creating an Optimizer trace
Where can I get more information
Additional Resources Join the Conversation https://twitter.com/db_inmemory https://twitter.com/theinmemoryguy https://blogs.oracle.com/in-memory/ https://www.facebook.com/oracledatabase http://www.oracle.com/goto/dbim.html White Papers (otn.com) Oracle Database In-Memory White Paper Oracle Database Implementation and Usage White Paper Oracle Database In-Memory Aggregation Paper When to use Oracle Database In-Memory Oracle Database In-Memory Advisor Videos Oracle Database In-Memory YouTube Channel oracle.com Powering the Real-Time Enterprise oracle.com/us/corporate/events/dbim/index.html YouTube - Juan Loaiza: DBIM: What's new in 12.2 Additional Questions In-Memory blog: blogs.oracle.com/in-memory My email: andy.rivenes@oracle.com
Your Chance to Influence Product Direction and Design! The Cloud Platform UX team would love to get your feedback on new designs. Sign up here: https://tinyurl.com/oracleuserresearch