Understanding Oracle RAC (12.1.0.2) Internals: The Cache Fusion Edition
Markus Michalewicz, Director of Product Management, Oracle Real Application Clusters (RAC)
November 19th, 2014
@OracleRACpm
http://www.linkedin.com/in/markusmichalewicz
http://www.slideshare.net/markusmichalewicz
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.
The Secret to Horizontal Scalability of Any System
[Figure: a straight vertical line; panels: Simplification 1, Simplification 2]
Three Alternative Architectures: The Same Idea Applies
Note: germany, argentina, and brazil are node names. They do not indicate a stretched cluster.
Program Agenda
1. Cache Fusion Overview
2. (Dynamic Re-) Mastering
3. Handling Contentions
4. RAC Meets Multitenant
5. RAC Meets In-Memory
Oracle RAC Combines it All & Adds Services
[Figure legend: Listener, Service, Instance, Session, Requester, Holder, Master, PQ/PX. Block, unless otherwise stated.]
Listeners and Services Direct Work to an Instance
In an Oracle RAC environment, listeners establish connections with an instance. Two types of listeners are used for this purpose:
1. SCAN and SCAN listener
2. Local listeners per node
Services are used to load balance work. Local listeners establish connections on instances where the service is offered. Once a session is established within an instance, the session stays there until it is closed.
Scaling Principles
Does the application scale on a single (SMP) system? If it does, it is likely to scale horizontally.
Scalability is measured considering the whole system.
Getting Access to Data
Local cache hit: access time in nanoseconds
Remote cache hit: access time in microseconds
Whether local or remote, cache hits are always faster than reading from spinning disks.
Data is on disk:
- Flash cache: access time in microseconds
- Disk controller cache: access time in microseconds
- Spinning disk: access time in milliseconds
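To make these orders of magnitude concrete, the following sketch computes a weighted average access time for a mix of access paths. The latency figures and the access mixes are illustrative assumptions chosen only to match the units named above (nanoseconds, microseconds, milliseconds); they are not measurements from this presentation.

```python
# Assumed latencies in seconds. They only reflect the orders of magnitude
# named on the slide and are not measured values.
ASSUMED_LATENCY = {
    "local_cache": 100e-9,    # nanoseconds range
    "remote_cache": 100e-6,   # microseconds range
    "flash_cache": 200e-6,    # microseconds range
    "spinning_disk": 5e-3,    # milliseconds range
}


def average_access_time(mix):
    """Return the weighted average latency for a dict of path -> fraction."""
    total = sum(mix.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    return sum(ASSUMED_LATENCY[path] * share for path, share in mix.items())


if __name__ == "__main__":
    # Hypothetical access mixes, for illustration only.
    mostly_cached = {"local_cache": 0.90, "remote_cache": 0.08, "spinning_disk": 0.02}
    disk_heavy = {"local_cache": 0.50, "spinning_disk": 0.50}
    for name, mix in (("mostly cached", mostly_cached), ("disk heavy", disk_heavy)):
        print(f"{name}: {average_access_time(mix) * 1e6:.1f} microseconds")
```

Even a small share of spinning-disk reads dominates the average, which is why a remote cache hit is still preferable to a disk read.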
Dynamic Retrieval of Data
In order to fulfill a given request, Oracle RAC can decide to:
- Assemble data spread across instances and disks
- Perform a (full) disk read for (parts of) the data
- Use assembly and (full) disk reads dynamically
The decision on which access path to use depends on various factors (e.g. IO capacity, network utilization). The respective parameters are monitored, and the access path can change accordingly and dynamically.
Once data has been shipped to an instance, it resides in the cache for further (local) access. Updates are communicated as required (via messages).
Maximum Three Way Communication
[Figure legend: Message, Message, Block]
Case 1 (all local): all entities in one instance (data holding instance, session holding instance, mastering instance).
Case 2 (local / remote): some entities in one instance (data holding instance, session holding instance). Two-way communication.
Case 3 (all distributed): all entities are dispersed. Three-way communication to obtain a given block of data.
Message operations have been subject to improvements since 10g.
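The three cases can be sketched as a small classification function. This is a simplified model, assuming that the number of distinct instances among the session holding, mastering, and data holding instances determines the case; the function and argument names are hypothetical and do not correspond to any Oracle interface.

```python
def classify_block_request(requester, master, holder):
    """Classify a block request by how many distinct instances are involved.

    requester: instance hosting the session that wants the block
    master:    instance mastering the block
    holder:    instance currently holding the block in its cache
    """
    involved = {requester, master, holder}
    if len(involved) == 1:
        return "local"       # Case 1: all entities in one instance
    if len(involved) == 2:
        return "two-way"     # Case 2: some entities share an instance
    return "three-way"       # Case 3: all entities are dispersed


if __name__ == "__main__":
    # Node names taken from the slides.
    print(classify_block_request("germany", "germany", "germany"))   # local
    print(classify_block_request("germany", "germany", "brazil"))    # two-way
    print(classify_block_request("germany", "argentina", "brazil"))  # three-way
```

Because only three roles exist, the result never exceeds three-way communication, regardless of how many nodes the cluster has.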
Mastering and Dynamic Re-Mastering
[Figure: three cluster states, each showing the Oracle GI HUB nodes germany, argentina, and brazil]
In Oracle RAC, every object is mastered (as part of Global Cache Services / GCS) by an instance. The Global Resource Directory (GRD) is used to keep track of mastering. The illustration above is symbolic in this respect.
If a session requests access to a block that is not mastered by the instance hosting the session, the master instance needs to be contacted (via message).
Based on user access patterns, a Dynamic Re-Mastering (DRM) can be triggered, changing the master for an object from one instance to another. Relocating the master to the instance where the data is requested most reduces messaging.
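The effect of re-mastering on messaging can be illustrated with a toy model. This is not how GCS or the GRD is implemented: the class, the counting scheme, and the remaster_threshold trigger are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict


class ToyResourceDirectory:
    """Toy stand-in for a resource directory: object -> mastering instance."""

    def __init__(self, remaster_threshold=3):
        self.master = {}                       # object name -> instance name
        self.requests = defaultdict(Counter)   # object name -> per-instance request counts
        self.remaster_threshold = remaster_threshold  # hypothetical trigger value
        self.messages = 0                      # requests that had to contact a remote master

    def register(self, obj, instance):
        self.master[obj] = instance

    def request(self, obj, instance):
        """Record one access and return the instance mastering obj at request time."""
        current = self.master[obj]
        if instance != current:
            self.messages += 1                 # the master must be contacted via message
        self.requests[obj][instance] += 1
        self._maybe_remaster(obj)
        return current

    def _maybe_remaster(self, obj):
        counts = self.requests[obj]
        busiest, hits = counts.most_common(1)[0]
        current = self.master[obj]
        # Move the master once another instance clearly dominates the accesses.
        if busiest != current and hits - counts[current] >= self.remaster_threshold:
            self.master[obj] = busiest
            counts.clear()                     # start a fresh observation window


if __name__ == "__main__":
    grd = ToyResourceDirectory()
    grd.register("orders", "germany")
    for _ in range(5):
        grd.request("orders", "brazil")
    print(grd.master["orders"], grd.messages)  # brazil 3
```

In this model the first three requests from brazil each cost a message to germany; after the master moves to brazil, the remaining requests are resolved locally.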
Dynamic Re-Mastering (DRM) Recommendations
[Figure: Oracle GI HUB nodes germany, argentina, and brazil]
DRM is used internally and externally. Externally means that access patterns can influence the mastering of user data. Internally, DRM is used, for example, for the management of UNDO data across instances.
DRM-like activity is also performed when instances leave or join the cluster, in which case it is also known as recovery.
- Do not turn DRM off (_gc_policy_time=0)!
- Optimize _gc_policy_minimum to run / trigger DRM at a convenient moment under normal operations.
- Consider using smaller SGA sizes. See MOS note 1619155.1, "Best Practices and Recommendations for RAC databases using SGA larger than 300GB", as applicable.
- dbms_cacheutil can be used to manually set and release affinity under well understood circumstances. Contact Oracle Support for details as needed.
New in 12.1.0.2: Cache Fusion Accelerator
[Figure: Oracle GI nodes germany, argentina, and brazil, each with an LMS process and the Accelerator]
The Cache Fusion Accelerator (CFA) is an OS kernel module (Linux and Solaris only) that can respond directly to certain lock requests via RDSv3. The lock state is saved in memory shared by the database and the kernel.
CFA saves user/kernel context switches, frees up CPU cycles in LMS, and speeds up messages.
CFA will be activated on Engineered Systems over time, including the Oracle Database Appliance.
CFA contributes to better, linear scaling. It is one of a long list of improvements.
How To Handle Contention: Basics
Contention can occur in any multiuser system (even in single-instance (SI) databases).
User data as well as metadata (e.g. an index) can be subject to contention.
You can still guide users to only one instance in an Oracle RAC environment.
As soon as you scale out, contention can occur between instances (not only within an instance).
From a contention perspective, a 2-node setup is basically the same as a setup with 3 or more nodes. The difference is in the message-based communication required to obtain a block, considering the mastering.
Note: for scalability, only / contention needs to be considered.
How To Handle Contention: Considerations
[Figure labels: Sequence, REDO]
Frequent transactional changes to the same data blocks in all instances may result in hot spots.
- A block with pending changes may be pinged by other instances.
- Pending redo must be written to the log before the block can be transferred.
- Latency for a deferred block transfer becomes dependent on the delay for log IO.
Contention can affect related data as much as it can affect the actual user data.
- Right-growing indexes and index contention are common.
- In 99% of OLTP performance issues, hot spots occur on indexes.
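The dependency of a deferred block transfer on log IO can be written down as simple arithmetic. The sketch below assumes an additive model (interconnect time plus log write time when redo is pending); the function name and all latency values are hypothetical and serve only to show the effect.

```python
def block_transfer_latency(network_s, log_write_s, pending_redo):
    """Return the estimated time in seconds to ship one block to another instance."""
    if network_s < 0 or log_write_s < 0:
        raise ValueError("latencies must be non-negative")
    # A block with pending redo is deferred until the log write completes.
    return network_s + (log_write_s if pending_redo else 0.0)


if __name__ == "__main__":
    NETWORK = 200e-6  # assumed interconnect time, in seconds
    for label, log_io in (("fast log device", 300e-6), ("slow log device", 8e-3)):
        clean = block_transfer_latency(NETWORK, log_io, pending_redo=False)
        dirty = block_transfer_latency(NETWORK, log_io, pending_redo=True)
        print(f"{label}: clean={clean * 1e3:.2f} ms, pending redo={dirty * 1e3:.2f} ms")
```

Under these assumptions the transfer of a block with pending redo is dominated by the log device, not by the interconnect.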
How To Handle Contention: General Solutions
[Figure labels: Sequence, REDO]
Avoid frequent transactional changes to the same data blocks in all instances by using partitioning and services.
- Logically partition data so that a subset of the data is handled in one instance only.
- Guide users via services accordingly.
Place redo logs on fast storage (e.g. SSDs) if performance is critical.
- Implemented by default in 11.2.2.4 of Exadata and the Oracle Database Appliance (Smart Logs and SSDs, respectively).
- Separate the disks for logs from other IO-busy disks.
For better cache locality, use either:
- Global hash partitioned indexes
- Locally partitioned indexes
Drop unused indexes.
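Why hash partitioning relieves a right-growing index can be shown with a short sketch: sequential keys that would all land at the right edge of a single index are spread over several partitions, each with its own insertion point. CRC32 is used here only as a deterministic stand-in hash; it is an assumption and not the hash function the database uses.

```python
import zlib
from collections import Counter


def hash_partition(key, partitions):
    """Map a key to a partition number using a deterministic stand-in hash."""
    return zlib.crc32(str(key).encode("ascii")) % partitions


def insertion_points(keys, partitions):
    """Count how many inserts land in each partition's index."""
    return Counter(hash_partition(key, partitions) for key in keys)


if __name__ == "__main__":
    sequence_values = range(1000, 2000)  # monotonically increasing keys
    single = insertion_points(sequence_values, partitions=1)
    spread = insertion_points(sequence_values, partitions=8)
    print("one index:", dict(single))
    print("eight hash partitions:", dict(sorted(spread.items())))
```

With one partition every insert competes for the same index blocks; with several partitions the same inserts are divided among independent insertion points.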
How To Handle Contention: Use Connection Pools
[Figure: connection pools in busy and idle states]
Connection pools limit the number of connections to the database. Example: Oracle Universal Connection Pool (UCP).
UCP supports Connection Affinity (http://docs.oracle.com/database/121/jjucp/rac.htm#jjucp8197):
- Transaction-Based Affinity
- Web Session Affinity
Use Database Resident Connection Pooling when the number of active sessions is much smaller than the number of open sessions.
Fast Application Notification (FAN) enabled connection pools receive load balancing information based on the Workload Repository and on a per-service basis.
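How a pool limits the number of connections can be sketched generically. The class below is a minimal illustration and not UCP or any Oracle API; BoundedPool and the connect factory are hypothetical names, and the "connections" in the usage example are plain placeholder objects.

```python
import queue
from contextlib import contextmanager


class BoundedPool:
    """Minimal pool that never hands out more than max_size connections."""

    def __init__(self, connect, max_size):
        # connect is a caller-supplied factory; all connections are opened up front.
        self._idle = queue.Queue(maxsize=max_size)
        for _ in range(max_size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        # Blocks while all connections are busy; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # return the connection to the idle set

    def idle_count(self):
        return self._idle.qsize()


if __name__ == "__main__":
    pool = BoundedPool(connect=object, max_size=2)
    with pool.connection() as first, pool.connection() as second:
        try:
            with pool.connection(timeout=0.01):
                pass
        except queue.Empty:
            print("pool exhausted: a third caller has to wait")
    print("idle connections:", pool.idle_count())
```

However many application threads ask for work, the database never sees more than max_size sessions from this pool.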
Oracle Multitenant and Oracle RAC: a Symbiosis
[Figure labels: cons_1, cons_2, germany, argentina, Oracle GI HUB, REDO]
Pluggable Databases (PDBs) represent themselves as services in an Oracle RAC Multitenant Database, ensuring extremely fast failover.
PDBs can be used to conveniently align users to instances with all the benefits, providing higher consolidation benefit and agility.
Multitenant Databases help to consolidate. Future improvements will ensure greater efficiency.
- Multithreaded Redo Log Writer already implemented.
- Per-PDB/service optimized GCS operations coming in future.
For more information, see: Oracle Multitenant meets Oracle RAC - http://www.slideshare.net/markusmichalewicz/oraclemultitenant-meets-oracle-rac-ioug-2014-version
Oracle In-Memory and Oracle RAC
Breakthrough: Dual Format Database
[Figure labels: Generate Reports Instantly; Column Store Replaces Analytic Indexes; In-Memory Column Store; Full HA & Integration with Industry Standards]
Oracle In-Memory and Oracle RAC: A Dream-Team
Breakthrough: Dual Format Database
[Figure labels: Column Store Replaces Analytic Indexes; In-Memory Column Store]
"Full support for RAC scale-out means Oracle Database In-Memory can be used on our largest Data Warehouse, enabling more near real-time analytics."
Sudhi Vijayakumar, Senior Oracle DBA, Yahoo Inc.
Oracle Database In-Memory: Unique Fault Tolerance
- Similar to storage mirroring
- Duplicate in-memory columns on another node
- Enabled per table/partition
- Application transparent
- Only available on Engineered Systems
- Downtime eliminated by using the duplicate after a failure
"Downtime is extremely costly for our business. Oracle's In-Memory architecture takes the right approach to balancing real-time speed with continuous availability."
Jens-Christian Pokolm, Analyst IT-DB Architecture & Engineering, Postbank Systems AG
Optimized Use of Memory in Every Layer
[Figure: In-Memory format (compressed, lock mgmt. simplified) and normal buffer cache on the Oracle GI HUB nodes germany, argentina, and brazil; labels: In-Memory Speed and Capacity of Low Cost Disk, Coming soon]
In-Memory compresses data, optimizing the use of memory. Memory allocated for In-Memory is subject to simplified lock management.
PCI Flash works mostly locally. Shared solutions require RAC external synchronization.
Coming soon on ODA: a fully integrated shared flash cache.
Oracle Learning Streams: http://education.oracle.com/streams/
Database and Middleware streams available.
Videos:
- Feature different groups and presenters
- Cover a broad range of topics and products
- FREE OF CHARGE
PM / Dev contributions:
- Oracle Flex Cluster: Optimized Resource Management for the Cloud - Ian Cookson
- Oracle Grid Infrastructure 12c Bundled Agents - Shankar Iyer
- ACFS Product Overview and Use Cases - Ara Shakian
- The Oracle Real Application Clusters (RAC) Family of Solutions - A User Guide - Markus Michalewicz
- Next Generation Oracle Automatic Storage Management - Jim Williams
- Implementing DBaaS with 12c and Quality of Service Management - Mark Scardina
- Practical Performance Management and Tuning - Markus Michalewicz