Large Memory Pages Part 2


Fabio Massimo Ottaviani
EPV Technologies
May 2013

3 Measuring TLB effectiveness

Direct measurements of TLB1 and TLB2 effectiveness are provided by the extended counters collected in SMF 113 [1]. The relevant extended counters for TLB analysis on zEC12 machines are [2]:

E128  Data TLB1 miss cycles
E129  Instruction TLB1 miss cycles
E133  Translation entry written to the Data TLB1
E139  Translation entry written to the Data TLB1 for 1MB page
E140  Translation entry written to the Instruction TLB1
E141  Translation entry written to the Page Table Entry in TLB2
E142  Translation entry written to the Common Region Segment Table Entry in TLB2 for 1MB page
E143  Translation entry written to the Common Region Segment Table Entry in TLB2

By looking at E139 and E142 you can evaluate TLB miss activity for 1MB pages.

By using the following formulas you can evaluate the percentage of CPU cycles [3] used to satisfy TLB1 misses over the total CPU cycles used (a) and the average number of CPU cycles needed to satisfy one TLB1 miss (b):

a) %CPU cycles due to TLB1 miss = (E128+E129) / B0 * 100 * 0.65
b) CPU cycles/TLB1 miss = (E128+E129) / (E133+E140) * 0.65

Note that 0.65 is a correction coefficient used by the IBM Washington Systems Center, and that both the formulas and the coefficient may change in the future.

A graph tracking both values, hour by hour for a full week, for a production system on a zEC12 machine is shown in Figure 4. You can see that the CPU cycles used to satisfy TLB1 misses account for 5-6% of the total CPU cycles used, while between 20 and 30 CPU cycles are needed, on average, to satisfy a TLB1 miss.

[1] All the SMF 113 counters are collected in the EPV for z/OS database.
[2] The meaning of the extended counters differs depending on the machine type.
[3] B0 is the first basic counter collected in SMF 113; it measures the number of CPU cycles used.
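Formulas (a) and (b) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the counter values in the example are invented, and reading the counters from SMF 113 is assumed to happen elsewhere. The 0.65 coefficient is the one quoted in the text and may change.

```python
# Correction coefficient quoted in the text; it may change in the future.
TLB1_COEFF = 0.65


def tlb1_miss_cpu_pct(e128, e129, b0, coeff=TLB1_COEFF):
    """Formula (a): percentage of CPU cycles used to satisfy TLB1 misses."""
    if b0 == 0:
        return None  # no cycles recorded in the interval
    return (e128 + e129) / b0 * 100 * coeff


def cycles_per_tlb1_miss(e128, e129, e133, e140, coeff=TLB1_COEFF):
    """Formula (b): average CPU cycles needed to satisfy one TLB1 miss."""
    misses = e133 + e140  # entries written to the Data and Instruction TLB1
    if misses == 0:
        return None  # no TLB1 misses recorded in the interval
    return (e128 + e129) / misses * coeff


# Invented counter values for one interval (not taken from the paper).
sample = {
    "B0": 1_000_000_000,
    "E128": 60_000_000,
    "E129": 20_000_000,
    "E133": 1_500_000,
    "E140": 500_000,
}

pct = tlb1_miss_cpu_pct(sample["E128"], sample["E129"], sample["B0"])
avg = cycles_per_tlb1_miss(
    sample["E128"], sample["E129"], sample["E133"], sample["E140"]
)
print(f"%CPU cycles due to TLB1 miss: {pct:.1f}")
print(f"CPU cycles/TLB1 miss: {avg:.1f}")
```

Passing the coefficient as a parameter keeps the calculation usable if IBM revises it.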

Figure 4 - [graph: %CPU cycles due to TLB1 miss (scale 0% to 10%) and CPU cycles/TLB1 miss (scale 0 to 35), 06MAY13 to 10MAY13]

4 Using large memory pages

In z/OS, 64-bit virtual storage above the 2GB bar can only be allocated by using memory objects. A memory object is a contiguous range of virtual addresses which is allocated in units of megabytes on a megabyte boundary. One of the important attributes of a memory object is the page size; it can be set to 4KB, 1MB and, starting with zEC12, 2GB. When a memory object uses 1MB (or 2GB) pages it is called a large memory object. Currently 2GB memory pages are non-pageable, while 1MB memory pages are pageable only if Flash Express (available on zEC12) is enabled.

4.1 LFAREA

To enable the use of large memory pages you have to reserve a certain amount of real storage for them by setting the LFAREA parameter in IEASYSxx. The default is LFAREA=0, which means that no large memory pages can be used. LFAREA can be set up to 80% of the online storage available at IPL minus 2 GB [4].

If the system becomes constrained for 4KB pages, it automatically reacts by using free large frames to back 4KB page requests. If the demand for large pages increases later, these frames can be recombined (coalesced) and used to support large pages again.

It's very important to set an appropriate LFAREA value because:
- if the value is too small, there may be no available 1MB frames for applications that could benefit from large memory pages;
- if the value is too large, such that the system does not have enough real storage to back 4KB pages, conversion of large frames will occur, with consequent performance degradation and CPU increase.

The LFAREA parameter syntax can be complex, especially if you want to reserve memory for both 1MB and 2GB pages. Note that if you simply code LFAREA=xG you are reserving x GB of real memory for 1MB pages only; no memory will be reserved for 2GB pages. Please refer to the MVS Initialization and Tuning Reference for more details.

To check whether the LFAREA value is appropriate you can use the DISPLAY VIRTSTOR,LFAREA system command. An example is provided in Figure 5.

[4] The system requires an IPL before the new LFAREA value takes effect.
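The two sizing indicators discussed with Figure 5 can be turned into a simple check on the values returned by DISPLAY VIRTSTOR,LFAREA. The sketch below is only illustrative: the function name is hypothetical, the values are assumed to have been copied by hand from the command output, and the 90% threshold used for "very close to TOTAL LFAREA" is an assumption, not an IBM or EPV recommendation.

```python
def assess_lfarea(total_mb, max_4k_mb, max_pageable_1m_mb, near_full_ratio=0.9):
    """Return a list of findings about the LFAREA size.

    total_mb           -- TOTAL LFAREA
    max_4k_mb          -- MAX LFAREA ALLOCATED (4K)
    max_pageable_1m_mb -- MAX LFAREA ALLOCATED (PAGEABLE1M)
    near_full_ratio    -- assumed threshold for "very close to TOTAL LFAREA"
    """
    findings = []
    if max_4k_mb > 0:
        # Large frames were converted to back 4KB page requests.
        findings.append("LFAREA may be too large")
    if total_mb > 0 and max_pageable_1m_mb / total_mb >= near_full_ratio:
        # The pageable 1MB high water mark is close to the whole LFAREA.
        findings.append("LFAREA may be too small")
    return findings


# TOTAL LFAREA and MAX LFAREA ALLOCATED (4K) are the Figure 5 values.
# 1929 is the current LFAREA ALLOCATED (PAGEABLE1M) value, used here only
# as a lower bound for the high water mark.
result = assess_lfarea(total_mb=2048, max_4k_mb=0, max_pageable_1m_mb=1929)
print(result)
```

An empty list means that neither indicator fired; it does not prove that the LFAREA value is optimal.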

Figure 5

In the above example:
- SOURCE = 00 means the LFAREA value has been taken from the IEASYS00 member;
- TOTAL LFAREA = 2048M is the LFAREA size, in megabytes;
- LFAREA AVAILABLE = 78M is the amount of LFAREA available to 1MB pages, in megabytes;
- LFAREA ALLOCATED (1M) = 41M is the amount of LFAREA allocated on behalf of non-pageable 1MB page requests, in megabytes;
- LFAREA ALLOCATED (4K) = 0M is the amount of LFAREA allocated on behalf of 4KB page requests, in megabytes;
- LFAREA ALLOCATED (PAGEABLE1M) = 1929M is the amount of LFAREA allocated on behalf of pageable 1MB page requests, in megabytes;
- MAX LFAREA ALLOCATED (1M) = 41M is the high water mark of LFAREA allocated on behalf of non-pageable 1MB page requests, in megabytes;
- MAX LFAREA ALLOCATED (4K) = 0M is the high water mark of LFAREA allocated on behalf of 4KB page requests, in megabytes;
- MAX LFAREA ALLOCATED (PAGEABLE1M) = [value missing] is the high water mark of LFAREA allocated on behalf of pageable 1MB page requests, in megabytes.

The following considerations apply here:
- most of the 1MB pages are pageable (the example refers to a zEC12 machine with the Flash Express feature active);
- MAX LFAREA ALLOCATED (4K) is 0; it means that there is no constraint on the real memory used for 4KB pages. If the high water mark of 1MB frames used on behalf of 4KB page requests were greater than zero, you should consider reducing the LFAREA, to avoid the CPU cost of converting large frames to back 4KB pages;
- MAX LFAREA ALLOCATED (PAGEABLE1M) is very close to TOTAL LFAREA; it may indicate that the specified LFAREA value is too small.

Finally, you should also note that no information is available for 2GB pages.

4.2 Current exploiters

Large memory pages are not suitable for all applications. As a general rule they may provide performance value to long-running, memory access-intensive applications.

Current large memory pages exploiters are:
- part of the z/OS nucleus; starting in z/OS V1R12, most of the READONLY nucleus is backed by 1MB pages [5];
- DB2 V10 buffer pools, when the PGFIX(YES) option is specified;
- the JVM, when the -Xlp option is specified; more recent JVM versions automatically use large memory pages if they are available.

The buffer pool PGFIX(YES) option has been available since DB2 V8. It allows you to (almost) permanently fix a pool's buffers in memory; this saves CPU by eliminating the need to fix a buffer in memory and then release it every time a page is read in from disk or written out to disk. In DB2 V10, 1MB page frames can be used for page-fixed buffer pools, providing additional benefits in terms of CPU savings and performance. Large memory pages are non-pageable in this case.

Some -Xlp sub-options are available to request the JVM to allocate the Java object heap or the JIT code cache using large memory pages. These options are shown in the next table, together with the large page sizes supported by 31-bit and 64-bit JVMs [6].
Large Page Size     -Xlp:codecache           -Xlp:objectheap          -Xlp
2GB non-pageable    Not supported            64-bit JVM only          64-bit JVM only
1MB pageable        Not supported            64-bit JVM only          64-bit JVM only
1MB non-pageable    31-bit and 64-bit JVM    31-bit and 64-bit JVM    Not supported

Figure 6

[5] Large memory pages are used in this case independently of the LFAREA settings.
[6] IBM User Guide for Java V7 on z/OS.

5 Available metrics and tools

General information about 1MB memory pages is provided by RMF Monitor I and collected in SMF 71 records. This information is also reported in EPV for z/OS. Details about large memory pages exploiters are provided by RMF Monitor III. No information is available at the moment about 2GB memory page usage.

5.1 SMF 71 and RMF Monitor I

The following metrics, collected in SMF 71, provide information about 1MB large memory pages:

SMF71LFA  Large Frame Area size in bytes
SMF71LOM  Minimum number of 1 MB memory objects allocated in the system
SMF71LOX  Maximum number of 1 MB memory objects allocated in the system
SMF71LOA  Average number of 1 MB memory objects allocated in the system
SMF71L1M  Minimum total number of 1 MB frames that can be used by fixed memory objects
SMF71L1X  Maximum total number of 1 MB frames that can be used by fixed memory objects
SMF71L1A  Average total number of 1 MB frames that can be used by fixed memory objects
SMF71L2M  Minimum number of 1 MB frames in the LFAREA that are not in use
SMF71L2X  Maximum number of 1 MB frames in the LFAREA that are not in use
SMF71L2A  Average number of 1 MB frames in the LFAREA that are not in use
SMF71L3M  Minimum number of 1 MB frames in the LFAREA that are in use by fixed memory objects
SMF71L3X  Maximum number of 1 MB frames in the LFAREA that are in use by fixed memory objects
SMF71L3A  Average number of 1 MB frames in the LFAREA that are in use by fixed memory objects
SMF71L4M  Minimum total number of 1 MB frames that can be used by pageable/dref memory objects
SMF71L4X  Maximum total number of 1 MB frames that can be used by pageable/dref memory objects
SMF71L4A  Average total number of 1 MB frames that can be used by pageable/dref memory objects
SMF71L5M  Minimum number of 1 MB frames that are not used by pageable/dref memory objects
SMF71L5X  Maximum number of 1 MB frames that are not used by pageable/dref memory objects
SMF71L5A  Average number of 1 MB frames that are not used by pageable/dref memory objects
SMF71L6M  Minimum number of 1 MB frames that are used by pageable/dref memory objects
SMF71L6X  Maximum number of 1 MB frames that are used by pageable/dref memory objects
SMF71L6A  Average number of 1 MB frames that are used by pageable/dref memory objects

These metrics can be analysed by using the RMF Monitor I Paging Activity report; the relevant report section is shown in this example [7].

    MEMORY OBJECTS     COMMON     SHARED       1 MB
    ------------------ --------- --------- ---------
    MIN                       37        16        57
    MAX                       37        16        57
    AVG                       37        16        57

    1 MB FRAMES        ------------ FIXED ------------  ----------- PAGEABLE ----------
    ------------------     TOTAL  AVAILABLE     IN-USE      TOTAL  AVAILABLE     IN-USE
    MIN                    2,048          0      2,048      3,972          1      3,968
    MAX                    2,048          0      2,048      3,972          4      3,971
    AVG                    2,048          0      2,048      3,972          4      3,968

Figure 7

The first section reports the number of 1MB memory objects, but it only counts the objects requesting fixed frames. No information is provided about the number of objects using pageable frames.

In the second section you can see that on average 2048 fixed (the size of the LFAREA) plus 3968 pageable 1MB pages are used.

5.2 RMF Monitor III

But who's using large memory pages? The answer to this question can be obtained through RMF Monitor III.

From the main menu you have to choose 3 RESOURCES and then 7A STORM to get the Storage Memory Objects online report. An example is provided in Figure 8.

[7] The LFAREA size is reported in the report header.

                     RMF V1R13   Storage Memory Objects          Line 1 of 353
    Command ===>                                                  Scroll ===> CSR
    Samples: 100    System: SYS1  Date: 05/02/13  Time: 09.20.00  Range: 100   Sec

    ------------------------------- System Summary --------------------------------
    ---MemObj---  ---Frames---  -1MB MemObj-  --1MB Fixed--  -1MB Pageable-
    Shared    16  Shared  303K  Total     57  Total    2048  Initial   3972
    Common    37  Common 13993  Common     2  Common      9  Dynamic   1983
                  %Used   31.2                %Used     100  %Used     99.9
    -------------------------------------------------------------------------------
                 Service         ---- Memory Objects ---  -1MB Frames-   ----- Bytes -----
    Jobname   C  Class     ASID  Total  Comm  Shr  1 MB  Fixed  Pgable  Total  Comm    Shr
    MBN1BRK   S  BWMBEXGR  0363   1625     0    0     0      0       0  29.3G     0      0
    MBV1BRK   S  BWMBEXGR  0158   1508     0    0     0      0       0  29.1G     0      0
    MBB1BRK   S  BWMBEXGR  0343   1482     0    0     0      0       0  27.8G     0      0
    DBB1DBM1  S  BIMDBHI   0170    924     0    2     3      3       0  1186G     0   160G
    WASBV21A  S  BWASSRV   0390    662     0    1     0      0     265  28.7G     0  75.0M
    WASBVDM   S  SYSSTC    0191    586     0    1     0      0     133  25.0G     0  50.0M
    WASBN21A  S  BWASSRV   0345    565     0    1     0      0       0  24.4G     0  75.0M
    WASBVN1   S  SYSSTC    0349    453     0    1     0      0     112  19.3G     0  50.0M
    WASBVDMS  S  BWASSRV   0291    448     0    1     0      0     256  19.3G     0  50.0M
    WASBV21   S  SYSSTC    0384    425     0    1     0      0     244  19.6G     0  50.0M

Figure 8

The system summary provides roughly the same information as the RMF Monitor I Paging Activity report; the most interesting part is the one related to the address spaces.

In this example you can see that the WebSphere address spaces (jobnames starting with WAS) are using a lot of 1MB pageable frames (Pgable column). Note that the 1 MB column under the Memory Objects section refers only to objects requesting fixed frames, so it's always zero when pageable frames are used.

You can scroll down the screen to see all the address spaces; unfortunately the report is not sortable.
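Since the online report cannot be sorted, one possible workaround is to copy the rows out of the screen and rank them offline. The Python sketch below is only an illustration: the function names are hypothetical, the rows are assumed to have been extracted by hand into tuples, and only the ten address spaces visible in Figure 8 (out of 353 report lines) are included.

```python
from collections import defaultdict

# (jobname, service class, pageable 1MB frames) copied from the Figure 8 rows.
ROWS = [
    ("MBN1BRK", "BWMBEXGR", 0),
    ("MBV1BRK", "BWMBEXGR", 0),
    ("MBB1BRK", "BWMBEXGR", 0),
    ("DBB1DBM1", "BIMDBHI", 0),
    ("WASBV21A", "BWASSRV", 265),
    ("WASBVDM", "SYSSTC", 133),
    ("WASBN21A", "BWASSRV", 0),
    ("WASBVN1", "SYSSTC", 112),
    ("WASBVDMS", "BWASSRV", 256),
    ("WASBV21", "SYSSTC", 244),
]


def top_pageable_users(rows, limit=5):
    """Address spaces using pageable 1MB frames, largest user first."""
    users = [row for row in rows if row[2] > 0]
    return sorted(users, key=lambda row: row[2], reverse=True)[:limit]


def pageable_by_prefix(rows, prefix_len=3):
    """Total pageable 1MB frames per jobname prefix (e.g. WAS)."""
    totals = defaultdict(int)
    for jobname, _service_class, pgable in rows:
        totals[jobname[:prefix_len]] += pgable
    return dict(totals)


top = top_pageable_users(ROWS)
by_prefix = pageable_by_prefix(ROWS)

for jobname, service_class, pgable in top:
    print(f"{jobname:<8} {service_class:<8} {pgable:>6}")
print("WAS total:", by_prefix["WAS"])
```

Grouping by the first three characters of the jobname is just a convenient way to isolate the WebSphere address spaces in this example; a different naming convention would need a different grouping key.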

6 Conclusions

The performance of memory-intensive applications can often be improved by increasing the page size used. Large memory pages can also provide CPU savings by exploiting the TLB architecture of the more recent IBM machines, which is designed to reduce the overhead of virtual-to-real address translation.

To allow the use of large memory pages you have to assign them a portion of real memory by setting the LFAREA parameter.

Current metrics and tools provide information to analyse 1MB page usage, but they are not complete, and nothing is provided about 2GB pages.