System z13: First Experiences and Capacity Planning Considerations
1 System z13: First Experiences and Capacity Planning Considerations
Robert Vaupel, IBM R&D, Germany
Many thanks to Martin Recktenwald, Matthias Bangert and Alain Maneville for their contributions to this presentation
Wednesday 04/11/2015, Session BK
2 Content
- z13: Installations
- Planning for z13
  - Preparation for the migration, upgrade
  - CPU Measurement Facility and SMF 113: For what is it good and what can be done with it?
  - Processor Topology
- What else to consider?
  - Service Levels
  - Store Into Instruction Stream code (assembler)
- Appendix
  - Resources
  - Real value of z13 comes with
3 z13: On one page
- Newest System z processor with
  - Much more memory than previous systems: up to 10 TB
  - A new processor and mainboard design: SCM versus MCM
  - More cache
  - Improved Out-of-Order processing
  - Completely new features: SIMD & SMT
- The real value of z13 comes with exploiting its features
4 z13: Installations
- More than 400 so far
- Most installations, migrations and upgrades ran smoothly without any problems
- What we know
  - Most installations are within -1% to +8% of the zPCR expectation
  - The expectation is ~10% better ITR than zEC12
  - This is in fact not much compared to previous upgrades
  - Therefore it is worth planning the upgrade carefully
- So far ~3% of the installations experienced some problems with the migration
  - This is less than with previous significant processor design changes (for example z10)
  - But it is enough to talk about some planning considerations
5 z13 Planning: To start with
- A successful migration starts with thorough planning
- Start with zPCR
  - Based on IBM-provided LSPR data
  - This gives you the right size for your CEC
6 z13 Planning: What else can be done?
- Plan your z13 LPARs carefully
- Why is this important?
  - We will see that the LPAR layout matters
  - That means LPAR weight, number of logical processors, and the resulting numbers of vertical high, medium and low processors (VH, VM, VL)
- A good starting point is the LPAR Design tool, available from the WLM homepage
- What does it do?
  - It allows you to lay out your LPARs
  - It lets you examine the VH, VM and VL processors
  - It provides guidelines on how to set up the LPARs efficiently
7 Preparation for z13: CPU MF
- Use the CPU Measurement Facility (Hardware Instrumentation Sampling) to obtain insight into the processor and cache architecture
- Value of the CPU Measurement Facility (CPU MF)
  - Recommended methodology for successful z Systems processor capacity planning
    - Needed on the "before" processor to determine the LSPR workload
  - Validate achieved z Systems processor performance
    - Needed on the "before" and "after" processors
  - Provide insights into workload patterns, behavior, new features and functions
    - Continuously running on all LPARs
- Capturing CPU MF data is an industry best practice
8 Preparation for z13: CPU MF
- Introduced with z10 and available on later processors
- Facility that provides hardware instrumentation data for production systems
- Two major components
  - Counters for capacity planning
    - Cache and memory hierarchy information
    - Supported SCPs include z/OS and z/VM
  - Sampling for detailed, module-level analysis
- z/OS: HIS started task
  - Gathered on an LPAR basis
  - Writes SMF 113 records
- z/VM: Monitor Records
  - Gathered on an LPAR basis; all guests are aggregated
  - Writes new Domain 5 (Processor) Record 13 (CPU MF Counters) records
- Minimal overhead
9 Preparation for z13: Why is CPU MF important?
- z13 provides lower single-thread improvements than previous processor changes, e.g. zEC12 versus z196
- z13 provides more variability in capacity improvement
- Capacity projections and expectations must be as accurate as possible
- Annotation: RNI (Relative Nest Intensity) is a metric which describes the access to the various cache levels of the processor architecture
10 Preparation for z13: Enabling CPU MF
- Steps
  - Configure the System z server to collect CPU MF data (Image Profile on HMC/SE)
  - Set up the HIS procedure in SYS1.PROCLIB (if not already available)
  - Specify where to store the HIS output file (this setup step is also required for COUNTER data)
  - Set up the SMFPRMxx member to collect SMF 113 records
  - Start HIS, start measurements, synchronize with SMF (and RMF)
- Enable CPU MF in the LPAR security settings
  - Overhead for counters/extended counters is negligible
- Set up the HIS address space and start it
  - Recommended: F HIS,B,TT='Text',PATH='/his/',CTRONLY,CTR=(B,E),SI=SYNC
  - Or: F HIS,B,TT='Text',PATH='/his/',CTRONLY,CTR=ALL,SI=SYNC
- Set up SMF 113 recording in SMFPRMxx
  - SMF 113 has 2 subtypes: 1 (delta counter) and 2 (total counter)
  - In our example we always use subtype 2
  - SMF 113 records took 1.2% of the space of the SMF 70 and SMF 72 records
- Recommendation: set it up once and run it continuously
- Further reading
  - CPU MF Webinar Replays and Presentations
  - z/OS CPU MF Detailed Instructions Step by Step Guide
  - z/VM Using CPU Measurement Facility Host Counters
11 What can be done with SMF 113 data?
- First, it allows you to evaluate your LPARs before and after the migration
  - Collect data from comparable time periods before and after the migration
  - Compare time periods when the same workload executes
  - For example: prime shift from 08:00 to 12:00 or 08:00 to 16:00
- Metrics
  - CPI = cycles per instruction
  - MIPS = million instructions per second = clock frequency in MHz / CPI (that is, GHz x 1000 / CPI)
  - Annotation: the MIPS value is the average MIPS value per processor; LSPR tables show the MIPS value for the CEC (all regular processors)
- Example from a real installation (zEC12 to z13)
  - Data was collected before and after the migration, for comparable weeks, comparable time frames and comparable load
  - Notice: there are always fluctuations. The difference is never one number
  - In this case the improvement was between 4.4% and 13.1% in favor of the z13, which is within expectations (on average 9% in this case versus an expectation of ~10%)
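The two metrics on this slide can be sketched in a few lines of Python. This is only an illustration: the cycle and instruction counts are invented and would in practice come from the SMF 113 counters, and the function names are hypothetical. The clock frequencies used (5.5 GHz for zEC12, 5.0 GHz for z13) are the published values for those machines.

```python
"""CPI and per-processor MIPS from cycle and instruction counts (illustrative sketch)."""


def cpi(cycles: int, instructions: int) -> float:
    """Cycles per instruction for one measurement interval."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return cycles / instructions


def mips_per_processor(clock_ghz: float, cpi_value: float) -> float:
    """Average MIPS of one processor: clock frequency in MHz divided by CPI."""
    if cpi_value <= 0:
        raise ValueError("CPI must be positive")
    return clock_ghz * 1000.0 / cpi_value


def improvement_percent(before: float, after: float) -> float:
    """Relative change of 'after' versus 'before' in percent."""
    return (after / before - 1.0) * 100.0


# Invented counter values for a comparable interval before and after a migration.
before_mips = mips_per_processor(5.5, cpi(cycles=5_500_000, instructions=1_000_000))
after_mips = mips_per_processor(5.0, cpi(cycles=4_600_000, instructions=1_000_000))
change = improvement_percent(before_mips, after_mips)

print(f"before: {before_mips:.1f} MIPS, after: {after_mips:.1f} MIPS, change: {change:+.1f}%")
```

Because the z13 runs at a lower clock frequency than the zEC12, the CPI has to drop by more than the clock difference before the MIPS value per processor improves.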
12 What can be done with SMF 113 data?
- Second, it is possible to analyze in detail how processors access data and cache
  - For example: when a Level 1 miss occurs, where do the data or instructions come from? (Why is this important? See the next page.)
- CPI: Estimated Instruction Complexity
  - Tells whether the instruction mix is uniform and how much faster (or slower) the processor is compared to other models
- CPI: From Finite Cache/Memory
  - Gives the average number of cycles to acquire data/instructions from the cache hierarchy and memory
- In the example:
  - The processor of the z13 is on average 18% faster
  - Access to data and instructions from cache/memory requires about the same number of cycles
  - On average the z13 is 9% faster for the three compared days
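Why an improvement in one CPI component shows up only partly in the total can be sketched as follows. The sketch assumes that total CPI is the sum of the instruction complexity part and the finite cache/memory part. All numbers are invented, the class and function names are hypothetical, and clock frequency differences between models are ignored, so this is not the data behind the slide.

```python
"""How a change in one CPI component dilutes in the total CPI (illustrative sketch)."""
from dataclasses import dataclass


@dataclass(frozen=True)
class CpiBreakdown:
    instruction_complexity: float  # cycles per instruction spent in the core
    finite_cache: float            # cycles per instruction spent sourcing data/instructions

    @property
    def total(self) -> float:
        # Assumption: the two components simply add up to the total CPI.
        return self.instruction_complexity + self.finite_cache


def change_percent(before: float, after: float) -> float:
    """Relative change in percent; negative means fewer cycles."""
    return (after / before - 1.0) * 100.0


# Invented values: the core part gets 18% cheaper, the cache part stays the same.
old = CpiBreakdown(instruction_complexity=2.0, finite_cache=2.0)
new = CpiBreakdown(instruction_complexity=1.64, finite_cache=2.0)

core_change = change_percent(old.instruction_complexity, new.instruction_complexity)
total_change = change_percent(old.total, new.total)

print(f"core CPI change: {core_change:+.1f}%, total CPI change: {total_change:+.1f}%")
```

With half of the cycles spent on cache and memory access, an 18% gain in the core translates into only about 9% fewer cycles overall, which is why the finite cache/memory component deserves attention.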
13 SMF 113 data: Assess what happens if something goes wrong
- Extreme example: effect if data needs to be fetched from a different drawer
- (Chart not included in this transcript; the only surviving labels are 18% and 13%)
14 z13: Some discussion points
- SMF 113 data allows you to assess the efficiency of the LPARs before and after the migration
  - Make sure the time periods can really be compared to each other
- SMF 113 data allows you to identify problems
- The question is what to look at when setting up LPARs on z13
- Avoid splitting LPARs between drawers
  - This can be avoided in >99% of all cases for VH processors
  - Important: make sure the memory is distributed between the drawers so that no unnecessary split occurs
- Do not define an excessive number of vertical low processors
  - VL processors reduce the efficiency of VM processors by using some of their share
  - VL (and VM) processors may look for a PCP on a different chip, node and sometimes even drawer
    - Therefore they often need to access data/instructions from remote cache structures
    - They cause VH processors to access data/instructions from remote cache structures
    - Thus they reduce the efficiency of all logical processors
- Recommendations
  - Define only the number of logical processors that are really needed to handle peaks
  - Do not define an excessive number of VL processors
  - Define the weight so that, for at least 80% of the time during the period when important workload executes, everything can be handled by VH processors
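The 80% recommendation can be checked against interval data with a small sketch like the one below. It assumes you already have the CPU demand of the LPAR per measurement interval, expressed in processors (for example derived from RMF data); the sample values, the function names and the default threshold parameter are illustrative only.

```python
"""Check how often the VH processors alone can cover the LPAR's demand (illustrative sketch)."""
from typing import Sequence


def vh_coverage(demand_in_processors: Sequence[float], vh_count: int) -> float:
    """Fraction of intervals in which demand does not exceed the number of VH processors."""
    if not demand_in_processors:
        raise ValueError("at least one interval is required")
    covered = sum(1 for demand in demand_in_processors if demand <= vh_count)
    return covered / len(demand_in_processors)


def meets_recommendation(demand_in_processors: Sequence[float], vh_count: int,
                         threshold: float = 0.80) -> bool:
    """True if the VH processors cover at least `threshold` of the intervals."""
    return vh_coverage(demand_in_processors, vh_count) >= threshold


# Invented prime-shift demand samples, in processors, one value per interval.
prime_shift_demand = [2.1, 2.8, 3.0, 3.4, 2.5, 2.9, 4.2, 2.2, 2.7, 3.1]

for vh in (3, 4):
    share = vh_coverage(prime_shift_demand, vh)
    print(f"{vh} VH processors cover {share:.0%} of the intervals:",
          "ok" if meets_recommendation(prime_shift_demand, vh) else "weight too low")
```

If the coverage is below the threshold, raising the LPAR weight (and with it the number of VH processors) is the lever the slide points to, rather than adding more VL processors.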
15 Annotation: Processing SMF 113 data
- If you need a quick start to process SMF 113 data, look at the WLM home page
- The tool
  - Provides a set of REXX programs which process SMF 113 subtype 2 data from all processor types: z10, z196, zEC12, z13
  - The output is a CSV file including the most common metrics which can be calculated from the SMF 113 counters
  - There is also a spreadsheet to display the most basic statistics for cache access and cycles per instruction
- Remark: the spreadsheet expects US number notation with a dot as decimal point
16 z13 Analysis: Logical Processor Topology
- If you want to know how your logical processors are placed on chips, nodes and drawers
  - Collect SMF 99 subtype 14 records from, if possible, all z/OS partitions of your CECs
    - The amount of data is minimal, even much smaller than for SMF 113 data
  - Combine all SMF 99 subtype 14 records of one CEC into one SMF dataset
  - Now process the data with another tool available on the WLM home page (see Resources)
- Topology Reporting Tool
  - Creates a CSV file containing the topology information of all logical processors
  - Processes data for z10, z196, zEC12 and z13
  - Recommendations for running the program
    - Use the ALLINTV option
    - Combine SMF data from multiple LPARs of the same CEC into one SMF dataset
    - For z/OS 1.13: apply APAR OA47418
  - Provides a spreadsheet to display the topologies
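Once the topology information is available, a check for LPARs whose VH processors are spread over more than one drawer could look like the sketch below. The record layout (`LogicalCpu` and its fields) is hypothetical and is not the column layout of the Topology Reporting Tool's CSV file; the sample data is invented.

```python
"""Find LPARs whose logical processors of one polarity span several drawers (illustrative sketch)."""
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class LogicalCpu(NamedTuple):
    # Hypothetical record layout, not the tool's actual CSV columns.
    lpar: str
    cpu_id: int
    polarity: str  # "VH", "VM" or "VL"
    drawer: int
    node: int
    chip: int


def lpars_split_across_drawers(cpus: Iterable[LogicalCpu],
                               polarity: str = "VH") -> Dict[str, List[int]]:
    """Return {lpar: [drawers]} for LPARs whose processors of `polarity` use more than one drawer."""
    drawers_by_lpar = defaultdict(set)
    for cpu in cpus:
        if cpu.polarity == polarity:
            drawers_by_lpar[cpu.lpar].add(cpu.drawer)
    return {lpar: sorted(drawers)
            for lpar, drawers in drawers_by_lpar.items() if len(drawers) > 1}


# Invented topology: PROD2 has its two VH processors on different drawers.
topology = [
    LogicalCpu("PROD1", 0, "VH", drawer=1, node=1, chip=1),
    LogicalCpu("PROD1", 1, "VH", drawer=1, node=1, chip=2),
    LogicalCpu("PROD1", 2, "VM", drawer=2, node=1, chip=1),
    LogicalCpu("PROD2", 0, "VH", drawer=1, node=2, chip=1),
    LogicalCpu("PROD2", 1, "VH", drawer=2, node=1, chip=3),
]

print(lpars_split_across_drawers(topology))
```

The same function can be called with `polarity="VM"` or `polarity="VL"` to see how far the remaining logical processors are from the VH processors of the partition.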
17 z13: Taking a Breath
- What did we learn so far?
  - z13 provides ~10% ITR improvement compared to zEC12
  - z13 is a little more sensitive to the LPAR definition
- Is there anything else which needs to be considered?
18 What else to consider? What about the program code?
- Recommendations
  - Use the newest compiler versions whenever possible because they are optimized for the latest architecture
  - Use the latest service levels, especially after the introduction of a new architecture
    - Operating system
    - Firmware
- Last but not least: new processor designs sometimes have implications for what was once a good thing to do
19 Service Levels
- Bundle 14
  - Changes how PR/SM assigns VH and VM processors for small partitions
  - Before: LPARs with a weight equivalent to a PCP share within {1.5 to 2.0} were assigned 1 VH and 1 VM
  - Now: these LPARs are assigned 2 VM processors
  - Why? Because PR/SM optimizes the placement of VH processors but not necessarily the placement of VM processors
    - Therefore it is possible that the VH and VM processors of the same partition are located on different chips (or nodes), which is not ideal for small partitions
    - In contrast, 2 VM processors are most often placed close together on the same or adjacent chips
- OA47968 (WLM HiperDispatch enhancements)
  - One important change is the park/unpark sequence of VL processors
  - It is now ensured that VLs with logically high processor numbers are parked first, because those are very often not located close to VH or VM processors
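The Bundle 14 change for small partitions can be written down as a tiny sketch. It models only the one case described on this slide (a PCP share between 1.5 and 2.0) and deliberately refuses everything else; whether the boundaries are inclusive is an assumption, and the function names are hypothetical. The share formula (weight divided by the sum of all weights, times the number of shared physical processors) is the usual way to turn an LPAR weight into a processor share.

```python
"""VH/VM assignment for small partitions before and after Bundle 14 (illustrative sketch)."""
from typing import Dict


def pcp_share(weight: float, total_weight: float, physical_processors: int) -> float:
    """Processor share of an LPAR, expressed in physical processors."""
    if total_weight <= 0:
        raise ValueError("total_weight must be positive")
    return weight / total_weight * physical_processors


def small_partition_assignment(share: float, bundle14_applied: bool) -> Dict[str, int]:
    """Assignment for a PCP share in the 1.5 to 2.0 range, as described on the slide."""
    # Assumption: both boundaries are treated as inclusive here.
    if not 1.5 <= share <= 2.0:
        raise ValueError("only the 1.5 to 2.0 PCP share case is modelled")
    if bundle14_applied:
        return {"VH": 0, "VM": 2}
    return {"VH": 1, "VM": 1}


# Invented example: weight 175 out of 1000 on a CEC with 10 shared processors.
share = pcp_share(weight=175, total_weight=1000, physical_processors=10)
print(f"share = {share:.2f} processors")
print("before Bundle 14:", small_partition_assignment(share, bundle14_applied=False))
print("with Bundle 14:  ", small_partition_assignment(share, bundle14_applied=True))
```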
20 Store Into Instruction Stream (SIIS)
- Look at the WSC Flash (Processor Design Considerations)
  - The situation is not new (the flash was released for z10) but still exists
  - It has to do with the Out-of-Order processing of modern processors
- Out-of-Order (OOO) processing
  - OOO yields significant performance benefits for compute-intensive applications through
    - Re-ordering instruction execution: later (younger) instructions can execute ahead of an older stalled instruction
    - Re-ordering storage accesses and parallel storage accesses
  - OOO maintains good performance growth for traditional applications
21 Background: SIIS
- Split instruction/data cache design requires special SIIS handling
  - Ifetching runs (far!) ahead
  - Storing runs somewhat ahead (out of order)
  - To really execute the store, the istream must have stopped using the cache line
- Instruction sequence shown on the slide:
            LR   15,4
            MR   14,5
            LR   2,15
            LR   15,4
            MR   14,5
            ST   3,208(0,13)
            LR   3,15
            A    2,2604(0,7)
            ST   2,976(0,13)
            A    3,2604(0,7)
            ST   3,LABEL
    LABEL:  L    2,92(0,9)
            L    15,44(0,2)
            L    3,64(0,12)
            LA   1,237(0,3)
            BASR 14,15
            LR   1,4
            MR   0,5
            LR   4,1
            L    5,208(0,13)
            LR   1,5
            MR   0,6
            LR   14,1
            A    4,2604(0,7)
            ST   4,976(0,13)
            L    4,304(0,9)
            A    14,3944(0,4)
            ST   14,984(0,13)
            L    15,44(0,2)
            ...
- Markers on the slide (their positions in the listing are not preserved in this transcript)
  - Fully done ("completed") up to here
  - Store into instruction stream
  - Out-of-order (started execution) up to here
  - Ifetch is already far, far away
  - This can be many instructions
  - This can be a lot more
22 Background: SIIS
- Instruction sequence shown on the slide:
            LR   15,4
            MR   14,5
            LR   2,15
            LR   15,4
            MR   14,5
            ST   3,208(0,13)
            LR   3,15
            A    2,2604(0,7)
            ST   2,976(0,13)
            A    3,2604(0,7)
            ST   3,LABEL
    LABEL:  L    2,92(0,9)
            L    15,44(0,2)
            L    3,64(0,12)
            LA   1,237(0,3)
            BASR 14,15
            LR   1,4
            MR   0,5
            LR   4,1
            L    5,208(0,13)
            LR   1,5
            MR   0,6
            LR   14,1
            A    4,2604(0,7)
            ST   4,976(0,13)
            L    4,304(0,9)
            A    14,3944(0,4)
            ST   14,984(0,13)
            L    15,44(0,2)
            ...
- Caches keep track of cache lines, so cache line accesses are used to detect SIIS cases
  - Cache line size: 256 bytes (for a long time already)
- IFetch just keeps on fetching
  - Even though only the blue instructions are done yet, IFetch will be way beyond that point, having already fetched the SIIS, the SIIS victim, and a lot of what follows
- Out-of-order execution runs ahead quite a bit
  - Even though only the blue instructions are done yet, the SIIS into LABEL will already fetch the cache line into the data cache
- Without special handling, at this point:
  - The IStream will lose the cache line, potentially even the first green instruction
  - This means the SIIS will also be lost, even though it just got the cache line (it was not the next instruction to be done yet)
  - IFetch will immediately ask for the first green instruction again, moving the cache line back from the data cache into the instruction cache
  - Repeat
- A special core-internal interlock between instruction and data cache prevents this loop
- But there is no path to send the store data directly into the IStream
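Because detection works on cache lines, a store does not have to hit an instruction exactly to count as SIIS; it only has to land in a 256-byte line the instruction stream is using. The sketch below illustrates that granularity. The addresses are invented and the function names are hypothetical; it is a model of the principle, not of the hardware logic.

```python
"""Cache-line granularity of SIIS detection (illustrative sketch)."""
from typing import Iterable

CACHE_LINE_SIZE = 256  # bytes, as stated on the slide


def cache_line(address: int) -> int:
    """Index of the 256-byte cache line that contains `address`."""
    return address // CACHE_LINE_SIZE


def is_siis_candidate(store_target: int, fetched_addresses: Iterable[int]) -> bool:
    """True if the store target lies in a cache line the instruction stream has fetched."""
    fetched_lines = {cache_line(addr) for addr in fetched_addresses}
    return cache_line(store_target) in fetched_lines


# Invented example: 4-byte instructions fetched from 0x1000 up to 0x1040.
fetched = list(range(0x1000, 0x1044, 4))

print(is_siis_candidate(0x1030, fetched))  # store directly onto a fetched instruction
print(is_siis_candidate(0x10F8, fetched))  # data field behind the code, same cache line
print(is_siis_candidate(0x2000, fetched))  # store into a different cache line
```

The second case is the one that tends to surprise: a work area or save field placed right next to the code shares its cache line with instructions and therefore behaves like a store into the instruction stream.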
23 Background: SIIS
- What changed from zEC12 to z13: zEC12 core cache hierarchy
- Caches shown in the diagram
  - Within the core boundary: L1-I$ (64K), L1-D$ (96K), L2-D$ (1M), L2-I$ (1M)
  - L3$ (48M)
  - Diagram legend: not shared / shared
- Fast, private L2-D$ for the data cache
- Slow, shared L2-I$
  - Usually bypassed for L2-D$ fetches
- Cache lines that have to travel between instruction and data side can do so through the L2-I$
24 Background: SIIS
- What changed from zEC12 to z13: z13 core cache hierarchy
- Caches shown in the diagram
  - Within the core boundary: L1-I$ (96K), L2-I$ (2M), L1-D$ (128K), L2-D$ (2M)
  - L3$ (48M)
  - Diagram legend: not shared / shared
- Fast, private L2-D$ for the data cache and fast, private L2-I$ for the instruction cache
  - This improves instruction cache miss latency
  - It also improves L2 cache miss latency
- The first shared cache level is the L3 cache
- Cache lines that have to travel between instruction and data side can do so through the L3$
25 Store Into Instruction Stream
- On z13, SIIS is sometimes more visible than on z196 and zEC12 because the bigger L2 I- and D-caches are not interconnected anymore
  - So what is in general an advantage can become a disadvantage in this particular case
- Did this happen on z13? The answer is yes
- Where does it typically happen?
  - In assembler routines which have existed for a long time
  - Typically batch programs use such routines
- What to do?
  - If you recognize that specific batch programs run significantly longer
    - Use tools to analyze the instruction pattern
    - Analyze the hot spots in the program code
    - For SIIS coding patterns: remove such occurrences
- Notice: the situation is not new, and modern processor designs require technologies which may interfere with historical coding practices
26 z13: Summary
- z13 provides new design points and sets the base to grow capacity and functionality for z Systems processors
- Because it is new, some care in migration and planning is advised
28 Resources
- Processor Design Considerations (WSC Flash)
- LPAR Design Tool
- System z Topology Report
- SMF 113 Reporting Tool
- CPU MF Webinar Replays and Presentations
- z/OS CPU MF Detailed Instructions Step by Step Guide
- z/VM Using CPU Measurement Facility Host Counters
- CPU MF Extended Counter Description
- The Load-Program-Parameter and the CPU-Measurement Facilities
29 Annotation
- The real value of z13 comes with exploiting its features
- Examples: DB2 V11, zEDC
30 (z13 and) DB2 V11
31 z13 and DB2 V11
- DB2 workloads show improvements in the range of 4% to 38%
- Mostly better than the expected 10% improvement
- 2.4x reduction in compression cost utility
33 zEDC test results: Three European clients
- Cost comparison: zEDC and tailored compression
  - Tailored compression: 2.52 CPU sec per GB
  - zEDC: 0.15 CPU sec per GB
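The two measured values can be put into proportion with a short calculation. Only the two CPU cost figures come from the slide; the data volume of 500 GB is an invented example and the function names are hypothetical.

```python
"""CPU cost of tailored compression versus zEDC (arithmetic on the slide's two values)."""

TAILORED_CPU_SEC_PER_GB = 2.52  # from the slide
ZEDC_CPU_SEC_PER_GB = 0.15      # from the slide


def cost_ratio(baseline: float, alternative: float) -> float:
    """How many times more CPU the baseline needs than the alternative."""
    return baseline / alternative


def saving_percent(baseline: float, alternative: float) -> float:
    """CPU saved by the alternative, as a percentage of the baseline."""
    return (baseline - alternative) / baseline * 100.0


ratio = cost_ratio(TAILORED_CPU_SEC_PER_GB, ZEDC_CPU_SEC_PER_GB)
saving = saving_percent(TAILORED_CPU_SEC_PER_GB, ZEDC_CPU_SEC_PER_GB)

volume_gb = 500  # invented daily volume, for illustration only
print(f"tailored compression needs {ratio:.1f}x the CPU of zEDC ({saving:.1f}% saving)")
print(f"for {volume_gb} GB: {TAILORED_CPU_SEC_PER_GB * volume_gb:.0f} vs "
      f"{ZEDC_CPU_SEC_PER_GB * volume_gb:.0f} CPU seconds")
```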
34 zEDC benefits: European clients
More informationAccelerating Applications. the art of maximum performance computing James Spooner Maxeler VP of Acceleration
Accelerating Applications the art of maximum performance computing James Spooner Maxeler VP of Acceleration Introduction The Process The Tools Case Studies Summary What do we mean by acceleration? How
More informationCPU MF - the Lucky SMF 113s - z196 Update and WSC Experiences
(ATS) North America MF - the Lucky SMF 113s - z196 Update and WSC Experiences SHARE - Session 8882 March 3, 2011 John Burg jpburg@us.ibm.com IBM Washington Systems Center 1 2 Advanced Technical Skills
More informationFit for Purpose Platform Positioning and Performance Architecture
Fit for Purpose Platform Positioning and Performance Architecture Joe Temple IBM Monday, February 4, 11AM-12PM Session Number 12927 Insert Custom Session QR if Desired. Fit for Purpose Categorized Workload
More informationBest Practices. Deploying Optim Performance Manager in large scale environments. IBM Optim Performance Manager Extended Edition V4.1.0.
IBM Optim Performance Manager Extended Edition V4.1.0.1 Best Practices Deploying Optim Performance Manager in large scale environments Ute Baumbach (bmb@de.ibm.com) Optim Performance Manager Development
More informationWSC Short Stories and Tall Tales. Session IBM Advanced Technical Support. March 5, John Burg. IBM Washington Systems Center
IBM Advanced Technical Support WSC Short Stories and Tall Tales Session 2536 March 5, 2009 John Burg IBM Washington Systems Center 1 2 Advanced Technical Support Washington Systems Center Trademarks The
More informationParallel Computing Prof. Subodh Kumar Department of Computer Science & Engineering Indian Institute of Technology Delhi
Parallel Computing Prof. Subodh Kumar Department of Computer Science & Engineering Indian Institute of Technology Delhi Module No # 01 Lecture No # 02 Parallel Programming Paradigms (Refer Slide Time:
More informationIntroduction to HiperDispatch Management Mode with z10 1
Introduction to HiperDispatch Management Mode with z10 1 Donald R. Deese Computer Management Sciences, Inc. Hartfield, Virginia 23071-3113 HiperDispatch was introduced with IBM s z10 server, and is available
More informationMultithreading: Exploiting Thread-Level Parallelism within a Processor
Multithreading: Exploiting Thread-Level Parallelism within a Processor Instruction-Level Parallelism (ILP): What we ve seen so far Wrap-up on multiple issue machines Beyond ILP Multithreading Advanced
More informationGetting Ready for VM Capacity Planning Studies AMERICAS TECHNICAL SALES SUPPORT
Getting Ready for VM Capacity Planning Studies AMERICAS TECHNICAL SALES SUPPORT Page 1 of 18 INTRODUCTION This document introduces the IBM Americas Techline Capacity Planning Process for VM systems. The
More informationSTEPS Towards Cache-Resident Transaction Processing
STEPS Towards Cache-Resident Transaction Processing Stavros Harizopoulos joint work with Anastassia Ailamaki VLDB 2004 Carnegie ellon CPI OLTP workloads on modern CPUs 6 4 2 L2-I stalls L2-D stalls L1-I
More information(Refer Slide Time: 1:26)
Information Security-3 Prof. V Kamakoti Department of Computer science and Engineering Indian Institute of Technology Madras Basics of Unix and Network Administration Operating Systems Introduction Mod01,
More informationCPU and ziip usage of the DB2 system address spaces Part 2
CPU and ziip usage of the DB2 system address spaces Part 2 Fabio Massimo Ottaviani EPV Technologies February 2016 4 Performance impact of ziip over utilization Configurations where the number of ziips
More informationMemory Management. Reading: Silberschatz chapter 9 Reading: Stallings. chapter 7 EEL 358
Memory Management Reading: Silberschatz chapter 9 Reading: Stallings chapter 7 1 Outline Background Issues in Memory Management Logical Vs Physical address, MMU Dynamic Loading Memory Partitioning Placement
More informationChapter Seven. Memories: Review. Exploiting Memory Hierarchy CACHE MEMORY AND VIRTUAL MEMORY
Chapter Seven CACHE MEMORY AND VIRTUAL MEMORY 1 Memories: Review SRAM: value is stored on a pair of inverting gates very fast but takes up more space than DRAM (4 to 6 transistors) DRAM: value is stored
More informationScaling Without Sharding. Baron Schwartz Percona Inc Surge 2010
Scaling Without Sharding Baron Schwartz Percona Inc Surge 2010 Web Scale!!!! http://www.xtranormal.com/watch/6995033/ A Sharding Thought Experiment 64 shards per proxy [1] 1 TB of data storage per node
More informationProcessor Architecture and Interconnect
Processor Architecture and Interconnect What is Parallelism? Parallel processing is a term used to denote simultaneous computation in CPU for the purpose of measuring its computation speeds. Parallel Processing
More informationWhite Paper. 1 Introduction. Managing z/os costs with capping: what s new with zec12 GA2 and z/os 2.1? Fabio Massimo Ottaviani - EPV Technologies
White Paper Managing z/os costs with capping: what s new with zec12 GA2 and z/os 2.1? Fabio Massimo Ottaviani - EPV Technologies 1 Introduction In the current volatile economic environment, companies want
More informationMainframe Optimization System z the Center of Enterprise Computing
Mainframe Optimization System z the Center of Enterprise Computing Presented By: Mark Neft Accenture Application Modernization & Optimization Strategy Lead August 7,2012 Session : 11813 Presentation Abstract:
More informationCMSC Computer Architecture Lecture 12: Multi-Core. Prof. Yanjing Li University of Chicago
CMSC 22200 Computer Architecture Lecture 12: Multi-Core Prof. Yanjing Li University of Chicago Administrative Stuff! Lab 4 " Due: 11:49pm, Saturday " Two late days with penalty! Exam I " Grades out on
More informationzpcr Capacity Sizing Lab
(ATS) North America zpcr Capacity Sizing Lab SHARE - Sessions 11599 / 11497 August 7, 2012 John Burg Materials created by John Fitch and Jim Shaw IBM 1 2 Advanced Technical Skills Trademarks The following
More informationSoftware Migration Capacity Planning Aid IBM Z
zsoftcap User s Guide Software Migration Capacity Planning Aid for IBM Z IBM Corporation 2011, 2018 Version 5.4a v54a zsoftcap UG 2018b09 Customer.docx 05/01/2018 The following are trademarks of the International
More informationMeasuring zseries System Performance. Dr. Chu J. Jong School of Information Technology Illinois State University 06/11/2012
Measuring zseries System Performance Dr. Chu J. Jong School of Information Technology Illinois State University 06/11/2012 Outline Computer System Performance Performance Factors and Measurements zseries
More informationSo, coming back to this picture where three levels of memory are shown namely cache, primary memory or main memory and back up memory.
Computer Architecture Prof. Anshul Kumar Department of Computer Science and Engineering Indian Institute of Technology, Delhi Lecture - 31 Memory Hierarchy: Virtual Memory In the memory hierarchy, after
More informationFlash Express on z Systems. Jan Tits IBM Belgium 9 December 2015
Flash Express on z Systems Jan Tits IBM Belgium jantits@be.ibm.com 9 December 2015 IBM Flash Express Improves Availability and Performance Flash Express is an innovative server based solution to help you
More informationLECTURE 4: LARGE AND FAST: EXPLOITING MEMORY HIERARCHY
LECTURE 4: LARGE AND FAST: EXPLOITING MEMORY HIERARCHY Abridged version of Patterson & Hennessy (2013):Ch.5 Principle of Locality Programs access a small proportion of their address space at any time Temporal
More informationDynamic Routing: Exploiting HiperSockets and Real Network Devices
Dynamic Routing: Exploiting s and Real Network Devices now with z/vm 6.2 & Relocation!! Jay Brenneman IBM Poughkeepsie Z Software Test Lab rjbrenn@us.ibm.com Exploiting s and Real Network Devices Session
More informationEE 4683/5683: COMPUTER ARCHITECTURE
EE 4683/5683: COMPUTER ARCHITECTURE Lecture 6A: Cache Design Avinash Kodi, kodi@ohioedu Agenda 2 Review: Memory Hierarchy Review: Cache Organization Direct-mapped Set- Associative Fully-Associative 1 Major
More informationOptimising for the p690 memory system
Optimising for the p690 memory Introduction As with all performance optimisation it is important to understand what is limiting the performance of a code. The Power4 is a very powerful micro-processor
More informationMemory-Based Cloud Architectures
Memory-Based Cloud Architectures ( Or: Technical Challenges for OnDemand Business Software) Jan Schaffner Enterprise Platform and Integration Concepts Group Example: Enterprise Benchmarking -) *%'+,#$)
More informationIntroduction Disks RAID Tertiary storage. Mass Storage. CMSC 420, York College. November 21, 2006
November 21, 2006 The memory hierarchy Red = Level Access time Capacity Features Registers nanoseconds 100s of bytes fixed Cache nanoseconds 1-2 MB fixed RAM nanoseconds MBs to GBs expandable Disk milliseconds
More informationConcurrent High Performance Processor design: From Logic to PD in Parallel
IBM Systems Group Concurrent High Performance design: From Logic to PD in Parallel Leon Stok, VP EDA, IBM Systems Group Mainframes process 30 billion business transactions per day The mainframe is everywhere,
More informationVM & VSE Tech Conference May Orlando Session G41 Bill Bitner VM Performance IBM Endicott RETURN TO INDEX
VM & VSE Tech Conference May 2000 - Orlando Session G41 Bill Bitner VM Performance IBM Endicott 607-752-6022 bitnerb@us.ibm.com RETURN TO INDEX Legal Stuff Disclaimer The information contained in this
More informationCaches and Memory Hierarchy: Review. UCSB CS240A, Winter 2016
Caches and Memory Hierarchy: Review UCSB CS240A, Winter 2016 1 Motivation Most applications in a single processor runs at only 10-20% of the processor peak Most of the single processor performance loss
More informationChapter 5A. Large and Fast: Exploiting Memory Hierarchy
Chapter 5A Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) Fast, expensive Dynamic RAM (DRAM) In between Magnetic disk Slow, inexpensive Ideal memory Access time of SRAM
More informationUnderstanding the TOP Server ControlLogix Ethernet Driver
Understanding the TOP Server ControlLogix Ethernet Driver Page 2 of 23 Table of Contents INTRODUCTION 3 UPDATE RATES AND TAG REQUESTS 4 CHANNEL AND DEVICE CONFIGURATION 7 PROTOCOL OPTIONS 9 TAG GENERATION
More information