Enterprise2014. GPFS with Flash840 on PureFlex and Power8 (AIX & Linux)

Transcription:

GPFS with Flash840 on PureFlex and Power8 (AIX & Linux)
Chris Churchey, Principal, ATS Group, LLC
churchey@theatsgroup.com (610-574-0207)
October 2014

Why Monitor? (Clusters, Servers, Storage, Network, etc.)
- Ensure the services and apps are available to our users (customers)
- Ensure they perform optimally
- Identify constraints, problems or configuration concerns
- Learn from past behaviors and trends
- Anticipate and avoid capacity constraints rather than reacting to them and their impact on users
- It's our job... I hope

What to Monitor (for starters)
CPU
- (User + System) >= 80%
- Waiting on I/O >= 10%: possible I/O bottleneck
Memory
- Paging: Page-In/Swap-In >= 5 per second
- Scan/Free Ratio >= 4: thrashing
- Page/Swap Space Used >= 80% (> 90% is critical)
- Huge/Large pages Allocated > 0 but Used = 0: waste
Network & Fiber Adapters
- Running-Speed = Supported-Speed
- Read/Write Throughput >= 80% of Running-Speed
- Load balanced across adapters
- HBA Queue Depth and Transfer Size settings give huge gains
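The CPU, memory and adapter thresholds on this slide can be expressed as a simple rule check. The following is a minimal sketch in Python, not the Galileo implementation: the `NodeSample` fields and the `check_node` function are hypothetical names chosen for illustration, and collecting the metrics themselves (from AIX or Linux tools) is assumed to happen elsewhere.

```python
from dataclasses import dataclass


@dataclass
class NodeSample:
    """One sample of node metrics. All field names are hypothetical."""
    cpu_user_pct: float
    cpu_sys_pct: float
    io_wait_pct: float
    page_ins_per_sec: float
    scan_free_ratio: float
    page_space_used_pct: float
    large_pages_allocated: int
    large_pages_used: int
    adapter_running_speed: float    # negotiated link speed
    adapter_supported_speed: float  # maximum speed the adapter supports
    adapter_throughput: float       # read + write, same unit as the speeds


def check_node(s: NodeSample) -> list[str]:
    """Return the slide's thresholds that this sample crosses."""
    findings = []
    if s.cpu_user_pct + s.cpu_sys_pct >= 80:
        findings.append("CPU (user + system) >= 80%")
    if s.io_wait_pct >= 10:
        findings.append("waiting on I/O >= 10%: possible I/O bottleneck")
    if s.page_ins_per_sec >= 5:
        findings.append("page-in/swap-in >= 5 per second")
    if s.scan_free_ratio >= 4:
        findings.append("scan/free ratio >= 4")
    if s.page_space_used_pct >= 80:
        findings.append("page/swap space used >= 80%")
    if s.large_pages_allocated > 0 and s.large_pages_used == 0:
        findings.append("huge/large pages allocated but unused")
    if s.adapter_running_speed != s.adapter_supported_speed:
        findings.append("adapter not running at supported speed")
    if s.adapter_throughput >= 0.8 * s.adapter_running_speed:
        findings.append("adapter throughput >= 80% of running speed")
    return findings


# Made-up sample values, for illustration only.
sample = NodeSample(
    cpu_user_pct=70, cpu_sys_pct=15, io_wait_pct=12,
    page_ins_per_sec=2, scan_free_ratio=1.0, page_space_used_pct=40,
    large_pages_allocated=0, large_pages_used=0,
    adapter_running_speed=8000, adapter_supported_speed=16000,
    adapter_throughput=3000,
)
for finding in check_node(sample):
    print(finding)
```

Keeping each threshold as its own rule means a finding can be traced directly back to the line of the slide it came from.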

What to Monitor (continued)
Filesystems
- Space Used >= 90% (traditional check)
- Space Used >= 90% and Free < 1GB (fewer alerts)
- / and /var: Space Used > 95% and Free < 512MB (critical)
- I-nodes Used >= 90%
Disks
- For small writes (Write Size < 64KB and Writes/s > 20), expect Service Time < 1ms
- SAN storage today with write cache should service all small to medium-size writes in < 1ms on average
- Queue Depth, Algorithm and Transfer Size settings give huge gains
Processes
- High CPU and/or memory consumers
- Runaway long-running processes
- Long-running gradual memory growth (memory leak?)
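The filesystem and disk checks above combine a percentage with an absolute value, so a large filesystem that is 90% full but still has plenty of free space does not raise an alert. A minimal Python sketch follows. The function names and the severity labels "warning" and "critical" are assumptions made for illustration, and 1GB and 512MB are treated as binary units.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def filesystem_severity(mount, used_pct, free_bytes, inode_used_pct):
    """Classify a filesystem as 'critical', 'warning' or 'ok'."""
    # / and /var get the tighter check because filling them hurts the OS.
    if mount in ("/", "/var") and used_pct > 95 and free_bytes < 512 * MIB:
        return "critical"
    # Percentage AND absolute free space: fewer alerts on big filesystems.
    if used_pct >= 90 and free_bytes < 1 * GIB:
        return "warning"
    if inode_used_pct >= 90:
        return "warning"
    return "ok"


def slow_small_writes(avg_write_kb, writes_per_sec, service_time_ms):
    """True when small writes under load take 1 ms or longer on average."""
    return avg_write_kb < 64 and writes_per_sec > 20 and service_time_ms >= 1.0


# Made-up values, for illustration only.
print(filesystem_severity("/var", 96, 400 * MIB, 10))
print(filesystem_severity("/data", 92, 50 * GIB, 10))
print(slow_small_writes(16, 200, 2.5))
```

The disk check reads the slide's "Service Time < 1ms" as the expected state for cached SAN writes, so it flags anything at or above 1 ms.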

What to Monitor (continued): GPFS
- All previously listed, plus:
- NSDs are distributed equally and balanced across NSD servers, unless you designated specific roles to NSD server pairs
- Server and client node GPFS-specific node/filesystem stats (mmpmon, etc.)
- Special tuning cases arise with large clusters, millions to billions of files, and mixed large and small files; the access patterns often determine special design considerations
- Use metadata-only NSDs on dedicated SSD or flash disks with dedicated adapters, to keep small, IOPS-intensive access away from large-throughput I/O
- Contact IBM or the Galileo Performance team for assistance
- Worker Threads

Daily Monitoring Steps (Methodology)
1. Cluster view: check the Dashboard
2. Identify candidates to investigate, e.g. from "What to Monitor"
3. Follow the data: charts, views...
4. View over a period of time
5. Determine usage mix and observed peaks
* Make it easy with Galileo Performance Explorer GPFS and Storage agents and the new automated Analytics capability!

Cluster view: three observations stand out immediately! (They may be OK, or they may not be.)

Investigate high CPU %Busy: which node? Find out which node it is (Top: 1): gvicp8gpfsRH05. Let's look at processes next.

Investigate high CPU %Busy: found the node, which process? Find which process(es) (Top: 2): runaway and every2hrs, with 3 & 1 threads. * Checked with the user: runaway is bad; every2hrs is scheduled (good).

Investigate high IO Wait: which node? Find out which node it is: gvicp8gpfsaix04. Next, look at the node's details.

Investigate high IO: found the node, is the problem HBA or disks? Found (4) HBAs: fcs0 and fcs1 at 500MB/s each, fcs2 at 100MB/s, fcs3 at 0. * The problem was that fcs3 was not zoned. Corrected; let's see what this improved.
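The imbalance on this slide (two HBAs at 500MB/s, one at 100MB/s, one idle) is the kind of pattern a simple comparison against the busiest adapter can surface. The sketch below is illustrative only: `find_unbalanced_adapters` and its `tolerance` parameter are hypothetical, and the 0.5 cut-off is an assumed value, not a figure from the presentation.

```python
def find_unbalanced_adapters(throughput_mb_s, tolerance=0.5):
    """Return adapters that are idle or well below the busiest adapter.

    throughput_mb_s maps adapter name to observed MB/s.
    tolerance is an assumed cut-off: adapters below this fraction of the
    busiest adapter's throughput are reported as underused.
    """
    busiest = max(throughput_mb_s.values(), default=0.0)
    flagged = {}
    for name, rate in sorted(throughput_mb_s.items()):
        if busiest > 0 and rate == 0:
            flagged[name] = "idle"
        elif rate < tolerance * busiest:
            flagged[name] = "underused"
    return flagged


# Throughput values as reported on the slide, before the zoning fix.
before = {"fcs0": 500, "fcs1": 500, "fcs2": 100, "fcs3": 0}
print(find_unbalanced_adapters(before))
```

An idle adapter on an otherwise busy node is worth checking for zoning or cabling problems first, as was the case with fcs3 here.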

Investigate high IO: found the node, is the problem HBA or disks? Corrected the fcs3 zoning. Now both fcs2 and fcs3 are pushing 250MB/s each.

Investigate high IO: found the node, is the problem HBA or disks? Fixing the zoning increased I/O throughput BUT has now caused a memory paging problem. * As the old saying goes: fixing one performance problem often exposes another!

E.g. NSD Servers not Balanced (Clients constrained). It looks like one NSD server is doing all the work (gvicp8gpfsaix01).

NSD Servers not Balanced (Clients constrained), continued. Identify which filesystem is heavily used, and by which client node(s).

Round-Robin NSD Server-list to Balance load. Changed the NSD server order to balance between gvicp8gpfsaix01 and aix02.
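The round-robin idea can be sketched as rotating the server list by one position for each successive NSD, so that a different server comes first each time. This is a minimal illustration in Python, not GPFS tooling: the NSD names are hypothetical, the full hostname gvicp8gpfsaix02 is assumed from the slide's "aix02", and applying the resulting lists to the cluster with the GPFS administration commands is not shown.

```python
from collections import Counter


def round_robin_server_lists(nsds, servers):
    """Give each NSD the server list rotated by its position."""
    lists = {}
    for i, nsd in enumerate(nsds):
        shift = i % len(servers)
        lists[nsd] = servers[shift:] + servers[:shift]
    return lists


servers = ["gvicp8gpfsaix01", "gvicp8gpfsaix02"]  # second name is assumed
nsds = ["nsd1", "nsd2", "nsd3", "nsd4"]           # hypothetical NSD names

plan = round_robin_server_lists(nsds, servers)
for nsd, order in plan.items():
    print(nsd, ",".join(order))

# Count how many NSDs each server is listed first for.
first_choice = Counter(order[0] for order in plan.values())
print(dict(first_choice))
```

With an even number of NSDs and two servers, each server ends up first in the list for half of the NSDs, which is the balance the slide describes.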

Switched the 2 clients to direct-attached nodes. Now data-intensive nodes can go direct to storage: a major throughput improvement. Yes, we could also do an all-InfiniBand network.

Galileo Analytics engine: minutes vs. the hours behind the past 11 slides.

Galileo Analytics engine: see Booth #22.

E.g. sequential 50/50 read/write, 256K, 8 threads: V7K-SAS

E.g. sequential 50/50 read/write, 256K, 8 threads: Flash-840

We are seeking:
- Use cases as input to the Galileo PE Analytics engine for automation, as well as lessons learned, best practices and thresholds
- Ideas for demos, POCs, claims to verify, etc. that you would like to see us perform and share; we have an Innovation Center lab where we test, demo and showcase technology
Please contact us: support@galileosuite.com, sales@galileosuite.com or churchey@theatsgroup.com. Booth #22

Questions and Answers

We can help analyze and implement. Contact us!
- Check out Galileo Performance Explorer
- Visit Booth #22 for a hands-on demo
- Sign up for a trial at www.galileosuite.com
- Complimentary*, no-strings-attached 3 months' use for conference attendees
sales@galileosuite.com (484-320-4302) www.galileosuite.com
* First-time Galileo user

Referenced Material
- Deploying a big data solution using IBM GPFS-FPO: http://public.dhe.ibm.com/common/ssi/ecm/en/dcw03051usen/dcw03051usen.pdf
- GPFS tuning guidelines for deploying SAS: http://www.sas.com/content/dam/sas/en_us/doc/partners/ibm-gpfs-tuning-guidelines.pdf
- GPFS Wiki, IBM DeveloperWorks: https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/general%20parallel%20file%20system%20%28GPFS%29
- GSS / ESS: https://www.ibm.com/developerworks/community/blogs/5things/entry/gpfs_storage_server?lang=en
- Galileo Performance Explorer: http://www.galileosuite.com