GPFS with Flash840 on PureFlex and Power8 (AIX & Linux)
Chris Churchey, Principal, ATS Group, LLC
churchey@theatsgroup.com (610-574-0207)
October 2014
Why Monitor? (Clusters, Servers, Storage, Network, etc.)
- Ensure the services and applications are available to our users (customers)
- Ensure they perform optimally
- Identify constraints, problems, or configuration concerns
- Learn from past behaviors and trends
- Anticipate and avoid capacity constraints (and their impact on users) rather than reacting to them
- It's our job... I hope!
What to Monitor (for starters)
- CPU
  - (User + System) >= 80%
  - Waiting on I/O >= 10% (possible I/O bottleneck)
- Memory paging
  - Page-in/swap-in >= 5 per second
  - Scan/free ratio >= 4
  - Page/swap space used >= 80%
  - Huge/large pages allocated > 0 but used = 0 (waste)
  - Thrashing > 90% (critical)
- Network and Fibre Channel adapters
  - Running speed = supported speed
  - Read/write throughput >= 80% of running speed
  - Load balanced across adapters
  - HBA queue-depth and transfer-size settings give huge gains (see the sketch below)
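As a hedged illustration (not from the original slides), this is how one might inspect and raise the FC adapter settings the last bullet refers to on AIX; num_cmd_elems and max_xfer_size are standard fcs attributes, but the values shown are examples only:

    # Inspect the current FC adapter queue depth and transfer size (AIX)
    lsattr -El fcs0 -a num_cmd_elems -a max_xfer_size

    # Raise them (example values only; -P defers the change until the next reboot)
    chdev -l fcs0 -P -a num_cmd_elems=2048 -a max_xfer_size=0x200000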
What to Monitor (continued)
- Filesystems
  - Space used >= 90%
  - Space used >= 90% and free < 1 GB
  - / and /var: space used > 95% and free < 512 MB
  - I-nodes used >= 90% (a traditional check that alerts less often, but critical)
- Disks
  - Write size < 64 KB and writes/s > 20: service time should be < 1 ms
  - SAN storage today with write cache should average < 1 ms for all small-to-medium writes
  - Queue-depth, algorithm, and transfer-size settings give huge gains (see the sketch below)
- Processes
  - High CPU and/or memory consumers
  - Runaway long-running processes
  - Long-running, gradual memory growth (memory leak?)
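A minimal sketch of scripting the filesystem threshold above, assuming the AIX 'df -g' column layout (%Used in field 4, mount point in field 7; adjust for Linux df). The disk attributes queue_depth, algorithm, and max_transfer are standard on AIX MPIO hdisks; hdisk4 is an example device:

    # Flag filesystems at or above 90% used (AIX 'df -g' layout assumed)
    df -g | awk 'NR > 1 && $4 + 0 >= 90 { print $7, "is", $4, "used" }'

    # Inspect the per-disk settings the slide calls out
    lsattr -El hdisk4 -a queue_depth -a algorithm -a max_transfer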
What to Monitor (continued): GPFS
- All previously listed, plus:
- NSDs are distributed equally and balanced across NSD servers, unless you designated specific roles to NSD server pairs
- Server- and client-node GPFS-specific node/filesystem stats (mmpmon, etc.; see the sketch below)
- Special tuning cases arise with large clusters, millions to billions of files, and mixed large and small files; the access behavior often drives special design considerations
- Use metadata-only NSDs on dedicated SSD or flash disks, with dedicated adapters, to keep small, IOPS-intensive access away from large-throughput I/O
- Contact IBM or the Galileo Performance team for assistance
- Worker threads
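A minimal sketch of pulling the per-filesystem counters the mmpmon bullet refers to; mmpmon ships with GPFS in /usr/lpp/mmfs/bin, reads requests from stdin, -p emits machine-readable keyword/value output, and -s suppresses the input prompt:

    # One-shot snapshot of per-filesystem I/O counters on this node
    echo "fs_io_s" | /usr/lpp/mmfs/bin/mmpmon -p -s

    # Repeat 6 times at 10-second intervals (-d is in milliseconds) to watch a trend
    echo "fs_io_s" | /usr/lpp/mmfs/bin/mmpmon -p -s -r 6 -d 10000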
Daily Monitoring Steps (Methodology)
1. Cluster view: check the dashboard
2. Identify candidates to investigate (e.g., from "What to Monitor")
3. Follow the data: charts, views...
4. View over a period of time
5. Determine usage mix and observed peaks
* Make it easy with the Galileo Performance Explorer GPFS and Storage agents and the new automated Analytics capability!
Cluster view: immediately, 3 observations stand out! (May be OK... may not be.)
Investigate high CPU %Busy: which node?
Find out which node it is (Top: 1): gvicp8gpfsRH05. Let's look at processes next.
Investigate high CPU %Busy: found the node; which process?
Find which process(es) (Top: 2): "runaway" (3 threads) and "every2hrs" (1 thread).
* Checked with the user: "runaway" is bad; "every2hrs" is scheduled (good).
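Outside the Galileo process view, a hedged sketch of surfacing the same candidates from a shell on the node (works on AIX and Linux; PID 12345 is a placeholder):

    # Top 5 CPU consumers on the node
    ps aux | sort -rn -k3 | head -5

    # Re-check a suspect's size and age over time to spot runaways and leaks
    ps -o pid,vsz,etime,comm -p 12345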
Investigate high I/O wait: which node?
Find out which node it is: gvicp8gpfsaix04. Next, look at the node's details.
Investigate high I/O: found the node; is the problem the HBAs or the disks?
Found (4) HBAs: fcs0/fcs1 at 500 MB/s each, fcs2 = 100 MB/s, fcs3 = 0 (see the fcstat sketch below).
* Problem was fcs3 not zoned. Corrected; let's see what this improved.
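A hedged sketch of confirming the per-HBA imbalance from the node itself on AIX; fcstat is the standard FC adapter statistics command, though the exact output labels grepped here may vary by level:

    # Compare link speed and frame counters across all four HBAs
    for hba in fcs0 fcs1 fcs2 fcs3; do
        echo "=== $hba ==="
        fcstat $hba | grep -E "Port Speed|Frames"
    done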
Investigate high I/O: found the node; is the problem the HBAs or the disks? (continued)
Corrected the fcs3 zoning: now both fcs2 and fcs3 are pushing 250 MB/s each.
Investigate high I/O: found the node; is the problem the HBAs or the disks? (continued)
Fixed the zoning and increased I/O throughput, BUT this caused a memory paging problem (see the paging check below).
* The old saying holds: fixing one performance problem often exposes another!
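A minimal sketch of catching the paging regression from the shell, assuming the AIX vmstat column layout where pi (page-ins from paging space) is the sixth field; the 5-per-second threshold comes from the "What to Monitor" slide:

    # Sample over a 5-second interval; alert if page-ins exceed 5/s
    vmstat 5 2 | tail -1 | awk '$6 > 5 { print "paging alert: pi =", $6 }'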
E.g., NSD servers not balanced (clients constrained)
Looks like one (1) NSD server is doing all the work (gvicp8gpfsaix01). (See the mmlsnsd sketch below.)
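A hedged sketch of confirming the imbalance from the GPFS side; mmlsnsd lists each NSD with its server list, and the first server in the list fields the I/O unless it is down, so a skewed first-server column explains the hot node:

    # Show the NSD-to-server mapping
    /usr/lpp/mmfs/bin/mmlsnsd

    # Also map NSD names to their local device names on all nodes
    /usr/lpp/mmfs/bin/mmlsnsd -M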
NSD servers not balanced (clients constrained), continued
Identify which file system is heavily used and the client node(s) driving it.
Round-robin the NSD server list to balance load
Changed the NSD server order to balance between gvicp8gpfsaix01 and gvicp8gpfsaix02 (see the mmchnsd sketch below).
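A hedged sketch of the reordering itself; mmchnsd takes "DiskName:ServerList" descriptors, and alternating the primary server per NSD round-robins the load across the pair. The NSD names here are hypothetical; the node names come from the slides. Depending on the GPFS level, the file system may need to be unmounted first:

    # Alternate the primary NSD server across the pair (nsd01/nsd02 are hypothetical)
    /usr/lpp/mmfs/bin/mmchnsd "nsd01:gvicp8gpfsaix01,gvicp8gpfsaix02"
    /usr/lpp/mmfs/bin/mmchnsd "nsd02:gvicp8gpfsaix02,gvicp8gpfsaix01"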
Switched the 2 clients to direct-attached nodes
Now the data-intensive nodes can go direct to storage, a major throughput improvement. (Yes, you could also do an all-InfiniBand network.)
Galileo Analytics engine: minutes of work, versus the hours behind the past 11 slides.
Galileo Analytics engine: see it at Booth #22.
E.g., sequential 50/50 read/write, 256 KB, 8 threads: V7K-SAS
E.g., sequential 50/50 read/write, 256 KB, 8 threads: Flash-840
We are seeking use cases as input to the Galileo PE Analytics engine for automation, and lessons learned / best practices / thresholds as well.
We have an Innovation Center lab where we test, demo, and showcase technology.
Send us ideas to demo, POC, or verify claims that you would like to see us perform and share!
support@galileosuite.com, sales@galileosuite.com, or churchey@theatsgroup.com. Please contact us!
Booth #22
Questions and Answers
We can help analyze and implement. Contact us!
- Check out Galileo Performance Explorer
- Visit Booth #22 for a hands-on demo
- Sign up for a trial at www.galileosuite.com
- Complimentary*, no-strings-attached 3 months' use for conference attendees
- sales@galileosuite.com (484-320-4302)
- www.galileosuite.com
* First-time Galileo users
Referenced Material
- Deploying a big data solution using IBM GPFS-FPO: http://public.dhe.ibm.com/common/ssi/ecm/en/dcw03051usen/dcw03051usen.pdf
- GPFS tuning guidelines for deploying SAS: http://www.sas.com/content/dam/sas/en_us/doc/partners/ibm-gpfs-tuning-guidelines.pdf
- GPFS Wiki on IBM developerWorks: https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/general%20parallel%20file%20system%20%28GPFS%29
- GSS / ESS: https://www.ibm.com/developerworks/community/blogs/5things/entry/gpfs_storage_server?lang=en
- Galileo Performance Explorer: http://www.galileosuite.com