TECH BRIEF

VirtualWisdom ProbeNAS Brief

Business Drivers and Business Value for VirtualWisdom

Infrastructure is expensive, costly to maintain, and often difficult to scale. While transitioning to virtualized and cloud environments helps reduce capital expenditures (CAPEX) on physical hardware, IT has replaced this outlay with increased expenditures for management software and additional staff. Further complicating things, the resource-intensive applications available today have spurred the introduction of DevOps teams to manage the deployment of the virtual environments that the applications live in.

Meanwhile, Enterprise IT doesn't have a uniform basis for understanding how the underlying infrastructure is performing. It is challenged to correlate disparate metrics across a heterogeneous environment that constantly changes. What is needed is a purpose-built solution that adapts and scales to this constant state of change and complexity, one that provides definitive answers to the most complex questions.

Figure 1: ProbeNAS allows you to understand overall NAS performance by Client/Server flows

ProbeNAS metrics are correlated with those from other VirtualWisdom Probes, persistently stored, and presented by the VirtualWisdom Platform Appliance, providing holistic and timely insight into the health, utilization, and performance of large-scale, heterogeneous, open-systems based infrastructures.
Research firm EMA recently wrote: Though storage management tools provide views into macro-level performance results, what they cannot provide is a vendor-agnostic solution that delivers acute/granular real-time understanding of performance issues and how to resolve them. A solution that quantifies the performance impacts of an ever-changing IT environment across both SAN and NAS-based infrastructures is required. Both the performance cause and effect of the ecosystem is needed, complete with an unbiased feedback system to quantify infrastructure and workload changes.

Mission-critical applications now run on NAS storage, in part because of the increased granularity and simplicity NAS affords virtualization. IT organizations can benefit greatly from NAS storage monitoring and real-world testing, especially since NAS storage presents unique challenges that can affect performance. Metadata accounts for 50% or more of NFS (the primary NAS protocol) traffic. This problem grows with scale, as more files translate into additional metadata, and the older a file is, the more metadata it is likely to collect. Changing workload and usage patterns, noisy neighbors, and rogue client activity are all potential issues in addition to the NAS hardware itself.

Because NAS performance problems are difficult to solve, a common response is to buy more NAS storage, which may solve the problem but does so at a cost much higher than necessary.

Figure 2: Throwing More Storage at the Problem increases CAPEX

Infrastructure Performance Monitoring with ProbeNAS

VirtualWisdom helps guarantee that hosts, VMs, databases, servers, switches, and storage devices are performing as they should, in service of applications. Thanks to VirtualWisdom's wire-data monitoring, you can proactively monitor and manage infrastructure, rather than having to fire-fight issues as they happen.
VirtualWisdom performance monitoring is NOT just monitoring speeds and feeds. It provides a comprehensive health check consisting of a collection of cross-domain, tightly correlated and analyzed entities, all playing a role in the health and optimization of the NAS infrastructure, and all in service of the applications. That's an important statement. Speeds and feeds are for data sheets. True production workload optimization, while not rocket science, depends on, and demands, insight into the complete stack, from host to file (in this case). Only VirtualWisdom can provide real-time, wire-data insights that proactively alert you and provide deep knowledge of what is impacting, and what can impact, your company's mission-critical business applications.
ProbeNAS silver bullet performance metric; tapping lowers OPEX

Using a hardware probe with a passive TAP enables the measurement of the most important metric possible: real latency. With that metric, you can tier storage more confidently, use storage virtualizers more confidently, and find performance bottlenecks more quickly.

Figure 3: ProbeNAS Product Suite Overview

VirtualWisdom ProbeNAS Performance Analytics

The VirtualWisdom NAS Performance Probe (aka ProbeNAS) is the industry's most complete real-time, full line rate monitoring solution for NFSv3 NAS. Working completely out-of-band, ProbeNAS analyzes every IP packet on monitored NAS ports in real time and reports hundreds of metrics every second to provide comprehensive, accurate, and vendor-agnostic monitoring at the protocol level. It captures the true, unaltered I/O profile of the actual application traffic, detecting application performance slowdowns and transmission errors by measuring every I/O transaction time from start to finish.

NAS is Simple? Right?

Figure 4: What can go wrong in accessing a NAS environment

A multitude of issues can cause a NAS environment to hurt user satisfaction and slow down the infrastructure environment, including adjacent applications and systems. Common NAS problem
causes range from TCP Window closings, metadata storms, and rogue clients to IP physical-layer anomalies and errors that could potentially upset the infrastructure performance balance.

The NAS Performance Probe firmware allows the probe itself to calculate detailed one-minute summaries of all the performance probe metrics before sending them to the VirtualWisdom Platform Appliance for storage, data analysis, and presentation. Summarizing the metrics on the probe drastically improves the Platform Appliance scalability and reduces the network latency requirements without losing the detail in the summary.

Here are some examples of typical NAS performance problems and causes:

NAS Performance: TCP Window Close

During a TCP/IP connection (with every packet, in fact) each endpoint tells the other how much space it has left in its receive buffer. This lets the "other side" know how much more data it can send without waiting for acknowledgments (ACKs). This "space available" is referred to as the TCP Window.

Use case: A financial trading customer is experiencing poor response times for Market Data Feeds during ingest staging. We can see here that 3 hosts are having their TCP Window shrunk at the end of the trade day, affecting application performance. This happens on a regular basis in the environment. They were unable to diagnose the problem without ProbeNAS. Flow control issues (much like B2B Credits) are easily detected by ProbeNAS.

Figure 5: TCP Window Close Alerts

Figure 6: TCP Closings cause latency problems

Here is where we can see the impact the TCP Window closing has. The read latency shoots up to nearly 20ms on average during the entire timeframe because of the flow control issue. We can also see plenty of I/Os in the 20-50ms range, with a few peaking into the 500ms range.
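The TCP Window mechanics described in this section can be illustrated with a minimal Python sketch. It is not ProbeNAS code and uses no customer data: the function names, window sizes, and round-trip time are assumptions chosen for illustration, and TCP window scaling is ignored for simplicity. It shows two well-known consequences of flow control: a sender may have at most one advertised window of unacknowledged data in flight, so the per-connection throughput ceiling is roughly the window divided by the round-trip time, and a window that closes to zero stalls the sender entirely.

```python
def sendable_bytes(advertised_window: int, bytes_in_flight: int) -> int:
    """Bytes the sender may still transmit before it must wait for an ACK."""
    return max(0, advertised_window - bytes_in_flight)


def max_throughput_bytes_per_sec(advertised_window: int, rtt_seconds: float) -> float:
    """Upper bound for one connection: at most one full window per round trip."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return advertised_window / rtt_seconds


if __name__ == "__main__":
    rtt = 0.0005  # 0.5 ms round trip; illustrative value, not a measurement
    # A receive window shrinking toward closed (values are hypothetical).
    for window in (65536, 16384, 4096, 0):
        ceiling = max_throughput_bytes_per_sec(window, rtt)
        print(f"window={window:>6} B  ceiling={ceiling / 1e6:8.2f} MB/s")
```

As the advertised window shrinks, the ceiling falls in proportion, which is consistent with the rising read latency described for the affected hosts.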
Without ProbeNAS, the TCP Window Close issue would likely have been addressed by acquiring additional NAS heads to speed the system up. This customer did not have to do that and saved hundreds of thousands of dollars by using VirtualWisdom with ProbeNAS. A simple change to the NFS TCP Window flow control primitives fixed the issue.

NAS Performance: Rogue Clients and Meta Data Storms

The NAS world is filled with multi-site file services, cloud file storage, and virtual storage appliances, which translates into many, many NAS devices to monitor and manage. Being able to leverage tools that shorten and simplify the troubleshooting cycle in these complex environments becomes more of a competitive advantage.

Consider the case where a runaway client process is filling up the available space on a given volume. You've received the alert from your storage that a volume is filling up, but now what? You know the name of the volume that is filling up, but short of walking through all of your clients looking for a rogue process, quite a bit of analysis would normally be needed to tease out the offender.

A rogue client is a client that has not established an authorized NAS mount or is operating outside of a designated access window on the network. Rogue clients can easily impact application performance by bottlenecking the available TCP windows for critical data exchanges.

Use case: A manufacturing company is highly virtualized and has received complaints about VM application slowdowns. The backend datastore is a NAS storage system. Virtualization admins are receiving multiple tickets per day and need to understand where the problem resides and how to resolve it. A single client was swamping the array with a metadata storm. This was easily found using VirtualWisdom Event Advisor Analytics and ProbeNAS.
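The kind of per-client breakdown used to find such an offender can be sketched in Python. This is a hypothetical illustration, not how ProbeNAS or Event Advisor is implemented: the `RpcSample` record format, the function names, and the 50% share threshold are all assumptions. The procedure names are standard NFSv3 procedures. The idea is to count operations per client and look for one client that issues most of the metadata operations while moving very little data.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Optional

# NFSv3 procedures treated as metadata operations in this sketch.
METADATA_PROCS = frozenset(
    {"GETATTR", "SETATTR", "LOOKUP", "ACCESS", "READDIR", "READDIRPLUS", "FSSTAT"}
)


class RpcSample(NamedTuple):
    client: str         # client address (hypothetical record field)
    procedure: str      # NFSv3 procedure name, e.g. "GETATTR" or "READ"
    payload_bytes: int  # data bytes carried by the call and its reply


class ClientSummary(NamedTuple):
    total_ops: int
    metadata_ops: int
    payload_bytes: int


def summarize_by_client(samples: Iterable[RpcSample]) -> Dict[str, ClientSummary]:
    """Aggregate operation counts and payload bytes per client."""
    totals = defaultdict(lambda: [0, 0, 0])
    for sample in samples:
        row = totals[sample.client]
        row[0] += 1
        if sample.procedure in METADATA_PROCS:
            row[1] += 1
        row[2] += sample.payload_bytes
    return {client: ClientSummary(*row) for client, row in totals.items()}


def find_metadata_storm(
    summaries: Dict[str, ClientSummary], min_share: float = 0.5
) -> Optional[str]:
    """Return the client issuing at least min_share of all metadata ops, if any."""
    total_metadata = sum(s.metadata_ops for s in summaries.values())
    if total_metadata == 0:
        return None
    client, summary = max(summaries.items(), key=lambda kv: kv[1].metadata_ops)
    return client if summary.metadata_ops / total_metadata >= min_share else None
```

A client flagged this way with a high operation count but near-zero payload bytes matches the signature of a metadata storm: many requests, very little throughput.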
NAS Performance: Meta Data Storms

Figure 7: The graph shows Remote Procedure Calls (RPCs), broken down by client, in IOPS

This client was attempting to do file-level replication. The replication engine was looking at each file's modified date and assessing whether the file needed to be replicated. This approach worked fine when the filesystem held only 500k files. Over time, though, the filesystem grew, and it now holds 3 million files. Every time the scan was kicked off (it was originally set to run once a minute), the client asked the array for the modified time of every file. Assuming one request per file, that works out to roughly 50,000 metadata requests per second. Though VERY little data was being transmitted (VERY low throughput), the array was running out of front-end resources to handle all the requests. The solution wasn't adding more storage or upgrading the filers; it was simply reducing the frequency of these scans. Once the change was completed, the complaints disappeared.

NAS Physical Layer: Undetected physical issues

Loss-of-sync and loss-of-signal events indicate that something is wrong between two points, for example from HBA to switch port. VirtualWisdom with ProbeNAS can detect and help resolve many of these physical-layer issues that burden IT staff with hours of vendor discussions and system-wide troubleshooting scenarios.

ProbeNAS Advanced Analytics

VirtualWisdom is the industry's leading analytics platform for IT Infrastructure Performance Monitoring (IPM) for production environments. It empowers data center operations professionals to deliver on the complex requirements of their application infrastructure. The platform provides insights into the performance and availability of the end-to-end server-to-storage infrastructure across physical, virtual, and cloud environments. It intelligently correlates and analyzes an unmatched breadth and depth of data,
transforming it into answers and actionable insights. This enables IT teams to guarantee performance-based service level agreements (SLAs), increasing the value of the infrastructure.

Event Advisor: Lets you quickly determine whether there are any trends or events that should be investigated or noted across the entire environment.

Trend Match: Automatically correlates events across the entire set of relevant metrics to quickly determine root cause.

Conclusion

The VirtualWisdom NAS Performance Probe (ProbeNAS) is the industry's most complete real-time, full line rate monitoring solution for NFSv3 environments. Working completely out-of-band, the NAS Performance Probe tracks every transaction on monitored NAS ports and reports hundreds of metrics to provide comprehensive, accurate, and vendor-agnostic monitoring at the protocol level. The Probe is offered in a 2U chassis with up to 16 x 10GE links.

ProbeNAS metrics are correlated with those from other VirtualWisdom Probes, persistently stored, and presented by the VirtualWisdom Platform Appliance, providing holistic and timely insight into the health, utilization, and performance of large-scale, heterogeneous, open-systems based infrastructures. Read the data sheet for more detailed specifications.

Sales: Sales@virtualinstruments.com, 1.888-522.2557
Website: virtualinstruments.com

2/2017 Virtual Instruments. All rights reserved. Features and specifications are subject to change without notice. VirtualWisdom, Virtual Instruments, SANInsight, Workload Central and Load DynamiX are trademarks or registered trademarks of Virtual Instruments in the United States and/or in other countries.