Login VSI 4.1 Released

Disclaimer: I received a free Login VSI 12-month license this year as part of my membership of the vExpert programme. There is no requirement for me to blog about their products and I am not compensated in any way for this post.  This disclaimer is also probably longer than I’d intended the original post to be, but there you go.

Login VSI recently released version 4.1 of their performance testing tool for virtualised desktop environments. I’m going to do a longer post on this in the future, but in short, the new functionality includes:

  • Four new workloads: Task, Office (1 vCPU), Knowledge (2 vCPU) and Power User;
  • Import, mix and correlate performance data from any source, such as esxtop and Perfmon; and
  • Improved VSImax, which simplifies the identification of potential bottlenecks such as CPU or disk I/O.

Sounds pretty cool. If you want to see a video about the new features, check it out here.

EMC – Silly things you can do with stress testing – Part 2

I’ve got a bunch of graphs that indicate you can do some bad things to EFDs when you run certain SQLIO stress tests against them and compare the results to FC disks. But EMC is pushing back on the results I’ve gotten for a number of reasons. So in the interests of keeping things civil I’m not going to publish them – because I’m not convinced the results are necessarily valid and I’ve run out of time and patience to continue testing. Which might be what EMC hoped for – or I might just be feeling a tad cynical.

What I have learnt, though, is that it’s very easy to generate QFULL errors on a CX4 if you follow the EMC best practice configuration for QLogic HBAs and set the execution throttle to 256. In fact, you might even be better off leaving it at 16, unless you have a real requirement to set it higher. I’m happy for someone to tell me why EMC suggests 256, because I’ve not found a good reason for it yet. Of course, this is dependent on a number of environmental factors, but the 256 figure still has me scratching my head.
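For a bit of context on why a large execution throttle can bite, here’s a rough back-of-the-envelope sketch. It assumes a CX4 front-end port queue limit of around 1600 outstanding I/Os (a commonly quoted figure, not gospel; check your own array documentation) and a made-up environment of 8 hosts each presenting 10 LUNs through one port. The helper function and all the numbers are purely illustrative, not an EMC formula.

```python
# Back-of-the-envelope check: can a group of hosts overrun a single
# CX4 front-end port's queue?  All figures below are assumptions for
# illustration only -- substitute your own environment's numbers.

FE_PORT_QUEUE_LIMIT = 1600   # commonly quoted CX4 FE port limit (assumption)

def worst_case_outstanding(hosts, luns_per_host, execution_throttle):
    """Worst-case outstanding I/Os hitting one FE port if every host
    drives every LUN to its execution throttle at the same time."""
    return hosts * luns_per_host * execution_throttle

for throttle in (16, 32, 64, 256):
    load = worst_case_outstanding(hosts=8, luns_per_host=10,
                                  execution_throttle=throttle)
    verdict = "QFULL territory" if load > FE_PORT_QUEUE_LIMIT else "OK"
    print(f"throttle={throttle:>3}: worst case {load:>6} outstanding -> {verdict}")
```

In that hypothetical setup, 16 keeps you under the port limit and everything higher blows straight past it, which is roughly the behaviour we saw.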

Another thing we uncovered during stress testing related to the queue depth of LUNs. For our initial testing, we had a Storage Pool created with 30 * 200GB EFDs, 70 * 450GB FC spindles, and 15 * 1TB SATA-II spindles, with FAST-VP enabled. The LUNs on the EFDs were set to no data movement, so everything sat on the EFDs. We were getting fairly underwhelming performance out of this config, and the main culprit seemed to be the LUN queue depth. In a traditional RAID Group setup, the queue depth of the LUN is (14 * the number of data drives in the LUN) + 32. So for a RAID 5 (4+1) LUN, the queue depth is 88. If, for some reason, you want to drive a LUN harder, you can increase this by using MetaLUNs, with the sum of the components providing the LUN’s queue depth, as in the sketch below. What we observed on the Pool LUN, however, was that the queue depth seemed to stay fixed at 88, regardless of the number of internal RAID Groups servicing the Pool LUN. That seems like a bad thing, but it’s probably why EMC quietly suggests you stick to traditional MetaLUNs and RAID Groups if you need particular performance characteristics.
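To make the arithmetic concrete, here’s a minimal sketch of the queue depth maths described above. The (14 * data drives) + 32 formula and the MetaLUN summing come straight from the behaviour I’ve described; the function names and the four-component MetaLUN example are mine, and the Pool LUN figure is simply what we observed rather than a published formula.

```python
# Traditional CLARiiON LUN queue depth maths, as described above.

def raid_group_lun_queue_depth(data_drives):
    """Queue depth for a LUN on a traditional RAID Group:
    (14 * data drives) + 32."""
    return 14 * data_drives + 32

def metalun_queue_depth(component_data_drives):
    """A MetaLUN's queue depth is the sum of its component LUNs' depths."""
    return sum(raid_group_lun_queue_depth(d) for d in component_data_drives)

# RAID 5 (4+1): four data drives
print(raid_group_lun_queue_depth(4))       # 88

# A hypothetical MetaLUN striped across four RAID 5 (4+1) components
print(metalun_queue_depth([4, 4, 4, 4]))   # 352

# What we observed on the Pool LUN, regardless of the private RAID Groups
# sitting behind it
POOL_LUN_OBSERVED_QUEUE_DEPTH = 88
```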

So what’s the point I’m trying to get at? Storage Pools and FAST-VP are awesome for the majority of workloads, but sometimes you need to use more traditional methods to get what you want. Which is why I spent last weekend using the LUN Migration tool to move 100TB of blocks around the array to get back to the traditional RAID Group / MetaLUN model. Feel free to tell me if you think I’ve gotten this arse-backwards too, because I really want to believe that I have.

EMC – I heart EFD performance

We got some EFDs on our CX4-960s recently, and we had the chance to do some basic PassMark testing on them before we loaded them up with production configurations. We’re running 30 * 200GB EFDs in a Storage Pool on the CX4-960; FAST and FAST Cache haven’t been turned on. The test VM was sitting on an HP BL460c G6 blade with 96GB RAM, 12 Nehalem cores and vSphere 4.1. The blades connect to the arrays via Cisco 9124e FC switches with 8Gbps port-channels to the Cisco MDS 9513. We’re only using two front-end ports per SP on the CX4-960 at the moment. We ran PassMark on a Windows 2008 R2 VM sitting on a 100GB vmdk. There weren’t many other LUNs bound in the Storage Pool, so the results are skewed. Nonetheless, they look pretty, and that’s what I’m going with.

100% Sequential Write: [results chart]

100% Sequential Read: [results chart]

File Server Simulation (80% Read, 20% Write): [results chart]