EMC – DIY Heatmaps – Updated Version

Mat has updated the DIY Heatmaps for EMC CLARiiON and VNX arrays to version 3.015. You can get it from the Utilities page here. Any and all feedback welcome.

EMC – Broken Vault drive munts FAST Cache

Mat sent me an e-mail this morning, asking “why would FAST Cache be degraded after losing B0 E0 D2 in one of the CX4-960s?”. For those of you playing at home, 0_0_2 is one of the Vault disks in the CX4 and VNX. Here’s a picture of the error:

Check out the 0x7576 that pops up shortly after the array says there’s a faulted disk. Here’s a closeup of the error:

Weird, huh?  So here’s the output of the naviseccli command that will give you the same information, but with a text-only feel.

"c:/Program Files/EMC/Navisphere CLI/NaviSECCli.exe"  -user Ebert -scope 0 -password xx -h 255.255.255.255  cache -fast -info -disks -status
Disks:
Bus 0 Enclosure 7 Disk 0
Bus 2 Enclosure 7 Disk 0
Bus 0 Enclosure 7 Disk 1
Bus 2 Enclosure 7 Disk 1
Bus 1 Enclosure 7 Disk 1
Bus 1 Enclosure 7 Disk 0
Bus 3 Enclosure 7 Disk 1
Bus 3 Enclosure 7 Disk 0
Mode:  Read/Write
Raid Type:  r_1
Size (GB):  366
State:  Enabled_Degraded
Current Operation:  N/A
Current Operation Status:  N/A
Current Operation Percent Completed:  N/A

So what’s with the degraded cache? The reason is that FAST Cache stores a small database on the first three drives (0_0_0, 0_0_1, 0_0_2). If any of these disks fail, FAST Cache flushes to disk and goes into a degraded state. It shouldn’t need to, though, because that database is triple-mirrored. And what does “degraded” mean exactly? It means your FAST Cache is not processing writes at the moment. Which is considered “bad darts”.

This is a bug. Have a look on Powerlink for emc267579. Hopefully this will be fixed in R32 for the VNX; I couldn’t see details about the CX4 though. I strongly recommend that if you’re a CX4 user and you experience this issue, you raise a service request with your local EMC support mechanisms as soon as possible. The only way they get to know the severity of a problem is if people in the field feed issues back.

EMC – DIY Heatmaps

My friend Mat has developed a pretty cool script that can make pretty pictures out of Analyzer files from CLARiiON and VNX arrays. He’s decided to release it to the world, so you can download it here. I’ve also added the following instructions to a pdf document available here. Here’s a sample output file from the script. He’s after feedback as well, so send it through and I’ll make sure it gets to him.

Purpose:

The heatmap script allows you to generate heatmaps of various metrics over time from NAR files generated by EMC CLARiiON/VNX arrays.

Requirements/Notes:

  • This script was developed and tested using Strawberry Perl (v5.12.3). There’s no reason it won’t work with other flavours or versions of Perl, although Perl modules other than the one listed below may need to be installed.
  • It requires the Text::CSV module for Perl, available here: http://search.cpan.org/CPAN/authors/id/M/MA/MAKAMAKA/Text-CSV-1.21.tar.gz . If you can get CPAN working, it’s as easy as running “cpan Text::CSV”; otherwise you’ll have to install it manually as follows.

Unzip the files into, for example, C:\Text-CSV-1.21 and run the following commands:

C:\Text-CSV-1.21>perl Makefile.PL

C:\Text-CSV-1.21>dmake

C:\Text-CSV-1.21>dmake test

C:\Text-CSV-1.21>dmake install
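
Once the install finishes, a quick one-liner (assuming perl is on your PATH) will confirm the module is visible to Perl; it should print the installed version rather than an error:

C:\Text-CSV-1.21>perl -MText::CSV -e "print $Text::CSV::VERSION"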

  • The script uses the Microsoft Log Parser tool (logparser.exe) to manipulate the various CSV files; it’s available here: http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=24659
  • The script uses naviseccli to convert the NAR files into CSV files.
  • At the moment the script will only run on a Windows server/PC due to its reliance on the Log Parser tool. If people are interested, I will investigate other tools available in the Linux/UNIX world.
  • The default min/max figures that I am using may be unrealistic. I am by no means an expert in analysing the performance of an array, so any feedback is welcome.

Usage:

The following command line options are available for the script:

–logparser <path\filename>

–naviseccli <path\filename>

These options allow you to set the location of the various third-party binaries that the script relies on.

–avg_interval <seconds>

The interval that the Analyzer stats are averaged down to; e.g. 1800 seconds will set the averaging interval to 30 minutes.

–nar_file <filename>

The input NAR file(s); more than one NAR file can be specified.

–out <filename>

The filename of the generated HTML output file; the default is HEATMAP.HTML.

–array_name <name>

The name of the array; this is for reporting purposes only.

–summary

Generate a summary for each attribute type, displaying minimum, average and maximum figures for each metric. For example:

Metric                          Min       Max        Avg
Disk Utilization (%)            0.27      97.38      8.16
Disk Total Throughput (IO/s)    0.72      309.06     23.03
SP Utilization (%)              38.03     61.11      49.78
SP Total Throughput (IO/s)      5588.70   25693.24   10088.56

–mash

–mashonly

This generates a mash-up table for each metric type (currently only Storage Processor and Disk), averaging each of the metrics for that object type down into a single table, allowing you to combine multiple metrics into a single view.

The –mash option will display the mash-up table alongside the selected metrics, and the –mashonly option will only display the mash-up tables.

–get_drive_type

–array_ip <ip address>

This option allows you to query the array (as it is currently configured) to determine drive types (currently only SATA II, Fibre Channel and SATA II SSD). At the moment this option only affects the IOPS calculations for drives.

–user_id <user id>

–pwd <password>

–scope <0|1>

These options are only used in conjunction with the –get_drive_type option. If you have cached credentials configured for the array, these options should not be necessary.
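
By way of example, a drive-type-aware run would add something like the following to the command line (the IP address and credentials here are placeholders only):

–get_drive_type –array_ip 255.255.255.255 –user_id Ebert –pwd xx –scope 0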

–display_drive_type

This option is only used in conjunction with the –get_drive_type option. It displays an additional drive table, allowing you to view the different drive type, drive size, pool and RAID Group layouts.

–disk_highlight

This option is only used in conjunction with the –get_drive_type and –display_drive_type options. It highlights the corresponding disks in the other heatmaps, and also allows you to select all of the drives that share the attributes displayed by the –display_drive_type option.

–config_file <filename>

–generate_config <filename>

These options allow you to generate a configuration file containing all of the default attributes (using the –generate_config option), and to feed a configuration file back into the script (using the –config_file option).
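
For example, you might run the script once with –generate_config my_defaults.cfg to dump the default min/max values to a file, tweak the thresholds to suit your environment, and then use –config_file my_defaults.cfg on subsequent runs (the filename here is arbitrary).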

–help

This option will display the following help information.

 

Heatmap Generator

 

Usage: heatmap.3.010.pl <options>

Where options can be the following:

–logparser <path\filename> – The path to the logparser executable (c:/Program Files/Log Parser 2.2/LogParser.exe)

–naviseccli <path\filename> – The path to the naviseccli executable (c:/Program Files/EMC/Navisphere CLI/NaviSECCli.exe)

–avg_interval <seconds>     – The interval in seconds that the stats are averaged at (1800 seconds)

–nar_file <filename>        – NAR input file(s) this option can be specified multiple times

–out <filename>             – The output filename (heatmap.html)

–array_name <name>          – Set the name of the array in the report

–summary                    – Displays a summary of each metric

–mash                       – Creates a mash-up of each metric per object type, and displays alongside other metrics

–mashonly                   – Creates a mash-up of each metric per object type, and only displays the mash-ups

–get_drive_type             – Query Array to get drive type information

–array_ip <ip address>      – Set the IP address of the array

–user_id <user id>          – Set the user ID to log into the array

–pwd <password>             – Set the user password of the array

–scope <0|1>                – Set the Scope of the array account

–display_drive_type         – Display the drive types in the charts

–disk_highlight             – Highlight drives on mouse over

–config_file <filename>     – Use a configuration file to set attribute min/maxes

–generate_config <filename> – Generate a configuration file using the defined defaults

–help                       – This help

–attrib <attribute>         – Set the attribute to graph, where attribute can be the following

d_utilization – display stats based on disk utilization (%)

d_iops        – display stats based on disk Total Throughput (IOPS)

d_r_iops      – display stats based on disk read IOPS

d_w_iops      – display stats based on disk write IOPS

d_queue       – display stats based on disk Queue Length

d_b_queue     – display stats based on disk Average Busy Queue Length

d_response    – display stats based on disk Response Time

d_service     – display stats based on disk Service Time

d_bandwidth   – display stats based on disk Total Bandwidth

d_r_bandwidth – display stats based on disk Read Bandwidth

d_w_bandwidth – display stats based on disk Write Bandwidth

d_r_size      – display stats based on disk Read Size

d_w_size      – display stats based on disk Write Size

d_seek        – display stats based on disk Average Seek Distance

 

s_utilization – display stats based on SP utilization

s_response    – display stats based on SP Response Time

s_bandwidth   – display stats based on SP Total Bandwidth

s_iops        – display stats based on SP Total Throughput (IOPs)

s_b_queue     – display stats based on SP Average Busy Queue Length

s_service     – display stats based on SP Service Time

s_c_dirty     – display stats based on SP Cache Dirty Pages (%)

s_c_flush     – display stats based on SP Cache Flush Ratio

s_c_flush_mb  – display stats based on SP Cache MBs Flushed (MB/s)

s_c_hw_flush  – display stats based on SP Cache High Water Flush On

s_c_i_flush   – display stats based on SP Cache Idle Flush On

s_c_lw_flush  – display stats based on SP Cache Low Water Flush Off

s_wc_flush    – display stats based on SP Write Cache Flushes/s

s_fc_dirty    – display stats based on FAST Cache Dirty Pages (%)

s_fc_flush_mb – display stats based on FAST Cache MBs Flushed (MB/s)

 

Running the script without any –nar_file options will result in the script prompting the user to supply the NAR file(s).

–attrib <attribute>

Allows you to select which attributes you want to display on the heatmap.
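
To tie it all together, here’s a hypothetical end-to-end invocation (the paths, NAR filenames and array name are examples only, so substitute your own, along with whichever version of the script you’ve downloaded). It produces a Disk Utilization heatmap and summary, averaged at 30-minute intervals, from a pair of NAR files:

perl heatmap.3.010.pl –logparser "c:/Program Files/Log Parser 2.2/LogParser.exe" –naviseccli "c:/Program Files/EMC/Navisphere CLI/NaviSECCli.exe" –nar_file SPA.nar –nar_file SPB.nar –avg_interval 1800 –attrib d_utilization –summary –array_name TESTLAB1 –out heatmap.html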

New Article – VNX5700 Configuration Guidelines

I’ve added a new article to the articles section of the blog. This one is basically a rehash of the recent posts I did on the VNX7500, but focussed on the VNX5700 instead. As always, your feedback is welcome.

EMC – Configure FAST Cache disks with naviseccli

I’m sorry I couldn’t think of a fancy title for this post, but did you know you can configure FAST Cache with naviseccli? I can’t remember whether I’ve talked about this before or not. So just go with it. This one’s quick and dirty, by the way. I won’t be talking about where you should be putting your EFDs in the array. That really depends on the model array you have and the number of EFDs at your disposal. But don’t just go and slap them in any old way. Please, think of the children.

To use FAST Cache, you’ll need:

  • The FAST Cache enabler installed;
  • EFD disks that are not in a RAID group or Storage Pool;
  • To have configured the FAST Cache (duh);
  • The correct number of disks for the model of CLARiiON or VNX you’re configuring; and
  • To have enabled FAST Cache for the RAID group LUNs and/or the pools with LUNs that will use FAST Cache.

Basically, you can run the following switches after the standard naviseccli -h sp-ip-address:

cache -fast -create – this creates FAST Cache.

cache -fast -destroy – this destroys FAST Cache.

cache -fast -info – this displays FAST Cache information.

When you create FAST Cache, you have the following options:

cache -fast -create -disks disksList [-rtype raidtype] [-mode ro|rw] [-o]

Here is what the options mean:

-disks disksList – You need to specify what disks you’re adding, or it no worky. Also, pay close attention to the order in which you bind the disks.

-mode ro|rw – ro is read-only mode and rw is read/write mode.

-rtype raidtype – I don’t know why this is in here, but valid RAID types are disk and r_1.

-o – Just do it and stop asking questions!

naviseccli cache -fast -create -disks 0_1_6 1_1_6 -mode rw -rtype r_1

In this example I’ve used disks on Bus 0, Enclosure 1, Disk 6 and Bus 1, Enclosure 1, Disk 6.

Need info about what’s going on? Use the following command:

cache -fast -info [-disks] [-status] [-perfData]

I think -perfData is one of the more interesting options here.
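
If you’re curious about what FAST Cache is actually doing once it’s up and running, something like the following (with your SP’s address in place of the obviously fake one) will dump the disk, status and performance information in one hit:

naviseccli -h 256.256.256.256 cache -fast -info -disks -status -perfData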

EMC – Sometimes RAID 6 can be a PITA

This is really a quick post to discuss how RAID 6 can be a bit of a pain to work with when you’re trying to combine traditional CLARiiON / VNX DAEs and Storage Pool best practices. It’s no secret that EMC strongly recommend using RAID 6 when you’re using SATA-II / NL-SAS drives that are 1TB or greater. Which is a fine and reasonable thing to recommend. However, as you’re no doubt aware, the current implementation of FAST VP uses Storage Pools that require homogeneous RAID types. So you need multiple pools if you want to run both RAID 1/0 and RAID 6. If you want a pool that can leverage FAST to move slices between EFD, SAS, and NL-SAS, it all needs to be RAID 6. There are a couple of issues with this. Firstly, given the price of EFDs, a RAID 6 (6+2) of EFDs is going to feel like a lot of money down the drain. Secondly, if you stick with the default RAID 6 implementation for Storage Pools, you’ll be using 6+2 in the private RAID groups. And then you’ll find yourself putting private RAID groups across backend ports. This isn’t as big an issue as it was with the CX4, but it still smells a bit ugly.

What I have found, however, is that you can get the CLARiiON to create non-standard sized RAID 6 private RAID groups. If you create a pool with 10 spindles in RAID 6, it will create a private RAID group in an 8+2 configuration. This seems to be the magic number at the moment. If you add 12 disks to the pool it will create two 4+2 private RAID groups, and if you use 14 disks it will do a 6+2 and a 4+2 RAID group. Now, the cool thing about 10 spindles in a private RAID group is that you could, theoretically (I’m extrapolating from the VNX Best Practices document here), split the 8+2 across two DAEs in a 5+5. In this fashion, you can improve rebuild times slightly in the event of a disk failure, and you can also draw up some sensible designs that fit well in a traditional DAE4P. Of course, creating your pools in increments of 10 disks is going to be a pain, particularly for larger Storage Pools, and particularly as there is no re-striping of data done after a pool expansion. But I’m sure EMC are focussing on this issue for the future, as a lot of customers have had a problem with the initial approach. The downside to all this, of course, is that you’re going to suffer a capacity and, to a lesser extent, performance penalty by using RAID 6 across the board. In this instance you need to consider whether FAST VP is going to give you the edge over split RAID pools or traditional RAID groups.
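
If you want to try the 10-spindle approach yourself, pool creation via naviseccli looks something like the following. The disk locations and pool name are purely illustrative (five disks in each of two DAEs, in line with the 5+5 idea above), and it’s worth checking the storagepool syntax against your particular FLARE / VNX OE release:

naviseccli -h 256.256.256.256 storagepool -create -disks 1_2_0 1_2_1 1_2_2 1_2_3 1_2_4 2_2_0 2_2_1 2_2_2 2_2_3 2_2_4 -rtype r_6 -name NLSAS_R6_POOL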

I personally like the idea of Storage Pools, and I’m glad EMC have gotten on-board with them in their midrange stuff. I’m also reasonably optimistic that they’re working on addressing a lot of issues that have come up in the field. I just don’t know when that will be.

EMC CLARiiON VNX7500 Configuration guidelines – Part 3

One thing I didn’t really touch on in the first two parts of this series is the topic of RAID Groups and binding between disks on the DPE / DAE-OS and other DAEs. It’s a minor point, but something people tend to forget when looking at disk layouts. Ever since the days of Data General, the CLARiiON has used Vault drives in the first shelf. For reasons that are probably already evident, these drives, and the storage processors, are normally protected by a Standby Power Supply (SPS) or two. The SPS provides enough battery power in a power failure scenario such that cache can be copied to the Vault disks and data won’t be lost. This is a good thing.

The thing to keep in mind with this, however, is that the other DAEs in the array aren’t protected by this SPS. Instead, you plug them into UPS-protected power in your data centre. So when you lose power to those, they go down. This can cause “major dramas” with Background Verify operations when the array is rebooted. This is a sub-optimal situation to be in. The point of all this is that, as EMC have said for some time, you should bind RAID groups across disks that are either contained entirely within that first DAE, or entirely outside it.

Now, if you really must do it, there are some additional recommendations:

  • Don’t split RAID 1 groups between the DPE and another DAE;
  • For RAID 5, ensure that at least 2 drives are outside the DPE;
  • For RAID 6, ensure that at least 3 drives are outside the DPE;
  • For RAID 1/0 – don’t do it, you’ll go blind.

It’s a minor design consideration, but something I’ve witnessed in the field when people have either a) tried to be tricky on smaller systems, or b) have been undersold on their requirements and have needed to be creative. As an aside, it is also recommended that you don’t include drives from the DPE / DAE-OS in Storage Pools. This may or may not have an impact on your Pool design.
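
To make that a little more concrete, here’s what creating a five-disk RAID group destined for RAID 5 (4+1) LUNs might look like with naviseccli, keeping at least two drives outside the DPE as per the guideline above (the RAID Group ID and disk positions are just examples):

naviseccli -h 256.256.256.256 createrg 10 0_0_10 0_0_11 0_0_12 1_0_0 1_0_1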

EMC – Configure the Reserved LUN Pool with naviseccli

I’ve been rebuilding our lab CLARiiONs recently, and wanted to configure the Reserved LUN Pool (RLP) for use with SnapView and MirrorView/Asynchronous. Having spent approximately 8 days per week in Unisphere recently performing storage provisioning, I’ve made it a goal of mine to never, ever have to log in to Unisphere to do anything again. While this may be unattainable, you can get an awful lot done with a combination of Microsoft Excel, Notepad and naviseccli.

So I needed to configure a Reserved LUN Pool for use with MV/A, SnapView Incremental SAN Copy, and so forth. I won’t go into the reasons for what I’ve created, but let’s just say I needed to create about 50 LUNs and give them each a label. Here’s what I did:

Firstly, I created a RAID Group with an ID of 1 using disks 5 – 9 in the first enclosure.

C:\>naviseccli -h 256.256.256.256 createrg 1 0_0_5 0_0_6 0_0_7 0_0_8 0_0_9

It was then necessary to bind a series of 20GB LUNs for the RLP to use, 25 for each SP. If you’re smart with Excel you can get it to generate the following command for each LUN with little fuss.

C:\>naviseccli -h 256.256.256.256  bind r5 50 -rg 1 -aa 0 -cap 20 -sp a -sq gb  

Here I’ve specified the raid-type (r5), the LUN ID (50), the RAID Group (1), -aa 0 (disabling auto-assign), -cap (the capacity), -sp (a or b), and -sq (the size qualifier, which can be mb|gb|tb|sc|bc). Note that if you don’t specify the LUN ID, it will automatically use the next available ID.
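
If you’d rather skip the Excel step, a simple loop from the command prompt will generate the same commands. This sketch assumes LUN IDs 50 to 74 go to SP A and 75 to 99 go to SP B, matching the 25-per-SP split above (double the % signs if you put this in a batch file):

C:\>FOR /L %i IN (50,1,74) DO naviseccli -h 256.256.256.256 bind r5 %i -rg 1 -aa 0 -cap 20 -sp a -sq gb

C:\>FOR /L %i IN (75,1,99) DO naviseccli -h 256.256.256.256 bind r5 %i -rg 1 -aa 0 -cap 20 -sp b -sq gb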

So now I’ve bound the LUNs, I can use another command to give them a label that corresponds with our naming standard (using our old friend chglun):

C:\>naviseccli -h 256.256.256.256 chglun -l 50 -name TESTLAB1_RLP01_0050

Once you’ve created the LUNs you require, you can then add them to the Reserved LUN Pool with the reserved command.

C:\>naviseccli -h 256.256.256.256 reserved -lunpool -addlun 99
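
Again, if you’re adding a batch of LUNs to the pool in one go, a loop saves a lot of typing (same assumption about LUN IDs 50 through 99 as before):

C:\>FOR /L %i IN (50,1,99) DO naviseccli -h 256.256.256.256 reserved -lunpool -addlun %i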

To check that everything’s in order, use the -list switch to get an output of the current RLP configuration.

C:\>naviseccli -h 256.256.256.256 reserved -lunpool -list
Name of the SP:  GLOBAL
Total Number of LUNs in Pool:  50
Number of Unallocated LUNs in Pool:  50
Unallocated LUNs:  53, 63, 98, 78, 71, 56, 88, 69, 92, 54, 99, 79, 72, 58, 81, 57, 85, 93, 61, 96, 67, 76, 86, 64, 50, 66, 52, 62, 68, 77, 89, 70, 55, 65, 91, 80, 73, 59, 82, 90, 94, 84, 97, 74, 60, 83, 95, 75, 87, 51
Total size in GB:  999.975586
Unallocated size in GB:  999.975586
Used LUN Pool in GB:  0
% Used of LUN Pool:  0
Chunk size in disk blocks:  128
No LUN in LUN Pool associated with target LUN.
C:\>

If, for some reason, you want to remove a LUN from the RLP, and it isn’t currently in use by one of the layered applications, you can use the -rmlun switch.

C:\>naviseccli -h 256.256.256.256 reserved -lunpool -rmlun 99 -o

If you omit the override [-o] option, the CLI prompts for confirmation before removing the LUN from the reserved LUN pool. It’s possible to argue that, with the ability to create multiple LUNs from Unisphere, it might be simpler not to worry about naviseccli, but I think it’s a very efficient way to get things done quickly, particularly if you’re working in a Unisphere domain with a large number of CLARiiONs, or on a workstation that has some internet browser “issues”.

EMC – Silly things you can do with stress testing – Part 2

I’ve got a bunch of graphs that indicate you can do some bad things to EFDs when you run certain SQLIO stress tests against them and compare the results to FC disks. But EMC is pushing back on the results I’ve gotten for a number of reasons. So in the interests of keeping things civil I’m not going to publish them – because I’m not convinced the results are necessarily valid and I’ve run out of time and patience to continue testing. Which might be what EMC hoped for – or I might just be feeling a tad cynical.

What I have learnt, though, is that it’s very easy to generate QFULL errors on a CX4 if you follow the EMC best practice configs for QLogic HBAs and set the execution throttle to 256. In fact, you might even be better off leaving it at 16, unless you have a real requirement to set it higher. I’m happy for someone to tell me why EMC suggests it be set to 256, because I’ve not found a good reason for it yet. Of course, this is dependent on a number of environmental factors, but the 256 figure still has me scratching my head.

Another thing that we uncovered during stress testing had something to do with the Queue Depth of LUNs. For our initial testing, we had a Storage Pool created with 30 * 200GB EFDs, 70 * 450GB FC spindles, and 15 * 1TB SATA-II spindles with FAST-VP enabled. The LUNs on the EFDs were set to no data movement – so everything sat on the EFDs. We were getting kind of underwhelming performance stats out of this config, and it seems like the main culprit was the LUN queue depth. In a traditional RAID Group setup, the queue depth of the LUN is (14 * (the number of data drives in the LUN) + 32). So for a RAID 5 (4+1) LUN, the queue depth is 88. If, for some reason, you want to drive a LUN harder, you can increase this by using MetaLUNs, with the sum of the components providing the LUN’s queue depth. What we observed on the Pool LUN, however, was that this seemed to stay fixed at 88, regardless of the number of internal RAID Groups servicing the Pool LUN. This seems like it’s maybe a bad thing, but that’s probably why EMC quietly say that you should stick to traditional MetaLUNs and RAID Groups if you need particular performance characteristics.
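
To put some rough numbers on that, using the formula above: a MetaLUN striped across four RAID 5 (4+1) components should present a combined queue depth in the order of 4 x 88 = 352, which goes some way to explaining why MetaLUNs are the tool of choice when you need to drive a LUN harder.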

So what’s the point I’m trying to get at? Storage Pools and FAST-VP are awesome for the majority of workloads, but sometimes you need to use more traditional methods to get what you want. Which is why I spent last weekend using the LUN Migration tool to move 100TB of blocks around the array to get back to the traditional RAID Group / MetaLUN model. Feel free to tell me if you think I’ve gotten this arse-backwards too, because I really want to believe that I have.

EMC – Unisphere Dashboard

I went away for a few days this week and came back to a Unisphere dashboard that looked slightly, er, funky.