EMC – DIY Heatmaps – Updated Version

Mat has updated the DIY Heatmaps script to support SAS-type Flash drives. Download it from here, take it for a spin and let us know what you think. And tell your friends.

EMC – CLARiiON / VNX Disk Addresses

Mat came across this about 9 months ago, and I’d forgotten to post it here. I don’t know when it happened, or whether it’s always been like this, but EMC have apparently used letters to reference disks in VNX and CLARiiON systems for a while now. It looks something like this:

Disk number:    0  1  2  3  4  5  6  7  8  9  10  11  12  13  14
Letter address: 0  1  2  3  4  5  6  7  8  9  A   B   C   D   E

What we can’t work out is what it looks like when you’ve got a 60-disk DAE in play. In any case, I verified that it works with naviseccli on our CX4 arrays.

C:\Users\penguinpunk>naviseccli -h arraysp1 getdisk 1_3_14
Bus 1 Enclosure 3 Disk 14
Vendor Id: HITACHI
Product Id: HUS15606 CLAR600
Product Revision: C3A8
Lun: Unbound
Type: N/A
State: Enabled
Hot Spare: NO
Prct Rebuilt: Unbound
Prct Bound: Unbound

[snip]

Current Speed: 4Gbps
Maximum Speed: 4Gbps

And here's the same command, using E instead of 14.


C:\Users\penguinpunk>naviseccli -h arraysp1 getdisk 1_3_E
Bus 1 Enclosure 3 Disk 14
Vendor Id: HITACHI
Product Id: HUS15606 CLAR600
Product Revision: C3A8
Lun: Unbound
Type: N/A
State: Enabled
Hot Spare: NO
Prct Rebuilt: Unbound
Prct Bound: Unbound

[snip]

Current Speed: 4Gbps
Maximum Speed: 4Gbps
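The number-to-letter mapping can be expressed as a small helper. This is my own sketch for illustration, not anything EMC ships:

```python
# Map CLARiiON/VNX disk slot numbers 10-14 to their letter aliases A-E.
# Slots 0-9 are unchanged. Illustrative helper only, valid for a 15-disk DAE.

def slot_to_letter(slot: int) -> str:
    """Return the letter-style address for a disk slot (10 -> 'A', 14 -> 'E')."""
    if 0 <= slot <= 9:
        return str(slot)
    if 10 <= slot <= 14:
        return chr(ord('A') + slot - 10)
    raise ValueError(f"slot {slot} is outside the 0-14 range of a 15-disk DAE")

def letter_to_slot(addr: str) -> int:
    """Accept either form and return the numeric slot."""
    if addr.isdigit():
        return int(addr)
    return 10 + ord(addr.upper()) - ord('A')

# '1_3_14' and '1_3_E' refer to the same disk:
print(slot_to_letter(14))   # E
print(letter_to_slot('E'))  # 14
```

As the getdisk output above shows, naviseccli accepts either form and reports the numeric slot in both cases.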

EMC – DIY Heatmaps – Updated Version

Mat has updated the DIY Heatmaps for EMC CLARiiON and VNX arrays to version 4.01. You can get it from the Utilities page. Any and all feedback welcome.

Updates and Changes to the script

  • Added database storage/retrieval for performance stats. The database will be approximately 2.1 times the size of the NAR files, based on the default interval of 30 minutes. On my PC it took a bit over 9 hours to process 64 NAR files (1.95GB) into a 4.18GB database; running the script over the database to produce a heatmap, however, takes only seconds.
  • Changed to use temporary tables for transitional data. This should slightly reduce the size of the database file, as the temporary data is not written to disk.
  • Changed the way the script processes multiple NAR files. Previously the script bunched all NAR files into a single naviseccli process, which was problematic when processing multiple large NAR files; the script now processes them one at a time.
  • Added command line options:

--output_db          Output the processed NAR file to the nominated database

--input_db           Use the nominated database as the source of data for the heatmap

--s_date             Specify a start date/time in the format "mm/dd/yyyy hh:mm:ss" (quote the value if specifying both date and time)

--e_date             Specify an end date/time in the same format

--retrieve_all_nar   When retrieving NAR files from the array, retrieve all NAR files (it won't overwrite files already downloaded)

--process_only_new   If you are downloading NAR files, only process files that haven't been downloaded previously

--max_nar_files      Set the maximum number of files to download and process
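Since --s_date/--e_date are picky about their format, it can be worth validating the value before invoking the script. A small standalone check (my own sketch, not part of Mat's script; the example date is arbitrary):

```python
from datetime import datetime

# The script expects "mm/dd/yyyy hh:mm:ss" (quoted when a time is included).
# This mirrors that format for pre-flight validation; it is not the script's own code.
DATE_FMT = "%m/%d/%Y %H:%M:%S"

def parse_script_date(value: str) -> datetime:
    """Parse an --s_date / --e_date value, falling back to date-only input."""
    try:
        return datetime.strptime(value, DATE_FMT)
    except ValueError:
        return datetime.strptime(value, "%m/%d/%Y")

print(parse_script_date("03/25/2013 09:30:00"))
```

If the value doesn't match either form, strptime raises a ValueError, which is exactly the feedback you want before a 9-hour processing run.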

 

Please let us know if you find any bugs or problems with the script, or if you have any further suggestions for changes and enhancements.

Thanks

Mat.

EMC – DIY Heatmaps – Updated Version

Mat has updated the DIY Heatmaps for EMC CLARiiON and VNX arrays to version 3.0211. You can get it from the Utilities page here. Any and all feedback welcome. Changes below:

Added --min_colour, --mid_colour, --max_colour options (just a change of spelling of colour)

Removed case sensitivity for colours

Added FC SSD drive type

EMC – Using naviseccli to locate unused LUNs

I know I carry on like a bit of a pork chop when it comes to naviseccli – but I think some of the adulation is worth it. We’ve been trying to “reclaim storage” for a little while now (that’s code for not being allowed to spend money on additional capacity), and I’ve been looking for unused LUNs that may have been decommissioned and not unbound. To do this with naviseccli, use the getunusedluns command.

C:\Users\dan>naviseccli -h SP_IPaddress getunusedluns
RAID GROUP: 43
DC2_SAN_R5_ABCD_0923
RAID GROUP: 0
LUN 0 - VAULT - DO NOT USE

The output is fairly straightforward – you’ll get the RAID Group, followed by a list of LUN IDs that aren’t in use either by storage groups or replication features of the array (so you don’t accidentally destroy secondary images of mirrored LUNs, etc). If you’re happy with the list, you can go ahead and unbind those LUNs.
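If the array has a lot of RAID Groups, the output is easier to work with once it's structured. Here's a sketch of a parser for the output format shown above (mine, purely illustrative):

```python
# Parse naviseccli getunusedluns output into {raid_group_id: [lun lines]}.
# The format is taken from the example output above; illustrative only.
def parse_unused_luns(output: str) -> dict:
    groups, current = {}, None
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("RAID GROUP:"):
            current = int(line.split(":")[1])
            groups[current] = []
        elif current is not None:
            groups[current].append(line)
    return groups

sample = """RAID GROUP: 43
DC2_SAN_R5_ABCD_0923
RAID GROUP: 0
LUN 0 - VAULT - DO NOT USE"""

print(parse_unused_luns(sample))
```

A structured result also makes it easy to filter out anything you'd never want to unbind, such as the vault LUNs in RAID Group 0.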

One interesting side-effect of using this command is that you'll get a 71058233 error ('Failing Command: Set LUN') in the SP's event logs. There's a KB article on Powerlink (emc170736) for those of you who have access. If you don't, here's the root cause – "The getunusedluns command tries to determine if a LUN is unbound by issuing commands to the LUN to unload features (such as mirroring). If they cannot be unbound (by being unloaded), the LUN is not considered unused and these error messages show up in the log file." So it's polite to let the Ops team know before you run this command.

 

EMC – CX4 FAST Cache cosmetic issues and using /debug

I noticed that one of our CX4s was exhibiting some odd behaviour the other day. When looking at the System Information window, I noticed that FAST Cache seemed broken. Here’s a picture of it.

Going to the FAST Cache tab on System Properties yielded the same result, as did the output of naviseccli (using naviseccli -h IPaddress cache -fast -info). Interestingly, though, it was still showing up with dirty pages.

We tried recreating it, but the 8 * 100GB EFDs we were using for FAST Cache weren’t available. So we logged a call, and after a bit of back and forth with support, worked out how to fix it. A few things to note first though. If support tell you that FAST Cache can’t be used because you’re using EFDs, not SSDs, ask to have the call escalated. Secondly, the solution I’m showing here fixes the specific problem we had. If you frig around with the tool you may end up causing yourself more pain than it’s worth.

So, to fix the problem we had, we needed to log in to the /debug page on the CX4. To do this, go to http://<yourSPaddress>/debug.

You’ll need your Navisphere or LDAP credentials to gain access. Once you’ve logged in, the page should look something like the following (paying particular attention to the warning).

Now scroll down until you get to "Force A Full Poll". Click on that and wait a little while.

Once this is done, you can log back into Unisphere and FAST Cache should look normal again.

Hooray!

EMC – DIY Heatmaps – Updated Version

Mat has updated the DIY Heatmaps for EMC CLARiiON and VNX arrays to version 3.021. You can get it from the Utilities page here. Any and all feedback welcome. Changes below:

Add command line options:

 

--min_color --mid_color --max_color

To allow the user to select different color schemes for their heatmap graphs. The available colors to choose from are (red, green, blue, yellow, cyan, magenta, purple, orange, black, white)

 

--steps

Changes the granularity of the heatmap steps. For example, on an attribute like % Utilization, if steps is set to 20 there will be different color bands for 0-4%, 5-9%, 10-14%, etc. The default is 10, so the color bands are 0-9%, 10-19%, 20-29%, etc.
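The band arithmetic works out like this (my own sketch, not code from the heatmap script):

```python
# Given --steps, work out which color band a % Utilization value lands in.
# With steps=20 the bands are 0-4%, 5-9%, ...; with the default 10 they are
# 0-9%, 10-19%, and so on. Illustrative only.
def band_for(value: float, steps: int = 10) -> int:
    width = 100 / steps
    return min(int(value // width), steps - 1)

print(band_for(4, steps=20))  # 0  (the 0-4% band)
print(band_for(5, steps=20))  # 1  (the 5-9% band)
print(band_for(95))           # 9  (the top band at the default 10 steps)
```

Higher steps values give finer-grained bands at the cost of a busier legend.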

 

--detail_data

This option displays a detailed heat graph for an object over time when it is selected. For example, selecting the SP-B heatmap object produces a heat graph for that object over the duration of the NAR file. Thanks to Ian for the idea and code behind this.

There have been some other script improvements:

Added exit code checking after running naviseccli

Browser compatibility fixes – mainly with Chrome, but these should improve display consistency across different browser platforms

EMC – Using naviseccli to change the LUN Migration priority of a LUN

I am currently working through a metric shitload of LUN migrations on one of our CX4 arrays, with the end-goal being a replacement of 60 or so 300GB FC disks with 60 600GB FC disks. This is the sort of stuff you have to do when you’re not allowed to buy new arrays (budget notwithstanding). But enough moaning.

I've set up a number of LUN migrations to get this done, but Mat says I shouldn't run them at a high priority during business hours. I say if the spurs fit. But whatever. With naviseccli, you can list all of the LUNs that are currently migrating on the array. Once you have this list, you can grep for the LUN IDs and modify the migration priority depending on whether it's during or after hours.

C:\Users\dan>naviseccli -h SPA migrate -list
Source LU Name: LUN 7010
Source LU ID: 7010
Dest LU Name: LUN 6
Dest LU ID: 6
Migration Rate: HIGH
Current State: MIGRATING
Percent Complete: 89
Time Remaining: 24 minute(s)

Source LU Name: LUN 7018
Source LU ID: 7018
Dest LU Name: LUN 8
Dest LU ID: 8
Migration Rate: HIGH
Current State: MIGRATING
Percent Complete: 90
Time Remaining: 20 minute(s)

Source LU Name: LUN 600
Source LU ID: 600
Dest LU Name: LUN 1
Dest LU ID: 1
Migration Rate: MEDIUM
Current State: MIGRATING
Percent Complete: 44
Time Remaining: 8 hour(s) 23 minute(s)
[snip]

 

So let's say I want to change the migration priority for LUN ID 7018. All I need to do is use the -modify command, specify the correct source ID, and set the rate to low | medium | high | asap. I'm using -o to avoid prompting.

C:\Users\dan>naviseccli -h SPA migrate -modify -source 7018 -rate medium -o

Now I can use the -list switch to verify my work. Note that it takes the array a little while to calculate the new estimate for Time Remaining, so if you really want to know that, give it a minute or so.

C:\Users\dan>naviseccli -h SPA migrate -list
Source LU Name: LUN 7010
Source LU ID: 7010
Dest LU Name: LUN 6
Dest LU ID: 6
Migration Rate: HIGH
Current State: MIGRATING
Percent Complete: 90
Time Remaining: 22 minute(s)

Source LU Name: LUN 7018
Source LU ID: 7018
Dest LU Name: LUN 8
Dest LU ID: 8
Migration Rate: MEDIUM
Current State: MIGRATING
Percent Complete: 91
Time Remaining: ?

Source LU Name: LUN 600
Source LU ID: 600
Dest LU Name: LUN 1
Dest LU ID: 1
Migration Rate: MEDIUM
Current State: MIGRATING
Percent Complete: 44
Time Remaining: 8 hour(s) 23 minute(s)
[snip]

C:\Users\dan>

And that’s pretty much it.
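If you'd rather not eyeball the -list output for LUNs still running at HIGH, parsing it is straightforward. A sketch of mine (the parser mirrors the output format shown above; it is not an EMC tool):

```python
# Parse `migrate -list` output into per-migration dicts, so the HIGH-rate
# migrations can be picked out and fed to `migrate -modify`.
def parse_migrations(output: str) -> list:
    migrations, current = [], {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            if current:
                migrations.append(current)
                current = {}
            continue
        if ":" in line:
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    if current:
        migrations.append(current)
    return migrations

sample = """Source LU Name: LUN 7018
Source LU ID: 7018
Dest LU Name: LUN 8
Dest LU ID: 8
Migration Rate: HIGH
Current State: MIGRATING
Percent Complete: 90
Time Remaining: 20 minute(s)"""

high = [m["Source LU ID"] for m in parse_migrations(sample)
        if m.get("Migration Rate") == "HIGH"]
print(high)  # source IDs to pass to `migrate -modify ... -rate medium -o`
```

Run at the start and end of business hours, something like this gets you a ready-made list of source IDs for the -modify command.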

EMC – Manipulating the Write Intent Log LUNs with naviseccli

Last week Mat wanted to migrate some WIL LUNs on one of our CX4s to a different RAID Group. Unfortunately, one does not simply migrate WIL LUNs. Firstly, once they're allocated, they're treated as Private LUNs by FLARE, so you can't use Virtual LUN technology to migrate them. Secondly, if you have active MirrorView mirrors in place, you'll get the following error when trying to de-allocate the WIL LUNs via Unisphere: "Unable to deallocate WIL since in use (0x71058199)[0x71058199]". Note that fracturing the mirrors is not sufficient to avoid this error.

So there are a few things you need to do to resolve this. Firstly, confirm the WIL LUNs in use on the array (there are only two).

naviseccli -user User_ID -scope 0 -password xxxxxxxx -h 1.1.1.1 mirror -sync -listlog
Storage Processor: SP A
Lun Number: 10000
Storage Processor: SP B
Lun Number: 10001

Now list all the mirrors that are using the WIL. Note that this lists the primary and secondary mirrors, so you'll need to extract the primary mirrors from the list.

naviseccli -user User_ID -scope 0 -password xxxxxxxx -h 1.1.1.1 mirror -sync -list -usewriteintentlog
MirrorView Name: DC1_FC_R5_DATA_0002_SRM
Write Intent Log Used: YES
[snip]

Now the simplest thing to do is create a script that runs the command below for each of the mirrors in your list of primaries. This command disables the use of the WIL, and -o suppresses the confirmation prompt.

naviseccli -user User_ID -scope 0 -password xxxxxxxx -h 1.1.1.1 mirror -sync -change -name DC1_FC_R5_DATA_0002_SRM -usewriteintentlog no -o

Now that you’ve done that, you can allocate the new WIL LUNs.

naviseccli -user User_ID -scope 0 -password xxxxxxxx -h 1.1.1.1 mirror -sync -allocatelog -spA lun_ID1 -spB lun_ID2

Now run your script again, changing -usewriteintentlog from no to yes.

naviseccli -user User_ID -scope 0 -password xxxxxxxx -h 1.1.1.1 mirror -sync -change -name DC1_FC_R5_DATA_0002_SRM -usewriteintentlog yes -o

And now you can get rid of those old WIL LUNs. Huzzah!
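Since the same -change command has to be run once per primary mirror (first with no, later with yes), generating the command lines from the mirror list saves some copy-paste. A sketch of mine, using the same placeholder credentials and host as the examples above:

```python
# Build the per-mirror commands to toggle -usewriteintentlog, given mirror
# names parsed from `mirror -sync -list -usewriteintentlog`. The credentials
# and IP are the same placeholders used in the post; illustrative only.
BASE = "naviseccli -user User_ID -scope 0 -password xxxxxxxx -h 1.1.1.1"

def wil_commands(mirror_names, enable: bool):
    """Return one `mirror -sync -change` command line per mirror name."""
    flag = "yes" if enable else "no"
    return [f"{BASE} mirror -sync -change -name {name} "
            f"-usewriteintentlog {flag} -o" for name in mirror_names]

for cmd in wil_commands(["DC1_FC_R5_DATA_0002_SRM"], enable=False):
    print(cmd)
```

Generate the no variants before de-allocating the old WIL LUNs, then rerun with enable=True after the new ones are allocated.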

EMC – Check on a RAID Group’s defragmentation progress with naviseccli

For those of you still using RAID Groups on the CX4, you’ll know that you sometimes need to defragment them to get a contiguous amount of free space, particularly if you’ve unbound LUNs in the group that sit between other, remaining LUNs. If you kick off a whole bunch of defrags and then want a quick way to check the progress, why not use naviseccli? The getrg option is the one you want to use in this example.

naviseccli -h 256.256.256.256 getrg -prcntdf

The output will look something like this:

RaidGroup ID: 0
Percent defragmented: 100
RaidGroup ID: 1
Percent defragmented: 100
RaidGroup ID: 2
Percent defragmented: 100
RaidGroup ID: 3
Percent defragmented: 1
RaidGroup ID: 4
Percent defragmented: 1
RaidGroup ID: 5
Percent defragmented: 100
RaidGroup ID: 6
Percent defragmented: 2
RaidGroup ID: 7
Percent defragmented: 1
RaidGroup ID: 8
Percent defragmented: 100
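With a lot of RAID Groups on the go, it's handy to filter the output down to just the groups that haven't finished. A parser sketch of mine, based on the output format above:

```python
# Summarise `getrg -prcntdf` output: which RAID Groups are still
# defragmenting, and how far along they are. Illustrative only.
def incomplete_defrags(output: str) -> dict:
    result, rg = {}, None
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("RaidGroup ID:"):
            rg = int(line.split(":")[1])
        elif line.startswith("Percent defragmented:") and rg is not None:
            pct = int(line.split(":")[1])
            if pct < 100:
                result[rg] = pct
    return result

sample = """RaidGroup ID: 0
Percent defragmented: 100
RaidGroup ID: 3
Percent defragmented: 1
RaidGroup ID: 6
Percent defragmented: 2"""

print(incomplete_defrags(sample))  # {3: 1, 6: 2}
```

Pipe the naviseccli output into something like this and you get a one-line answer instead of scrolling through every group.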