EMC – VNX Pool LUN Allocation vs Default Owner

I had a question about this come up this week and thought I’d already posted something about it. Seems I was lazy and didn’t. If you have access, have a look at Primus article emc311319 on EMC’s support site. If you don’t, here’s the rough guide to what it’s all about.

When a Storage Pool is created, a large number of private LUNs are bound across all of the Pool drives, and these are divided up between SP A and SP B. When a Pool LUN is created, its allocation owner determines which SP’s private LUNs are used to store the Pool LUN slices. If the default and current owner are not the same as the allocation owner, I/O has to be passed over the CMI bus between the SPs to reach the Pool private FLARE LUNs. This is a bad thing, and can lead to higher response times and general I/O bottlenecks.
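
If you want to check whether you’re affected, the Pool LUN properties in Unisphere show the current and default owners, and on later releases the allocation owner as well. From the CLI, something along these lines should do it (treat this as a sketch; the exact fields in the output vary a little between FLARE / VNX OE releases):

naviseccli -h <SP A or B> lun -list -l <lun_number>

Compare the Current Owner, Default Owner and Allocation Owner fields; if they don’t all point at the same SP, you have the problem described above.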

OMG, I might have this issue, what should I do? You can change the default owner of a LUN by accessing the LUN properties in Unisphere, or you can do it from the command line thusly.

naviseccli -h <SP A or B> chglun -l <lun_number> -d owner <0|1>

where

-d owner 0 = SP A
-d owner 1 = SP B

But what if you have too many LUNs whose allocation owner sits on one SP? And when did I start writing blog posts in the form of a series of questions? I don’t know the answer to the latter question. But for the first, the simplest remedy is to create a new LUN owned by the alternate SP and use EMC’s LUN migration tool to move the data across to it. Finally, to match the current owner of a LUN to the default owner, simply trespass the LUN to the default owner SP.
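
If you’d rather drive that from the CLI, the migration and subsequent trespass look something like the following. The LUN numbers and migration rate are just examples, so sanity-check the syntax against your CLI version before letting it loose:

naviseccli -h <SP A or B> migrate -start -source <source_lun_number> -dest <destination_lun_number> -rate low

naviseccli -h <SP A or B> trespass lun <lun_number>

You can keep an eye on progress with migrate -list. If memory serves, the trespass sends the LUN to the SP you point the command at, so run it against the SP that should own the LUN.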

Note that this affects everything from CX4 arrays through to VNX2 arrays. It does not apply to traditional RAID Group FLARE LUNs though, only Pool LUNs.

EMC – FAST Cache and LUN expansion or shrink operations

Someone on Twitter asked me about a white paper they were reading on the EMC site recently that suggested that LUN expansion or shrink operations would require that FAST Cache be disabled. The white paper in question is located here. For those of you loitering on Powerlink, the EMC part number is h8046.7. In any case, on page 8 it covers a number of requirements for using FAST Cache – most of which seem fairly reasonable. However, this one kind of got my attention (once my attention was drawn to it by @andrewhatfield): “Once FAST Cache has been created, expansion or shrink operations require disabling the cache and re-creating the FAST Cache”. Wow. So if I want to do a LUN expansion I need to delete and re-create FAST Cache once it’s complete? Seriously? I informally confirmed this with my local Account TC as well.

It takes a while to create FAST Cache on a busy VNX. It takes even longer to disable it on a busy system. What a lot of piss-farting around to do something which used to be a fairly trivial operation (the expansion I mean). Now, I’ll be straight with you, I haven’t had the opportunity to test what happens if I don’t disable FAST Cache before I perform these operations. Knowing my luck the damn thing will catch on fire. But it’s worth checking this document out before you pull the trigger on FAST Cache.
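
For what it’s worth, if you do end up having to cycle FAST Cache, the CLI side of it is pretty simple even if the flush isn’t quick. Here’s a rough sketch, assuming four EFDs in the first enclosure; your disk list, mode and RAID type options will obviously differ, so check the syntax for your release first:

naviseccli -h <SP A or B> cache -fast -info

naviseccli -h <SP A or B> cache -fast -destroy

naviseccli -h <SP A or B> cache -fast -create -disks 0_0_4 0_0_5 0_0_6 0_0_7 -mode rw -rtype r_1

The destroy has to flush the dirty pages back to the underlying LUNs before it completes, which is why it takes so long on a busy system.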

[Edit: Or maybe they mean if you want to expand or shrink the FAST Cache? Because that makes sense. I hope that’s what they mean.]

[Edit #2: Joe (@virtualtacit) kindly clarified that this requirement relates to the shrinking or expansion of FAST Cache, not LUNs. My bad! Nothing to see here, move along :)]

EMC – Boot from SAN MSCS Cluster configuration

Disclaimer: I haven’t done a Windows-based CLARiiON host-attach activity in about 4 or 5 years. And it’s been a bit longer than that since I did boot from SAN configurations. So you can make of this what you will. We’ve been building a Windows 2008 R2 Boot from SAN cluster lately. We got to the point where we were ready to add the 60+ LUNs that the cluster would use. The initial configuration had 3 hosts in 3 Storage Groups with their respective boot LUNs. I had initially thought that I’d just create another Storage Group for the cluster’s volumes and add the 3 hosts to that. All the time I was trying to remember the rule about multiple hosts or multiple LUNs in a Storage Group. And of course I remembered incorrectly. The rule, for the record, is that a host can only belong to one Storage Group per array, so I couldn’t just drop the three hosts into a second group.

To get around this, I had to add each LUN (there are about 67 of them) to each cluster node’s Storage Group, and ensure that they had consistent Host LUN IDs (HLUs) across the Storage Groups. Which has worked fine, but isn’t, as Unisphere points out, recommended. There’s also a limit on the number of LUNs I can put in a Consistency Group (32) – but that’s a story for another time.
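
Scripting this with naviseccli beats clicking through Unisphere 67 times per node. The command looks roughly like this (the names and numbers are placeholders; the important bit is keeping the -hlu value for a given LUN consistent across all of the cluster’s Storage Groups):

naviseccli -h <SP A or B> storagegroup -addhlu -gname <storage_group_name> -hlu <host_lun_id> -alu <array_lun_number>

where -hlu is the LUN number the host sees and -alu is the LUN number on the array.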

EMC Unisphere – Basics – Part 2 – Configure Hypervisor Information

I’ve added a new article to the articles page that discusses the steps required to configure EMC’s Unisphere with Hypervisor Information, which helps with LUN-to-VM and VM-to-LUN mappings. If you don’t run hypervisors then I guess this article won’t mean that much to you.

CLARiiON Hot Spare Rebuild Progress and naviseccli

We’re upgrading our CX4-960s to FLARE 30 tonight and, after a slew (a slew being approximately equal to 4) of disk failures and replacements over the last few weeks, we’re still waiting for one of the SATA-II disks to rebuild. Fortunately, EMC has a handy knowledge base article entitled “What is a CLARiiON proactive hot spare?”, which covers how to use proactive hot spares on the CLARiiON. You can find it on Powerlink as emc150779. The side benefit of this article is that it provides details on how to query the rebuild status of hot spare disks and the RAID Groups they’re sparing for.

Using naviseccli, you can get the rebuild state of the disk thusly:

I:\>naviseccli -h SP_IP_address -user username -scope 0 getdisk 3_4_3 -state -rb

Enter password:

Bus 3 Enclosure 4 Disk 3

State: Equalizing

Prct Rebuilt: 241: 100 820: 100 7056: 100 275: 100 7460: 100 7462: 100 321: 100 341: 100 270: 100 250: 32

In this case, I wanted to query the status of the disk in Bus 3 / Enclosure 4 / Disk 3. The Prct Rebuilt output lists the LUNs involved and how far along each one is; as you can see from the above example, LUN 250 is at 32%. You can also see the status of the rebuild by looking at the properties of the LUN that is being equalized.
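
If you want to keep an eye on the progress without re-typing the command every few minutes, a quick and dirty batch file does the job. The interval and disk ID below are just examples, and this assumes you’ve set up a Navisphere CLI security file (naviseccli -AddUserSecurity) so you’re not prompted for the password on every run:

@echo off
rem poll the rebuild status of the equalizing disk every five minutes
:loop
naviseccli -h SP_IP_address getdisk 3_4_3 -state -rb
timeout /t 300
goto loop

Nothing clever about it, but it saves a bit of typing while you wait for the equalize to finish.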

So we should be done in time for the code upgrade. I’ll let you know how that works out for us.

By the numbers – Part 1

As I mentioned in the previous post, I’ve been working on a large data migration project. After a brief hiatus, I’m back at work, and thought I’d take a moment to share what I’ve done so far.

  • Attached 56 hosts via dual paths to the new array.
  • Created 234 new zonesets.
  • Created 23 Storage Groups.
  • Created 131 RAID Groups.
  • Added 26 hot spare disks.
  • Designed and provisioned 620 LUNs. This includes 52 4-component MetaLUNs.
  • Established 33 Incremental SAN Copy Sessions.

I don’t know how many sVMotions I’ve done so far, but it feels like a lot. I can’t exactly say how many TB I’ve moved yet either, but by the end we’ll have moved over 112TB of configured storage. Once I’ve finished this project – by the end of June this year – I’ll tally up the final numbers and make a chart or something.