CLARiiON Virtual LUN Technology and Layered Applications

I recently had one of those head-slapping moments where a ridiculously simple thing had me scratching my head and sifting through release notes to find out how to do something that I’d done many times before, but couldn’t get working this time. I have a client with some EMC CLARiiON AX4-5 arrays running full Navisphere and MirrorView/Asynchronous. He was running out of space on a NetWare fileserver LUN and urgently needed to expand the volume. As there was some space on another RAID Group, and he was limited by the size of the secondary image he could create on the DR array, we came up with a slightly larger size and I suggested using the LUN Migration tool to perform the migration. EMC calls this “Virtual LUN Technology”, and, well, it’s a pretty neat thing to have access to. I think it came in with FLARE Release 16 or 19, but I’m not entirely sure.
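If you haven’t driven it from the CLI before, kicking off a migration with naviseccli looks roughly like the sketch below. The SP address and LUN numbers are made up for illustration, so check everything against your own environment before copying anything.

    # Start migrating source LUN 20 onto destination LUN 21 at a low priority
    # (the SP address and LUN numbers here are purely illustrative)
    naviseccli -h 192.168.1.10 migrate -start -source 20 -dest 21 -rate low

    # Check on progress
    naviseccli -h 192.168.1.10 migrate -list

The nice part is that, once the copy finishes, the destination LUN takes on the identity of the source and the original gets unbound, so the host keeps seeing the same LUN, just bigger.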

In any case, we went through the usual steps of creating a larger LUN on the RAID Group and removing the MirrorView relationship, but, for some reason, we couldn’t see the newer, larger LUN as a migration destination. I did some testing and found that we could migrate to a LUN that was the same size, but not a larger LUN. This was strange, as I thought we’d removed the MirrorView relationship and freed the LUN from any obligations it may have felt to the CLARiiON’s Layered Applications. To wit, the latest FLARE release notes refer to this limitation, which also applies to the CX4 – “If a layered application is using the source LUN, the source LUN can be migrated only to a LUN of the same size”. What I didn’t realise, until I’d spent a few hours on this, was that the SAN Copy sessions I’d created originally to migrate the LUNs from the CX200 to the AX4-5 were still there. Even though they weren’t active (the CX200 is now long gone), Navisphere wasn’t too happy about the idea that the LUN in question would be bigger than it was originally. Removing the stale SAN Copy sessions allowed me to migrate the LUN to the larger destination, and from a NetWare perspective things went smoothly. Of course, recreating the secondary image on the DR array required a defrag of the RAID Group to make enough space for the larger image, but that’s a story for another time.
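For what it’s worth, checking for (and clearing out) leftover SAN Copy sessions from the CLI goes something like the following, from memory. The session name shown is made up for illustration, and you’ll want to be sure you’re removing the right session before you confirm anything.

    # List the SAN Copy sessions the array still knows about
    naviseccli -h 192.168.1.10 sancopy -info -all

    # Remove the stale session (session name below is hypothetical)
    naviseccli -h 192.168.1.10 sancopy -remove -name cx200_lun20_copy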

What have I been doing? – Part 1

I recently had the “pleasure” of working on a project before Christmas that had a number of, er, interesting elements involved. During the initial scoping, the only things mentioned were two new arrays (with MirrorView/Asynchronous), a VMware ESX upgrade, and a few new ESX hosts. But here’s what there really was:

– 4 NetWare 6.5 hosts in 2 NCS clusters;
– An EMC CLARiiON CX200 (remember them?) hosting a large amount (around 5TB) of NetWare and VMware data;
– A single McData switch running version 7 firmware;
– 2 new Dell hosts with CPUs incompatible with the existing 2950 hosts;
– A memory upgrade to the two existing nodes that meant one host had 20GB and the other had 28GB;
– A MirrorView target full of 1TB SATA-II spindles;
– A DR target with only one switch;
– Singly-attached (i.e. one HBA) hosts everywhere;
– An esXpress installation that needed to be upgraded / re-installed;
– A broken VUM implementation.

Hmmm, sound like fun? It kind of was, just because some of the things I had to do to get it to work were things I wouldn’t normally expect to do.  I don’t know whether this is such a good thing.  There’re a number of things that popped up during the project, each of which would benefit from dedicated blog posts.  But given that I’m fairly lazy, I think I’ll try and cram it all into one post.

Single switches and single HBAs are generally a bad idea

<rant> When I first started working on SANs about 10 minutes ago, I was taught that redundancy in a mid-range system is a good thing. The components that go into your average mid-range systems, while being a bit more reliable than your average gamedude’s gear, are still prone to failure. So you build a level of redundancy into the system such that when, for whatever reason, a component fails (such as a disk, fibre cable, switch or HBA), the system stays up and running. On good systems, the only people who know there’s a failure are the service personnel called out to replace the broken component in question. On a cheapy system, like the one you keep the Marketing Department’s critical morning tea photos on, a few more people might know about it. Mid-range disk arrays can run into the tens and hundreds of thousands of dollars, so sometimes people think that they can save a bit of cash by cutting a few corners: for example, leaving the nodes with single HBAs, or having only one switch at the DR site, or using SATA as a replication target. But I would argue that, given you’re spending all of this cash on a decent mid-range array, why wouldn’t you do all you can to ensure it’s available all the time? Saying “My cluster provides the resiliency / We’re not that mission critical / I needed to shave $10K off the price” strikes me as counter-intuitive to the goal of providing reliable, available and sustainable infrastructure solutions. </rant>

All that said, I do understand that sometimes the people making the purchasing decisions aren’t necessarily the best-equipped people to understand the distinction between single- and dual-attached hosts, and what good performance is all about. All I can suggest is that you start with a solid design, and do the best you can to keep that design through to deployment. So what should you be doing? For a simple FC deployment (let’s assume two switches, one array, two HBAs per host), how about something like this?

Zoning

Notice that there’s no connection between the two FC switches here. That’s right kids, you don’t want to merge these fabrics. The idea is that if you munt the config on one switch, it won’t automatically pass that muntedness on to the peer switch. This is a good thing if you, like me, like to do zoning from the CLI but occasionally forget to check the syntax and spelling before you make changes. And for the IBM guys playing at home, the “double redundant loops” excuse doesn’t apply to the CLARiiON. So do yourself a favour, and give yourself 4 paths to the box, dammit!
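For the curious, here’s roughly what single-initiator zoning looks like from the CLI. I’ve used Brocade-style syntax purely because it’s what I can type from memory (the McData commands differ), and the aliases and WWPNs below are made up for illustration.

    # Fabric A (switch 1): zone the first HBA of esx01 to one port on each SP
    alicreate "esx01_hba0", "10:00:00:00:c9:aa:bb:01"
    alicreate "ax4_spa0",   "50:06:01:60:41:e0:aa:01"
    alicreate "ax4_spb0",   "50:06:01:68:41:e0:aa:01"
    zonecreate "esx01_hba0__ax4_spa0", "esx01_hba0; ax4_spa0"
    zonecreate "esx01_hba0__ax4_spb0", "esx01_hba0; ax4_spb0"
    cfgcreate "fabric_a_cfg", "esx01_hba0__ax4_spa0; esx01_hba0__ax4_spb0"
    cfgsave
    cfgenable "fabric_a_cfg"

Do the same on the second switch with the other HBA and the other SP ports, and resist the urge to cable the two switches together. Two separate fabrics, each HBA zoned to both SPs, and you get your four paths to the box without ever merging the fabrics.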

And don’t listen to Apple and get all excited about just using one switch either – not that they’re necessarily saying that, of course … Or that they’re necessarily saying anything much at all about storage any more, unless Time Capsules count as storage. But I digress …

EMC CLARiiON AX4-5

Jesse was mildly annoyed about being stuck on “low-end” CLARiiON projects recently. Well, I’m lucky enough to have been scraping a little lower down the barrel and have now deployed my first AX4-5. It’s branded as a CLARiiON, but just like its predecessors (the AX100 and AX150), it’s a little, er, different to your average CX or CX3. For a good overview of the tech specs and howtos, etc, have a look at the support site and click on “Learn”. I have only had the opportunity to initialise the array and haven’t loaded it up yet. The Navisphere Express interface is different to full Navisphere, but you can upgrade that if need be.

There are a few things to look out for when you configure one of these arrays.

Firstly, it runs a variation of FLARE (Release 23 for those keeping score at home), and the first 4 disks (as opposed to the first 5 in the CX and CX3) are used for this code and other array features. This means that, when you configure a RAID Group (or Disk Pool) using these disks, you’ll lose approximately 17GB per spindle. It’s not as bad as the 33GB on the CX3, but it can still pooch your storage calculations, especially when the type of customer deploying this array has little exposure to storage anyway, and might think that 1TB equals 1024GB (there’s a rough worked example of the maths a little further down).

Secondly, the hot spare needs to be on a disk other than a FLARE disk. This is nothing new, but again, these kinds of deployments aren’t always done with these kinds of constraints in mind. I have a bad feeling I’ll be doing a deployment shortly on a 4-disk SATA shelf that was sold without any hot spare at all.

Thirdly, the only way you can assign LUNs (Virtual Disks) to different SPs is via Disk Pool membership. The manual suggests that you make at least 2 Disk Pools to balance the load across the SPs. Sounds good, except when you’re working on a 6-disk (total) deployment, one of which is already a hot spare. Short of doing something nasty with 2 RAID 1/0 pools (and I can’t afford the capacity penalty), that wasn’t going to happen. Watch out for storage designs that only look at capacity.

Finally, if you’re getting it with SAS disks, make sure the “AX4-5 Expansion Pack” is ordered, as this allows you to actually use the SAS disks, rather than just look at them. A colleague of mine had to wait while this was shipped before he could finish a deployment. The job I did recently had the enabler pre-installed (I guess they rectified that issue). By the way, it looks like it’s manufactured by Foxconn (judging by the big Foxconn label on the box and this news item). If nothing else, Foxconn’s press releases are hilarious.
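And since I mentioned it above, here’s the back-of-the-envelope arithmetic on why a “1TB” FLARE spindle disappoints people. The figures are approximate (the exact usable number varies by drive and FLARE revision), but it gives you the idea.

    # A "1TB" drive is 1,000,000,000,000 bytes, which the OS reports in binary GB
    echo "scale=1; 1000000000000 / (1024^3)" | bc    # ~931.3 "GB" before anything else
    # Take off roughly 17GB of FLARE overhead on the first 4 AX4-5 spindles
    echo "scale=1; 931.3 - 17" | bc                  # ~914.3 per vault spindle, before RAID overhead

Multiply that across a small Disk Pool and apply the RAID penalty, and the gap between what the customer thinks they bought and what they can actually use gets awkward fast.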

Oh yeah, and we’re throwing a 4-node VI3 cluster and 18 guests at this little puppy. I can’t wait to watch it melt.