HP MSA array failover

I’ve blogged briefly about the MSA array before, thinking it was a reasonable piece of kit for the price, assuming your expectations were low. But I had a problem recently with a particular MSA2012fc and don’t know whether I’ve got it right or whether I’m missing something fundamental.

I had it set up in a DAS configuration. Interconnect was turned on, and loop was the default topology in place. This worked fine for the 2 RHEL boxes attached to the array. Later I connected the array and 2 hosts to 2 Brocade 300 switches with discrete fabrics. I changed the topology to point-to-point and changed from interconnect to straight-through. This seemed like a reasonable thing to do based on my understanding of the admin, user and reference guides.
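For reference, the interconnect half of that change can be made from the CLI. This is only a sketch from memory, so treat the command names as assumptions and verify them against the CLI Reference Guide for your firmware revision:

# show host-port-interconnects
# set host-port-interconnects disabled

The loop / point-to-point topology itself is a per-host-port setting in the host port configuration.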

In a switched / straight-through / point-to-point configuration, LUNs on a vdisk owned by controller A are only presented via paths from controller A. If controller A fails, I don’t believe the vdisk fails over to controller B. If a cable or switch fails, however, you’re covered, because each controller is cabled to each fabric. I believe this is why I saw two paths to everything: the two fibre ports of the controller owning the vdisk that owns the LUN.

In a direct-attach / interconnect / loop setup, controllers mirror their peer’s LUNs via the higher ports, so controller A presents paths to controller B’s LUNs via A1. In this setup you could sustain a controller failure, as the vdisk would still be presented via the surviving peer. The problem, however, is that interconnect is never used in a switched environment. I don’t believe changing the ports back to loop would help, nor would removing the switches.
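If you want to check this from the RHEL side, a quick way is something like the following. Just a sketch: multipath -ll assumes device-mapper-multipath is already set up, and the sysfs layout is as it appears on RHEL 5 with the qla2xxx driver:

# multipath -ll
# for r in /sys/class/fc_remote_ports/rport-*; do echo "$r: $(cat $r/port_name)"; done

The first shows how many paths each volume really has; the second lists the WWPNs of the MSA host ports each HBA is logged in to, which makes it obvious whether you’re seeing ports from one controller or from both.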

Have I totally missed the point here? Has anyone else seen this? Was there a workaround? Or something fixed in later revs of the code? It seems strange that HP would advertise this as an active-active array, but only for DAS configs.

HP MSA2012fc and Linux

Sometimes, I can be a real muppet. I had to install a new MSA2012fc array at a site yesterday, with a few extra MSA2000 expansion shelves. The FC switches hadn’t arrived yet, so we went with a direct-attach configuration. The hosts were x64 RHEL 5 U2 with dual-port Qlogic HBAs. Reasonably simple stuff, my main concern being that the LUN design discussion with the DBAs would never end. So I initialized the array, set up some IP addresses, security and vdisks, mapped the host ports and created some aliases, and presented some test LUNs to the Linux hosts to confirm we could see the volumes. I was about to bail when the customer said “so what if I pull this cable?”. “Well, it should fail over”. But it didn’t. Some teeth grinding and fiddling yielded little result, and we decided to revisit the issue today.

Today went a lot better. For a start, I used the Qlogic modprobe.conf settings specified in the installation document for the “Device Mapper Multipath Enablement Kit for HP StorageWorks Disk Arrays v4.1.0”. This seems like a reasonable thing, as I had installed version 4.1.0 on the hosts. But yesterday, like some kind of idiot, I’d been using the modprobe settings from the “Installation and Reference Guide Device Mapper Multipath Enablement Kit for HP StorageWorks Disk Arrays Version 4.0.0”. That was the document I’d found buried somewhere on HP’s website, not the document included with the tarball. Notice the critical difference yet? I’ll elaborate:

For version 4.0.0, you need to set the following options in modprobe.conf for Qlogic HBAs:

options qla2xxx qlport_down_retry=10 ql2xfailover=0

For version 4.1.0, the story changes slightly:

options qla2xxx ql2xmaxqdepth=16 qlport_down_retry=10 ql2xloginretrycount=30 ql2xfailover=0 ql2xlbType=1 ql2xautorestore=0xa0 ConfigRequired=0

Look different? Uh-huh. Different enough that, even though I had enabled the MSA’s Host Port Interconnects as mentioned in the Reference Guide, I wasn’t able to see the volumes presented to the host via a different port. So instead of the 2 paths to each volume I should have been seeing, I was only seeing one. It’s the simple errors that lead to hours wasted.
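One gotcha worth spelling out: the options in /etc/modprobe.conf only take effect when the qla2xxx module is next loaded, so editing the file by itself changes nothing. A rough sequence for applying them (just a sketch, assuming a stock RHEL 5 install you’re free to reboot and an initrd that loads qla2xxx):

# mkinitrd -f /boot/initrd-$(uname -r).img $(uname -r)
# reboot

Once the host is back up, multipath -ll should show the 2 paths to each volume. If you can afford to drop the paths instead of rebooting, unloading and reloading qla2xxx achieves the same thing.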

Incidentally, the CLI is useful if you want to change the default IP addresses from 10.0.0.2 and 10.0.0.3 to something more sensible. I recommend using the provided serial cable and issuing the following commands:

# show network-parameters

Network Parameters Controller A
-------------------------------
IP Address     : 10.0.0.2
Gateway        : 10.0.0.1
Subnet Mask    : 255.255.255.0
MAC Address    : 00:C0:FF:D5:FD:4E
Addressing Mode: DHCP

Network Parameters Controller B
-------------------------------
IP Address     : 10.0.0.3
Gateway        : 10.0.0.1
Subnet Mask    : 255.255.255.0
MAC Address    : 00:C0:FF:D7:02:52
Addressing Mode: DHCP

You can then use set network-parameters to set the IP addresses for each controller as follows:

# set network-parameters ip 192.168.2.50 netmask 255.255.255.0 gateway 192.168.2.254 controller a
Success: Network parameters have been changed
# set network-parameters ip 192.168.2.51 netmask 255.255.255.0 gateway 192.168.2.254 controller b
Success: Network parameters have been changed
#

You can then log in and use HP’s Storage Management Utility (SMU). This is a pretty intuitive interface and a lot easier to navigate than Sun’s CAM.