Random Short Take #89

Welcome to Random Short Take #89. I’ve been somewhat preoccupied with the day job and acquisitions. And the start of the NBA season. But summer is almost here in the Antipodes. Let’s get random.

  • Jon Waite put out this article on how to deploy an automated Cassandra metrics cluster for VCD.
  • Chris Wahl wrote a great article on his thoughts on platform engineering as product design at scale. I’ve always found Chris to be a switched on chap, and his recent articles diving deeper into this topic have done nothing to change my mind.
  • Curtis and I have spoken about this previously, and he talks some more about the truth behind SaaS data recovery over at Gestalt IT. The only criticism I have for Curtis is that he’s just as much Mr Recovery as he is Mr Backup and he should have trademarked that too.
  • Would it be a Random Short Take without something from Chin-Fah? Probably not one worth reading. In this article he’s renovated his lab and documented the process of attaching TrueNAS iSCSI volumes to his Proxmox environment. I’m fortunate enough to not have had to do Linux iSCSI in some time, but it looks mildly easier than it used to be.
  • Press releases? Here’s one for you: Zerto research report finds companies lack a comprehensive ransomware strategy. Unlike the threat of World War 3 via nuclear strike in the eighties, ransomware is not a case of if, but when.
  • Hungry for more press releases? Datadobi is accelerating its channel momentum with StorageMAP.
  • In other PR news, Nyriad has unveiled its storage-as-a-service offering. I had a chance to speak to them recently, and they are doing some very cool stuff – worth checking out.
  • I hate all kinds of gambling, and I really hate sports gambling, and ads about it. And it drives me nuts when I see sports gambling ads in apps like NBA League Pass. So this news over at El Reg about the SBS offering consumers the chance to opt out of those kinds of ads is fantastic news. It doesn’t fix the problem, but it’s a step in the right direction.

EMC – naviseccli – checking your iSCSI ports are running at the correct speed

It’s been a while since I wrote about naviseccli and I admit I’ve missed it. I once wrote about using naviseccli to identify MirrorView ports on a CLARiiON array. Normally the MirrorView port is consistently located, but in that example we’d upgraded from a CX3-80 to a CX4-960 and it was in a different spot. Oh how we laughed when we realised what the problem was. Anyway, we’ve been doing some work on an ever so slightly more modern VNX5300 and needed to confirm that some newly installed iSCSI SLICs were operating at the correct speed. (Note that these commands were run from the Control Station).

The first step is to list the ports:

[nasadmin@NAS001 ~]$ navicli -h A_VNXSP connection -getport

SP:  A
Port ID:  8
Port WWN:  iqn.1992-04.com.emc:cx.cetv2223700017.a8
iSCSI Alias:  0017.a8
IP Address:  192.168.0.13
Subnet Mask:  255.255.255.0
Gateway Address:  192.168.0.254
Initiator Authentication:  false

SP:  A
Port ID:  9
Port WWN:  iqn.1992-04.com.emc:cx.cetv2223700017.a9
iSCSI Alias:  0017.a9

SP:  A
Port ID:  10
Port WWN:  iqn.1992-04.com.emc:cx.cetv2223700017.a10
iSCSI Alias:  017.a10

SP:  A
Port ID:  11
Port WWN:  iqn.1992-04.com.emc:cx.cetv2223700017.a11
iSCSI Alias:  017.a11

SP:  B
Port ID:  8
Port WWN:  iqn.1992-04.com.emc:cx.cetv2223700017.b8
iSCSI Alias:  0017.b8
IP Address:  192.168.0.14
Subnet Mask:  255.255.255.0
Gateway Address:  192.168.0.254
Initiator Authentication:  false

SP:  B
Port ID:  9
Port WWN:  iqn.1992-04.com.emc:cx.cetv2223700017.b9
iSCSI Alias:  0017.b9

SP:  B
Port ID:  10
Port WWN:  iqn.1992-04.com.emc:cx.cetv2223700017.b10
iSCSI Alias:  017.b10

SP:  B
Port ID:  11
Port WWN:  iqn.1992-04.com.emc:cx.cetv2223700017.b11
iSCSI Alias:  017.b11

Once you’ve done that, you can list the port speed for a particular port:

[nasadmin@NAS001 ~]$ navicli -h A_VNXSP connection -getport -sp a -portid 8 -speed
SP:  A
Port ID:  8
Port WWN:  iqn.1992-04.com.emc:cx.cetv2223700017.a8
iSCSI Alias:  0017.a8
IP Address:  192.168.0.13
Subnet Mask:  255.255.255.0
Gateway Address:  192.168.0.254
Initiator Authentication:  false
Port Speed:  1000 Mb
Auto-Negotiate:  Yes
Available Speeds:  10 Mb
-               :  100 Mb
-               :  1000 Mb
-               :  Auto

If you have a lot of ports to check, this may not be the most efficient way to do it (ioportconfig may be more sensible), but if your network team are reporting on one particular port being an issue – this is a great way to narrow it down.
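
And if you do want to check the whole lot in one hit, a quick and dirty loop from the Control Station will do the job. Note that this is just a rough sketch: it assumes the same A_VNXSP alias used above, and that the iSCSI ports of interest are 8 through 11 on each SP, so adjust to suit your environment.

# Report the current speed and auto-negotiate setting for each iSCSI port
for sp in a b; do
  for port in 8 9 10 11; do
    echo "=== SP ${sp} port ${port} ==="
    navicli -h A_VNXSP connection -getport -sp ${sp} -portid ${port} -speed | grep -E "Port Speed|Auto-Negotiate"
  done
done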

EMC – RecoverPoint 4.0

EMC recently made some announcements about RecoverPoint 4.0 amongst other things, and I thought it might be worthwhile briefly looking at what’s on offer. I don’t work for EMC, so if you have questions about how RP might work in your environment, or what you need to consider regarding upgrades, please contact your local EMC team.

Firstly, there’s a bunch of improvements with regards to configuration limits. Here are a few of them:

  • The number of consistency groups in a group set has been increased to 8.
  • The maximum number of replication sets per CG has been increased to 2048.
  • The maximum number of replication sets has been increased to 8192.
  • The maximum number of user volumes has increased to 16000.
  • The maximum replicated capacity per cluster has been increased to 2PB.

Secondly, multi-site allows both 4:1 fan in and 1:4 fan out.

Thirdly, and my favourite, is the Virtual RecoverPoint Appliance (vRPA). Here are some interesting things to note about the vRPA:

  • Uses iSCSI. So you’ll need iSCSI SLICs in your VNX. Which leads to the next point.
  • Only available for use in RP/SE configurations – so you’ll need VNX storage.
  • Can be used for remote synchronous replication, as RP 4.0 supports sync over IP (assuming links are good).
  • Can replicate any block data, regardless of host connectivity.

There are 4 different RP/SE configs that can be used:

  • vRPA -> vRPA
  • Physical RPA -> vRPA
  • vRPA -> Physical RPA
  • Physical RPA -> Physical RPA

Note that you cannot have vRPAs and Physical RPAs in the same cluster. The vRPAs are deployed using OVF, and come in 3 different flavours.

One final thing to note with RP 4.0 is that host and fabric splitters are not supported; only VMAX(e), VNX, CLARiiON and VPLEX splitters are supported with RP 4.0.
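
As an aside, I haven’t deployed a vRPA myself yet, but the OVF deployment should look much like any other virtual appliance, either via the vSphere Client or with something along the lines of the ovftool sketch below. Every name, path and address in it is made up for illustration, and you should check the RecoverPoint documentation for the actual resource requirements of the three flavours.

# Deploy the vRPA OVF to a cluster (all names and paths here are illustrative only)
ovftool --name=vRPA01 --datastore=VNX_DS01 --network="iSCSI-A" \
  vRPA.ovf vi://administrator@vcenter.example.com/DC01/host/Cluster01/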

New Article – Storage Design Principles

I’ve added another new article to the articles section of the blog. This one is basically a high level storage design principles document aimed at giving those not so familiar with midrange storage design a bit of background on some of the things to consider when designing for VNX. It’s really just a collection of notes from information available elsewhere on the internet, so make of it what you will. As always, your feedback is welcome.

Dell PowerConnect and Jumbo Frames

A friend of mine had a problem recently attaching some EqualLogic storage to some vSphere hosts using Dell PowerConnect switches. You’ll notice that it wasn’t me doing the work, so I’ve had to resort to reporting on other people doing interesting or not so interesting things. In any case, he was seeing a lot of flakiness whenever he tried to do anything with the new volumes on the ESX hosts. We went through the usual troubleshooting routine and discussed whether it was a problem with the ESX hosts (running the latest update of ESX 4) or something to do with the network.

He had enabled jumbo frames all the way through (host -> switch -> array). In vSphere, you set the packet size to 9000. On the EqualLogic PS Series you set the MTU to 9000. Apparently, on the Dell PowerConnect switches, you don’t. You set it to 9216. For those of you familiar with maths, 9216 is 9 * 1024. Amazing huh? Yes, that’s right, it follows that 9000 is 9 * 1000. Okay stop now. It’s amazing that 216 could make such a difference, but, er, I guess computers need a level of accuracy to do their thing.

console# configure
console(config)# interface range ethernet all
console(config-if)# mtu 9216
console(config-if)# exit
console(config)# exit
console# copy running-config startup-config
console# exit
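
For completeness, the vSphere side of the equation looks something like the following. This is a sketch only, assuming classic ESX 4 with service console access and a vSwitch and VMkernel port dedicated to iSCSI; the vSwitch name, port group name and IP address are examples rather than his actual configuration.

# Set the iSCSI vSwitch MTU to 9000 (vSwitch1 is just an example name)
esxcfg-vswitch -m 9000 vSwitch1
# Recreate the iSCSI VMkernel port with an MTU of 9000
esxcfg-vmknic -a -i 192.168.1.21 -n 255.255.255.0 -m 9000 iSCSI01
# Confirm the MTU settings took
esxcfg-vswitch -l
esxcfg-vmknic -l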

New article added to articles page

I’ve added a new article to the articles page. While I agree that a bunch of screenshots do not a great technical document make, I think this is a useful visual guide for the first timer. Oh yeah, it covers the basic initialisation process used to deploy Dell | EqualLogic PS5xx0 Series arrays using 3.x firmware. Sure, it might be a little dated. Sure, I started writing it last year some time and then left it flailing about for a while. Sure, I probably could have left it in my drafts queue forever. But I thought it would be nice to have something to refer back to that didn’t require logging in to the Dell website. You might find some of it useful too.

LVM settings with EMC MirrorView and VMware ESX

A few weekends ago I did some failover testing for a client using 2 EMC CLARiiON CX4-120 arrays, MirrorView/Asynchronous over iSCSI and a 2-node ESX cluster at each site. The primary goal of the exercise was to ensure that we could promote mirrors at the DR site if need be and run Virtual Machines off the replicas. Keep in mind that the client, at this stage, isn’t running SRM, just MV and ESX. I’ve read many, many articles about how MirrorView could be an awesome addition to the DR story, and in the past this has rung true for my clients running Windows hosts. But VMware ESX isn’t Windows, funnily enough, and since the client hadn’t put any production workloads on the clusters yet, we decided to run it through its paces to see how it worked outside of a lab environment.

One thing to consider, when using layered applications like SnapView or MirrorView with the CLARiiON, is that the LUNs generated by these applications are treated, rightly so, as replicas by the ESX hosts. This makes sense, of course, as the secondary image in a MirrorView relationship is a block-copy replica of the source LUN. As a result of this, there are rules in play for VMFS LUNs regarding what volumes can be presented to what, and how they’ll be treated by the host. There are variations on the LVM settings that can be configured on the ESX node. These are outlined here. Duncan of Yellow Bricks fame also discusses them here. Both of these articles are well written and explain clearly why you would take a particular approach and use particular LVM settings. However, what neither article addresses, at least clearly enough for my dumb arse, is what to do when what you see and what you expect to see are different things.

In short, we wanted to set the hosts to “State 3 – EnableResignature=0, DisallowSnapshotLUN=0”, because the hosts at the DR site had never seen the original LUNs before, nor did we want to go through and resignature the datastores at the failover site and have to put up with datastore volume labels that looked unsightly. Here are some pretty screenshots of what your Advanced – LVM settings might look like after you’ve done this.

[Screenshot: LVM Settings]

But we wanted it to look like this:

[Screenshot: LVM Settings]

However, when I set the LVM settings accordingly, admin-fractured the LUN, promoted the secondary and presented it to the failover ESX host, I was able to rescan and see the LUN, but was unable to see any data on the VMFS datastore. Cool. So we set the LVM settings to “State 2 – EnableResignature=1, (DisallowSnapshotLUN is not relevant)”, and were able to resignature the LUNs and see the data, register a virtual machine and boot okay. Okay, so why doesn’t State 3 give me the desired result? I still don’t know. But I do know that a call to a friendly IE at the local EMC office tipped me off to using the VI Client connected directly to the failover ESX host, rather than VirtualCenter. Lo and behold, this worked fine, and we were able to present the promoted replica, see the data, and register and boot the VMs at the failover site. I’m speculating that it’s something very obvious that I’ve missed here, but I’m also of the opinion that this should be mentioned in some of those shiny whitepapers and knowledge books that EMC like to put out promoting their solution. If someone wants to correct me, feel free to wade in at any time.
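
For reference, if you would rather flip these settings from the service console than through the VI Client, something along these lines should do it. This is a sketch assuming classic ESX 3.x, where the advanced options live under /LVM, and the vmhba number is an example only.

# State 3 - EnableResignature=0, DisallowSnapshotLUN=0
esxcfg-advcfg -s 0 /LVM/EnableResignature
esxcfg-advcfg -s 0 /LVM/DisallowSnapshotLUN
# Check the current values
esxcfg-advcfg -g /LVM/EnableResignature
esxcfg-advcfg -g /LVM/DisallowSnapshotLUN
# Rescan for the presented replicas (adjust the vmhba to suit)
esxcfg-rescan vmhba1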

VMware and iSCSI – in case you cared

I’ve been hearing a lot of dribble about how good iSCSI is for VMware and how excited we should all be. While I won’t take the bait today, I will point my reader to this article from VMware on iSCSI design considerations. While most people are kidding themselves if they think they’ve done the necessary design before jumping on the network storage bandwagon, there’s a lot of very useful information in the document. My favourite quote is on page 6: “As a very economical, low-performance block I/O solution, iSCSI drivers can be run on an off-the-shelf Ethernet adapter”. VMware offers support for basic software iSCSI and a QLogic hardware iSCSI adapter but nothing for software iSCSI with TOE cards: “Support for iSCSI software with TOE cards is under consideration and may change in the future. It is also likely to become more of an option as jumbo frame support and the faster interconnect speeds of 10 gigabit Ethernet are adopted more widely in the industry”. I could make a joke about the sound your body makes after hitting the floor from holding your breath too long but it would be clumsy. Anyway, read the article if an architect or sales guy is whispering sweet iSCSIs in your ear. Do some testing. Be alert, not alarmed.

VMware and Storage – in case you cared

I just had the misfortune of being sent this quote by a colleague (thanks Crunch):

What should storage users know about VMware that they don’t know today?

Bock: “I think a lot of people overestimate the complexity of doing a VMware deployment, because they hear about more complex cases and forget that the vast majority of installations are very simple. The storage industry has decided that iSCSI storage is best for VMware in smaller environments, and VMware agrees. There are a broad range of options out there.”

In and of itself, it’s the kind of quote that makes me irate, and the kind of quote that sales guys latch on to when trying to justify iSCSI opportunities. The whole article is here, and is up to the usual standard of searchstorage articles, in the sense that the tagline looks exciting, and there are some words on the screen and advertising and you’ll feel like you’ve not learnt anything after you’ve read it. But I digress. And while I think that some iterations of iSCSI are smokin’, you need to be careful, with _every_ deployment, regardless of the perceived complexity or otherwise. Rant, rant, iSCSI, red rag to a bull.