EMC – Maximum Pool LUN Size

Mat has been trying to create a 42TB LUN to use temporarily for Centera backups. I don’t want to go into why we’re doing Centera backups, but let’s just say we need the space. He created a Storage Pool on one of the CX4-960s, using 28 x 2TB spindles and 6+1 private RAID Groups. However, when he tried to bind the LUN, he got an error.


Weird. So what if we set the size to 44000GB?


No, that doesn’t work either. Turns out, I should really read some of the stuff that I post here, like my article entitled “EMC CLARiiON VNX7500 Configuration guidelines – Part 1”, where I mention that the maximum size of a Pool LUN is 16TB. I was wrong in any case, as it looks more like it’s 14TB. Seems like we’ll be using RAID Groups and MetaLUNs to get over the line on this one.

EMC – CX4 Configuration – a few things I’d forgotten

I’ve been commissioning some new CX4-960s recently (it’s a long story), and came across a few things that I’d forgotten about for some reason. If you’re running older disks, and they get replaced by EMC, there’s a good chance the replacements will be a higher capacity. In our case I was creating a storage pool with 45 x 300GB FC disks and kept getting an error.

This error was driving me nuts for a while, until I realised that one of the 300GB disks had, at some point, been replaced with a 450GB drive. Hence the error.

The other thing I came across was the restriction that Private LUNs (Write Intent Log, Reserved LUN Pool, MetaLUN Components) have to reside on traditional RAID Groups and can’t live in storage pools. Not a big issue, but I hadn’t really planned to use RAID Groups on these arrays. If you search for emc254739 you’ll find a handy KB article on WIL performance considerations, including this nugget: “Virtual Provisioning LUNs are not supported for the WIL; RAID group-based LUNs or metaLUNs should be used”. Which clarifies why I was unable to allocate the 2 WIL LUNs I’d configured in the pool.

*Edit* I re-read the KB article and realised it doesn’t address the problem I saw. I had created thick LUNs on a storage pool, but these weren’t able to be allocated as WIL LUNs. Even though the article states “[The WIL LUNs] can either be RAID-group based LUNs, metaLUNs or Thick Pool LUNs”. So I don’t really know. Maybe it’s a VNX vs CX4 thing. Maybe not.

EMC – DIY Heatmaps – Updated Version

Mat has done an updated version of the heatmaps script for CLARiiON with LUN info and good things like that. You can download it here. Updated release notes can be found here. A sample of the output is here. Enjoy, and feel free to send requests for enhancements.

New Article – VNX5700 Configuration Guidelines

I’ve added a new article to the articles section of the blog. This one is basically a rehash of the recent posts I did on the VNX7500, but focussed on the VNX5700 instead. As always, your feedback is welcome.

EMC – Sometimes RAID 6 can be a PITA

This is really a quick post to discuss how RAID 6 can be a bit of a pain to work with when you’re trying to combine traditional CLARiiON / VNX DAEs and Storage Pool best practices. It’s no secret that EMC strongly recommend using RAID 6 when you’re using SATA-II / NL-SAS drives that are 1TB or greater. Which is a fine and reasonable thing to recommend. However, as you’re no doubt aware, the current implementation of FAST VP uses Storage Pools that require homogeneous RAID types. So you need multiple pools if you want to run both RAID 1/0 and RAID 6. If you want a pool that can leverage FAST to move slices between EFD, SAS, and NL-SAS, it all needs to be RAID 6. There are a couple of issues with this. Firstly, given the price of EFDs, a RAID 6 (6+2) of EFDs is going to feel like a lot of money down the drain. Secondly, if you stick with the default RAID 6 implementation for Storage Pools, you’ll be using 6+2 in the private RAID groups. And then you’ll find yourself putting private RAID groups across backend ports. This isn’t as big an issue as it was with the CX4, but it still smells a bit ugly.

What I have found, however, is that you can get the CLARiiON to create non-standard sized RAID 6 private RAID groups. If you create a pool with 10 spindles in RAID 6, it will create a private RAID group in an 8+2 configuration. This seems to be the magic number at the moment. If you add 12 disks to the pool it will create two 4+2 private RAID groups, and if you use 14 disks it will do a 6+2 and a 4+2 RAID group. Now, the cool thing about 10 spindles in a private RAID group is that you could, theoretically (I’m extrapolating from the VNX Best Practices document here), split the 8+2 across two DAEs in a 5+5. In this fashion, you can improve the rebuild times slightly in the event of a disk failure, and you can also draw some sensible designs that fit well in a traditional DAE4P. Of course, creating your pools in increments of 10 disks is going to be a pain, particularly for larger Storage Pools, and particularly as there is no re-striping of data done after a pool expansion. But I’m sure EMC will be focussing on this issue, as a lot of customers have had a problem with the initial approach. The downside to all this, of course, is that you’re going to suffer a capacity and, to a lesser extent, performance penalty by using RAID 6 across the board. In this instance you need to consider whether FAST VP is going to give you the edge over split RAID pools or traditional RAID groups.
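
For my own reference, here’s a rough sketch in Python of how that carving seems to work. This is purely my extrapolation from what I’ve seen the array do, not EMC’s actual algorithm, so treat the layouts as observations rather than gospel.

# Rough model of how a RAID 6 Storage Pool appears to carve disks into
# private RAID groups - my own extrapolation, not anything official.

def carve_raid6_pool(disks):
    """Return the observed private RAID group layout as (data, parity) tuples."""
    observed_layouts = {
        8:  [(6, 2)],           # the default 6+2
        10: [(8, 2)],           # the magic number - can be split 5+5 across two DAEs
        12: [(4, 2), (4, 2)],   # two 4+2 groups
        14: [(6, 2), (4, 2)],   # a 6+2 and a 4+2
    }
    if disks not in observed_layouts:
        raise ValueError("I haven't observed a layout for %d disks" % disks)
    return observed_layouts[disks]

for count in (8, 10, 12, 14):
    groups = carve_raid6_pool(count)
    layout = " + ".join("%d+%d" % g for g in groups)
    print("%2d disks -> %s" % (count, layout))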

I personally like the idea of Storage Pools, and I’m glad EMC have gotten on-board with them in their midrange stuff. I’m also reasonably optimistic that they’re working on addressing a lot of issues that have come up in the field. I just don’t know when that will be.

EMC – Other VNX Configuration Guidelines that may be useful

Firstly, apologies for the recent lack of posts. I’ve been on holidays and then started a new job, and none of it has been very related to this blog. Secondly, while it was tempting to call this post part 5 in the VNX7500 series – these configuration guidelines work well for just about all of the VNX range of arrays, not just the 7500. Thirdly, forgive me if I’ve said some of this stuff before. And finally, yes, I know I promised I’d upload some sample designs and talk about them, and I promise I will. Soon. Or soonish. So, in no particular order, here’s a list of things that you should keep in mind when designing solutions around the VNX.

Note that Pool-based LUNs, EFD-based LUNs, FAST VP LUNs, and FAST Cached LUNs do not benefit from file system defragmentation in the way traditional LUNs do. This might require a bit of education on the part of the system administrators – because you know they loves them some defragmentation action.

When configuring FAST Cache on an array, it is important to locate the primary and secondary drives of each RAID 1 pair on different Back End ports. The order in which the drives are added to FAST Cache is the order in which they are bound, so pay attention when you do this. Disabling FAST Caching of Private LUNs is recommended (these include the WIL, Clone Private LUNs and Reserved LUN Pool LUNs). However, you shouldn’t disable FAST Cache for MetaLUN components.

If you’re using EFDs for “Tier 0”, you’ll get good performance with up to 12 EFDs per Back End port. But if you’re on the hunt for the highest throughput, it is recommended that this number be kept to about 5.

It is recommended that you use RAID 6 with NL-SAS drives of 1TB or greater. This has some interesting implications for FAST VP heterogeneous Pool configurations and the use of 15 vs 25-disk DAEs. I’m hoping to put together a brief article on ways around that in the next week or so.

When architecting for optimal response time, limit throughput to about 70% of the maximum values quoted in the best practices guide.

It is considered prudent to plan for 2/3 of the available IOPS for normal use – this will give you some margin for burst and degraded mode operation.
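
To put some rough numbers around that, here’s a quick back-of-the-envelope calculator. The per-drive IOPS figures and write penalties are the usual rules of thumb rather than numbers lifted from the best practices guide, so treat them as my assumptions and adjust to suit.

# Back-of-the-envelope host IOPS for a RAID group, planning around 2/3 of the
# theoretical number to leave headroom for bursts and degraded mode.
# Per-drive IOPS and write penalties are common rules of thumb (my
# assumptions), not figures from the best practices document.

DRIVE_IOPS = {"EFD": 2500, "15K_SAS": 180, "10K_SAS": 140, "NL_SAS": 80}
WRITE_PENALTY = {"RAID10": 2, "RAID5": 4, "RAID6": 6}

def usable_iops(drive_type, drive_count, raid_type, read_pct=0.7, headroom=2.0 / 3):
    raw = DRIVE_IOPS[drive_type] * drive_count
    write_pct = 1 - read_pct
    # Effective host IOPS once the RAID write penalty is factored in
    host_iops = raw / (read_pct + write_pct * WRITE_PENALTY[raid_type])
    return host_iops * headroom

# e.g. a 4+4 RAID 1/0 of 15K SAS drives with a 70/30 read/write mix
print("%.0f host IOPS to plan around" % usable_iops("15K_SAS", 8, "RAID10"))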

When it comes to fancy RAID Group configurations – EMC recommend that a single DAE should be the default method for RAID Group provisioning. If you use vertical provisioning, make sure that: for RAID 5, at least 2 drives per port are in the same DAE; for RAID 6, 3 drives are in the same DAE; and for RAID 1/0, both drives of a mirrored pair are on separate Back End ports. It should be noted that parity RAID Groups of 10 drives or more can benefit from binding across 2 Back End ports – this reduces rebuild times when you pop a disk.

Finally, it should be noted that you can’t use Vault drives in a FAST VP pool. I still prefer to not use them for anything.

EMC CLARiiON VNX7500 Configuration guidelines – Part 4

In Part 4 of this 2-part series on VNX7500 configuration guidelines I’m going to simply paraphrase Loyal Reader Dave’s comments on my previous posts, because he raised a number of points that I’d forgotten, and the other two of you reading this may not have read the comments section.

While I was heavily focussed on a VNX7500 filled with 600GB 10K SAS disks, it is important to note that with the VNX you can *finally* mix any drive type in any DAE. So if you have workloads that don’t necessarily conform to the 15 or 25-disk DAE this is no longer a problem. That’s right, with VNX it does not matter – SAS, EFD, or NL-SAS can all be in the same DAE. Whether you really want this will again depend on the workload you’re designing for. And before the other vendors jump in and say “we’ve had that for years”, I know you have. Dave also mentioned that the ability to mix RAID types within a heterogeneous FAST VP pool would be handy, and I agree there. I’m pretty sure that’s on a roadmap, but I’ve no idea when it’s slated for.

It is also important to expand a FAST VP pool by the same number of disks each time. So if you’ve got a 15-disk pool to start with, you should be expanding it with another 15 disks. This can get unwieldy if you have a 60-disk pool and then only need 5 more disks. I’m hearing rumours that the re-striping feature is coming, but so’s Christmas.

While you can have up to 10 DAEs (15-disk DAEs) on a bus, or a maximum of 250 disks (with the new 25-disk DAEs), Dave still follows legacy design best practices from the CX4s: RAID 1/0 across two buses, and FAST Cache across all available buses. I agree with this, and I don’t think the VNX is quite there in terms of just throwing disks anywhere and letting FAST VP take care of it. Dave suggested that I should also mention that the cache numbers and page cache watermarks should be adjusted as per the best practices (these have changed for the VNX models). I’m hoping to do a post on this in the near future.

While the initial release of Block OE doesn’t allow connectivity to another domain, the latest Block OE 05.31.000.5.502 has just gone GA, and I think this has been fixed.

If I get some time this week I’ll put up a copy of the disk layout I proposed for this design, including the multiple variations that were used to try and make everything fit nicely. Thanks again to Dave for his insightful commentary, and for reminding me of the things I should have already covered.

EMC CLARiiON VNX7500 Configuration guidelines – Part 2

In this episode of EMC CLARiiON VNX7500 Configuration Guidelines, I thought it would be useful to discuss Storage Pools, RAID Groups and Thin things (specifically Thin LUNs). But first you should go away and read Vijay’s blog post on Storage Pool design considerations. While you’re there, go and check out the rest of his posts, because he’s a switched-on dude. So, now you’ve done some reading, here’s a bit more knowledge.

By default, RAID groups should be provisioned in a single DAE. You can theoretically provision across buses for increased performance, but oftentimes you’ll just end up with crap everywhere. Storage Pools obviously change this, but you still don’t want to bind the Private RAID Groups across DAEs. But if you did, for example, want to bind a RAID 1/0 RAID Group across two buses – for performance and resiliency – you could do it thusly:

naviseccli -h <sp-ip> createrg 77 0_1_0 1_1_0 0_1_1 1_1_1

Where the numbers refer to the standard format Bus_Enclosure_Disk.

The maximum number of Storage Pools you can configure is 60. It is recommended that a pool should contain a minimum of 4 private RAID groups. While it is tempting to just make the whole thing one big pool, you will find that segregating LUNs into different pools may still be useful for FAST cache performance, availability, etc. Remember kids, look at the I/O profile of the projected workload, not just the capacity requirements. Mixing drives with different performance characteristics in a homogeneous pool is also contraindicated. When you create a Storage Pool, the following Private RAID Group configurations are considered optimal (depending on the RAID type of the Pool):

  • RAID 5 – 4+1
  • RAID 1/0 – 4+4
  • RAID 6 – 6+2

Pay attention to this, because you should always ensure that a Pool’s private RAID groups align with traditional RAID Group best practices, while sticking to these numbers. So don’t design a 48 spindle RAID 5 Pool. That will be, er, non-optimal.
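
If you want a quick way of sanity-checking a proposed pool against those numbers, something like this will do the trick. It just uses the preferred private RAID group widths from the list above, plus the 4-group minimum mentioned earlier.

# Quick sanity check that a proposed Storage Pool disk count carves cleanly
# into the preferred private RAID group sizes listed above, with at least
# 4 private RAID groups in the pool.

PREFERRED_RG_WIDTH = {"RAID5": 5, "RAID10": 8, "RAID6": 8}  # 4+1, 4+4, 6+2

def pool_is_optimal(raid_type, disk_count):
    width = PREFERRED_RG_WIDTH[raid_type]
    return disk_count % width == 0 and disk_count // width >= 4

# 45 spindles of RAID 5 carves into nine 4+1 groups - fine.
print(pool_is_optimal("RAID5", 45))   # True
# 48 spindles doesn't divide evenly by 5 - the "er, non-optimal" case.
print(pool_is_optimal("RAID5", 48))   # False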


EMC recommend that if you’re going to blow a wad of cash on SSDs / EFDs, you should do it on FAST cache before making use of the EFD Tier.


With current revisions of FLARE 30 and 31, data is not re-striped when the pool is expanded. It’s also important to understand that preference is given to the new capacity rather than the original storage until all drives in the Pool are at roughly the same level of utilisation. So if you have data on a 30-spindle Pool, and then add another 15 spindles to the Pool, new writes go to the new spindles first to even up the capacity. It’s crap, but deal with it, and plan your Pool configurations before you deploy them. For RAID 1/0, avoid private RAID Groups of 2 drives.

A Storage Pool on the VNX7500 can be created with or expanded by 180 drives at a time, and you should keep the increments the same. If you are considering the use of drives greater than 1TB, use RAID 6. When FAST VP is working with Pools, remember that you’re limited to one type of RAID in a pool. So if you want to get fancy with different RAID types and tiers, you’ll need to consider using additional Pools to accommodate this. It is, however, possible to mix thick and thin LUNs in the same Pool. It’s also important to remember that the consumed capacity for Pool LUNs = (User Consumed Capacity * 1.02) + 3GB. This can have an impact as capacity requirements increase.
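
That per-LUN overhead is easy to forget when you’re sizing a pool, so here’s the formula as a few lines of Python.

# Consumed capacity for a Pool LUN, per the formula above:
# (user consumed capacity * 1.02) + 3GB of metadata overhead.

def pool_lun_consumed_gb(user_consumed_gb):
    return user_consumed_gb * 1.02 + 3

for size in (100, 500, 2048):
    print("%5d GB user capacity -> %7.1f GB consumed from the pool"
          % (size, pool_lun_consumed_gb(size)))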


A LUN’s tiering policy can be changed after the initial allocation of the LUN. FAST VP has the following data placement options: Lowest, Highest, Auto, and No Movement. This can present some problems if you want to create a 3-tier Pool. The only workaround I could come up with was to create the Pool with 2 tiers and place LUNs at Highest and Lowest. Then add the third tier, place the highest tier LUNs on the highest tier, and change the middle tier LUNs to No Movement. A better solution would be to create the Pool with the tiers you want, put all of your LUNs on Auto placement, and let FAST VP sort it out for you. But if you have a lot of LUNs, this can take time.


For thin NTFS LUNs – use Microsoft’s sdelete to zero free space. When using LUN Compression – Private LUNs (Meta Components, Snapshots, RLP) cannot be compressed. EMC recommends that compression only be used for archival data that is infrequently accessed. Finally, you can’t defragment RAID 6 RAID Groups – so pay attention when you’re putting LUNs in those RAID Groups.

EMC CLARiiON VNX7500 Configuration guidelines – Part 1

I’ve been doing some internal design work and referencing the “EMC Unified Storage Best Practices for Performance and Availability Common Platform and Block Storage 31.0 – Applied Best Practices” – rev 23/06/2011 – h8268_VNX_Block_best_practices.pdf fairly heavily. If you’re an EMC customer or partner you can get it from the Powerlink website. I thought it would be a useful thing to put some of the information here, more as a personal reference. The first part of this two-part series will focus on configuration maximums for the VNX7500 – the flagship midrange array from EMC. The sequel will look at Storage Pools, RAID Groups and thin things. There may or may not be a third part on some of the hardware configuration considerations. Note that the information here is based on the revision of the document referenced at the start. Some of these numbers will change with code updates.

Here are some useful numbers to know when considering a VNX7500 deployment:

  • Maximum RAID Groups – 1000;
  • Maximum drives per RAID Group – 16;
  • Minimum drives per RAID Group – R1/0 – 2, R5 – 3, R6 – 4;
  • Stripe Size R1/0 (4+4) and R5 (4+1) – 256KB, R6 (6+2) – 384KB (see the sketch after this list for where these numbers come from);
  • Maximum LUNs (this includes private LUNs) – 8192;
  • Maximum LUNs per Pool / all pools – 2048;
  • Maximum LUNs per RAID Group – 256;
  • Maximum MetaLUNs per System – 2048;
  • Maximum Pool LUN size (thick or thin) – 16TB;
  • Maximum traditional LUN size = largest, highest capacity RAID Group;
  • Maximum components per MetaLUN – 512;
  • EMC still recommends 1 Global Hot Spare per 30 drives.
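
In case you’re wondering where the stripe sizes in that list come from, they’re just the number of data drives multiplied by the element size. I’m assuming the standard 64KB element size here.

# Stripe size = data drives * element size. 64KB is the standard default
# element size (assumed here), which gives the figures in the list above.

ELEMENT_KB = 64

def stripe_size_kb(data_drives):
    return data_drives * ELEMENT_KB

print("R1/0 4+4:", stripe_size_kb(4), "KB")   # 256KB
print("R5   4+1:", stripe_size_kb(4), "KB")   # 256KB
print("R6   6+2:", stripe_size_kb(6), "KB")   # 384KB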

When you add drives to a Storage Pool or RAID Group they are zeroed out – this is a background process but can take some time. New drives shipped from EMC are pre-zeroed and won’t be “re-zeroed”. The drives you bought off eBay are not. To pre-zero drives prior to adding them to a Storage Pool or RAID Group, run the following commands with naviseccli:

naviseccli zerodisk -messner <disk-id> <disk-id> <disk-id> start
naviseccli zerodisk -messner <disk-id> <disk-id> <disk-id> status

Trespassing a Pool LUN will adversely affect its performance after the trespass. It is recommended that you avoid doing this, except for NDU or break-fix situations.

The LUN Migration tool provided by EMC has saved my bacon a number of times. If you need to know how long a LUN migration will take, you can use the following formula. LUN Migration duration = (Source LUN (GB) * (1/Migration Rate)) + ((Dest LUN Capacity – Source LUN Capacity) * (1/Initialization Rate)). The Migration rates are – Low = 1.4, Medium = 13, High = 44, ASAP = 85 (in MB/s). Up to 2 ASAP migrations can be performed at the same time per Storage Processor. Keep in mind that this will belt the Storage Processors though, so, you know, be careful.
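
If you can’t be bothered doing that arithmetic by hand, here’s the formula as a few lines of Python. Note that I’m assuming the initialization runs at the ASAP rate, and I’m converting GB to MB before dividing by the MB/s rates – adjust as required.

# LUN Migration duration, per the formula above. Rates are in MB/s and
# capacities in GB, so convert GB to MB before dividing.

MIGRATION_RATE_MBS = {"low": 1.4, "medium": 13, "high": 44, "asap": 85}
INIT_RATE_MBS = 85  # assumed to run at the ASAP rate - adjust to taste

def migration_hours(source_gb, dest_gb, rate="high"):
    migrate_secs = (source_gb * 1024) / MIGRATION_RATE_MBS[rate]
    init_secs = ((dest_gb - source_gb) * 1024) / INIT_RATE_MBS
    return (migrate_secs + init_secs) / 3600

# e.g. migrating a 500GB LUN to a 750GB destination at the "high" rate
print("%.1f hours" % migration_hours(500, 750, "high"))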

SP Bugcheck – dodged a bullet there

So I’ve been known to do some silly things at times. And I have a habit of treating midrange storage arrays like fridges – they just work, and they’re about the same size. Of course it’s not always as simple as that. So if you’re looking for a way to cause a bugcheck on a CLARiiON CX4-960 running FLARE 29.003, all you really need to do is kick off about 8 or so RAID Group defragmentations at once, and then go into one of the RAID groups and unbind a LUN while it is defragmenting. Then the SP will scream silently and reboot. Of course, it’s long been known that you shouldn’t unbind LUNs on RAID groups that are defragging. Common sense, yes? EMC has ETAs that address these on Powerlink – search for emc145619 and emc153522. But there’s only so much they can do to protect you from yourself. So if you don’t want to see the following errors in your event logs, don’t do what I did. We were lucky that we didn’t suffer a data loss or data unavailable event. We do, however, still have an alert showing up that says “LUN (LUN XXXX) is unavailable to its server because of an internal software error. Alert Code: 0x7439 Resolution: Contact your service provider.” I guess we should sort that out at some stage.

Date: 2010-04-27
Time: 14:06:19
Event Code: 0x76008106
Description: The Storage Processor rebooted unexpectedly @ 03:43:57 on 04/27/2010: BugCheck 0, {0000000000000000, 0000000000800022, fffffadf2de7d078, fffffadffb8c3040}, Failing Instruction: 0xfffffadf2e052604 in flaredrv.sys loaded @ 0xfffffadf2de27000 76008106
Subsystem: CKM000XXXXXX60
Device: N/A
SP: N/A
Host: A-IMAGE
Source: K10_DGSSP
Category: NT Application Log
Log: NT Application Log
Sense Key: N/A
Ext Code1: N/A
Ext Code2: N/A
Type: Error

Date: 2010-04-27
Time: 13:57:09
Event Code: 0x907
Description: Microcode Panic
Subsystem: CKM000XXXXXX60
Device: SP A
SP: SPA
Host: A-IMAGE
Source: N/A
Category: N/A
Log: Storage Array
Sense Key: 0x0
Ext Code1: 0x800022
Ext Code2: 0x2de7d078
Type: Error

Date: 2010-04-27
Time: 13:55:30
Event Code: 0x2183
Description: The computer has rebooted from a bugcheck. The bugcheck was: 0x00000000 (0x0000000000000000, 0x0000000000800022, 0xfffffadf2de7d078, 0xfffffadffb8c3040). A dump was saved in: C:\dumps\crash.dmp.
Subsystem: CKM000XXXXXX60
Device: N/A
SP: N/A
Host: A-IMAGE
Source: Save Dump
Category: NT System Log
Log: NT System Log
Sense Key: N/A
Ext Code1: N/A
Ext Code2: N/A
Type: Error