EMC – Broken Vault drive munts FAST Cache

Mat sent me an e-mail this morning, asking “why would FAST Cache be degraded after losing B0 E0 D2 in one of the CX4-960s?”. For those of you playing at home, 0_0_2 is one of the Vault disks in the CX4 and VNX. Here’s a picture of the error:

Check out the 0x7576 that pops up shortly after the array says there’s a faulted disk. Here’s a closeup of the error:

Weird, huh?  So here’s the output of the naviseccli command that will give you the same information, but with a text-only feel.

"c:/Program Files/EMC/Navisphere CLI/NaviSECCli.exe"  -user Ebert -scope 0 -password xx -h 255.255.255.255  cache -fast -info -disks -status
Disks:
Bus 0 Enclosure 7 Disk 0
Bus 2 Enclosure 7 Disk 0
Bus 0 Enclosure 7 Disk 1
Bus 2 Enclosure 7 Disk 1
Bus 1 Enclosure 7 Disk 1
Bus 1 Enclosure 7 Disk 0
Bus 3 Enclosure 7 Disk 1
Bus 3 Enclosure 7 Disk 0
Mode:  Read/Write
Raid Type:  r_1
Size (GB):  366
State:  Enabled_Degraded
Current Operation:  N/A
Current Operation Status:  N/A
Current Operation Percent Completed:  N/A

So what’s with the degraded cache? FAST Cache stores a small database on the first 3 drives (0_0_0, 0_0_1, 0_0_2). If any of these disks fails, FAST Cache flushes to disk and goes into a degraded state. It shouldn’t, because the database is triple-mirrored, so losing one copy should leave two good ones. And what does degraded mean exactly? It means your FAST Cache is not processing writes at the moment. Which is considered “bad darts”.
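
If you’d rather have a script keep an eye on this than wait for an e-mail, here’s a rough Python sketch that parses the text output shown above and flags a degraded state. The function names (parse_fast_cache_info, is_degraded) are my own, the sample text is copied from the output above, and I’m assuming any State value containing “Degraded” is worth an alert, since Enabled_Degraded is the only degraded value I’ve actually seen. It doesn’t run naviseccli itself; you’d feed it whatever your command captured.

```python
import re

# Sample text copied from the naviseccli output shown in this post.
SAMPLE = """\
Disks:
Bus 0 Enclosure 7 Disk 0
Bus 2 Enclosure 7 Disk 0
Bus 0 Enclosure 7 Disk 1
Bus 2 Enclosure 7 Disk 1
Bus 1 Enclosure 7 Disk 1
Bus 1 Enclosure 7 Disk 0
Bus 3 Enclosure 7 Disk 1
Bus 3 Enclosure 7 Disk 0
Mode:  Read/Write
Raid Type:  r_1
Size (GB):  366
State:  Enabled_Degraded
Current Operation:  N/A
Current Operation Status:  N/A
Current Operation Percent Completed:  N/A
"""

DISK_LINE = re.compile(r"Bus (\d+) Enclosure (\d+) Disk (\d+)")


def parse_fast_cache_info(text):
    """Split the output into a list of B_E_D disk IDs and a dict of fields."""
    disks = []
    fields = {}
    for raw in text.splitlines():
        line = raw.strip()
        match = DISK_LINE.fullmatch(line)
        if match:
            # Store disks in the bus_enclosure_disk shorthand, e.g. 0_7_0.
            disks.append("_".join(match.groups()))
        elif ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return disks, fields


def is_degraded(fields):
    """Assumption: any State containing 'Degraded' should raise an alert."""
    return "Degraded" in fields.get("State", "")


if __name__ == "__main__":
    disks, fields = parse_fast_cache_info(SAMPLE)
    print("FAST Cache disks:", ", ".join(disks))
    print("State:", fields.get("State"))
    if is_degraded(fields):
        print("FAST Cache is degraded, go and check your Vault drives")
```

Point it at the real output instead of SAMPLE and hook the last check into whatever alerting you already use.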

This is a bug. Have a look on Powerlink for emc267579. Hopefully this will be fixed in R32 for the VNX. I couldn’t see details about the CX4 though. I strongly recommend that if you’re a CX4 user and you experience this issue, you raise a service request with your local EMC support mechanisms as soon as possible. The only way they get to know the severity of a problem is if people in the field feed back issues.

4 Comments

  1. Are you sure about the emc267579? I searched all over Powerlink and wasn’t able to find anything.

    I have partner access on Powerlink.

    Regards,
    Surya Kiran C

  2. Hi Surya Kiran, it’s definitely there. You could also try to search for “Vault drive failure results in Fast Cache LUN becoming degraded” – this is the title of the KB article.

    Cheers
    Dan

  3. So in your opinion does this change how you go about designing FAST Cache?

    Can the location of this FAST Cache database be changed?

    Or is this simply a potential flaw in VNX OE and something you need to be aware of with FAST Cache?

  4. I would follow emc251589 – FAST Cache configuration best practice. As far as I know you can’t change the location of the cache db. The KB article suggested that it would be fixed in R32.
