EMC – Sometimes it’s best not to pay too much attention to USM

Just a quick one to start the year off on the right note. I was installing updated Utility Partition software on our lab CX4s today and noticed that USM was a bit confused as to when it had started installing a bit of the code. Notice the Time started and Time elapsed section. Well, I thought it was amusing.

[Screenshot: CX4_Utility_Partition]

EMC CLARiiON VNX7500 Configuration guidelines – Part 3

One thing I didn’t really touch on in the first two parts of this series is the topic of RAID Groups and binding between disks on the DPE / DAE-OS and other DAEs. It’s a minor point, but something people tend to forget when looking at disk layouts. Ever since the days of Data General, the CLARiiON has used Vault drives in the first shelf. For reasons that are probably already evident, these drives, and the storage processors, are normally protected by a Standby Power Supply (SPS) or two. The SPS provides enough battery power in a power failure scenario such that cache can be copied to the Vault disks and data won’t be lost. This is a good thing.

The thing to keep in mind with this, however, is that the other DAEs in the array aren’t protected by this SPS. Instead, you plug them into UPS-protected power in your data centre. So when those DAEs lose power, they go down. This can cause “major dramas” with Background Verify operations when the array is rebooted. This is a sub-optimal situation to be in. The point of all this is that, as EMC have said for some time, you should not bind RAID groups across the first DAE and other DAEs: keep each RAID group either entirely within that first DAE, or entirely outside it.

Now, if you really must do it, there are some additional recommendations:

  • Don’t split RAID 1 groups between the DPE and another DAE;
  • For RAID 5, ensure that at least 2 drives are outside the DPE;
  • For RAID 6, ensure that at least 3 drives are outside the DPE;
  • For RAID 1/0 – don’t do it, you’ll go blind.
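
The rules above are simple enough to write down as a check. This is only a minimal sketch, not an EMC tool: the function name check_raid_group, the RAID type labels and the idea of passing in drive counts are all my own assumptions for illustration.

```python
def check_raid_group(raid_type, drives_in_dpe, drives_outside_dpe):
    """Return a list of warnings for one RAID group layout.

    raid_type is one of "1", "5", "6" or "1/0" (hypothetical labels).
    """
    warnings = []
    # A group wholly inside or wholly outside the DPE is not split at all.
    if drives_in_dpe == 0 or drives_outside_dpe == 0:
        return warnings
    if raid_type == "1":
        warnings.append("RAID 1 group is split between the DPE and another DAE")
    elif raid_type == "5" and drives_outside_dpe < 2:
        warnings.append("RAID 5 needs at least 2 drives outside the DPE")
    elif raid_type == "6" and drives_outside_dpe < 3:
        warnings.append("RAID 6 needs at least 3 drives outside the DPE")
    elif raid_type == "1/0":
        warnings.append("RAID 1/0 group is split between the DPE and another DAE")
    return warnings
```

An empty list means the layout doesn’t break any of the rules listed above. It says nothing about whether splitting the group was a good idea in the first place.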

It’s a minor design consideration, but something I’ve witnessed in the field when people have either a) tried to be tricky on smaller systems, or b) have been undersold on their requirements and have needed to be creative. As an aside, it is also recommended that you don’t include drives from the DPE / DAE-OS in Storage Pools. This may or may not have an impact on your Pool design.

EMC – CLARiiON Support Staff are Supportive, Funny.

I made a boo boo about a year ago. I was defragmenting a RAID Group and unbound a LUN after the defrag had commenced. The LUN wasn’t in use, but the array has had issues with my behaviour ever since.

Clearly, I’m a bad person. My colleague was sick and tired of looking at the Error, and was hoping Support could make it go away. Here’s how the WebEx chat went (I’ve evidently changed the names):

from EMC Support to All Participants:

can you tell me, on which of these clariion box you received the alert?

from Long Suffering Colleague to All Participants:

I’m not recieving an alert … the array is showing a fault

from Long Suffering Colleague to All Participants:

that I would like to get rid of

from EMC Support to All Participants:

on box CKM12341234567

from Long Suffering Colleague to All Participants:

yup

from EMC Support to All Participants:

give me 2 mins

from EMC Support to All Participants:

iam sorry Long Suffering but we cant remove it

from EMC Support to All Participants:

it is Functioning as designed

from Long Suffering Colleague to All Participants:

ever? what … this alert have been here for months

from EMC Support to All Participants:

this is an event on the box

from EMC Support to All Participants:

every time a poll happens , the event will be logged

from Long Suffering Colleague to All Participants:

so how do we get rid of the error

from EMC Support to All Participants:

in future flare releases we might be able to remove it but for now..we just need to ignore this

from Long Suffering Colleague to All Participants:

right … ok … thanks for your help

from EMC Support to All Participants:

thanks for your time

from EMC Support to All Participants:

so can we close the ticket ?

from Long Suffering Colleague to All Participants:

sure

from EMC Support to All Participants:

thanks…you have a nice day ahead…!

from Long Suffering Colleague to All Participants:

thanks

from EMC Support to All Participants:

goodbye

WIN-F-EEXCEPTION in CLARiiON CX4 SP logs

This error has been driving us nuts. Every few hours, since we upgraded to FLARE 30, we get:

Time Stamp 12/16/10 21:42:28 (GMT) Event Number 7600

Severity Error Host SAN01-B

Storage Array CK20XXXXXXXX79 SP N/A Device N/A

Description Dynamic strings:[16-Dec-2010 21:42:28+759ms GMT Standard Time] t@1732

WIN-F-EEXCEPTION-Windows exception 0XC0000005 at 0X12BE0DB9

So we logged a call with support, after reading emc234899 on Powerlink, which states that the alert is benign. Support have said it will be fixed in a future release of FLARE 30. Apparently not in patch .509 though, because the release notes make no mention of the problem being fixed. It’s no big deal, as the alert is benign, but it’s annoying for the people carrying the pager because it’s happening about 10 times a day.
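
If your alerting sits behind something scriptable, one way to spare the pager is to drop this one known-benign event and let everything else through. The following is a minimal sketch under assumptions: the function name should_page is made up, and I’m assuming your monitoring hands you the event number and description as strings. It matches only the exact exception text shown above, so other errors still page.

```python
import re

# Matches the specific exception text shown in the alert above.
BENIGN = re.compile(r"WIN-F-EEXCEPTION-Windows exception 0XC0000005", re.IGNORECASE)


def should_page(event_number, description):
    """Return False only for the known-benign 7600 exception, True otherwise.

    event_number is compared as a string, exactly as it appears in the alert.
    """
    if event_number == "7600" and BENIGN.search(description):
        return False
    return True
```

Keeping the match this narrow is deliberate: a different exception code under the same event number should still wake someone up.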

EMC Unisphere – Basics – Part 1 – Setup Domain NTP

This is the first in a series of “death by screenshot” posts that demonstrate activities in Unisphere that I’ve found myself having to do, and thought that someone else may find useful as well. Today’s lesson is on setting NTP servers for the local domain.

From the dashboard, click on Domains.

Under the Local Domain menu on the left-hand side of the window, click on Configure NTP to, er, configure NTP.

A friendly message pops up regarding NTP.

You then have to actually do something. The first thing to do is tick the radio button to enable NTP, and decide on a synchronisation interval.

This is what a ticked radio button looks like:

You then need to add some IP addresses. In the following illustration, I’ve added 2 servers.

Click on OK and you’ll have the opportunity to confirm the configuration changes.

And I couldn’t finish without including my favourite message …

Seems like a trivial thing to post on, but I was scratching my head for a few minutes trying to work out where to configure NTP in Unisphere, and I thought you might too.

SP Bugcheck – dodged a bullet there

So I’ve been known to do some silly things at times. And I have a habit of treating midrange storage arrays like fridges – they just work, and they’re about the same size. Of course it’s not always as simple as that. So if you’re looking for a way to cause a bugcheck on a CLARiiON CX4-960 running FLARE 29.003, all you really need to do is kick off about 8 or so RAID Group defragmentations at once, and then go into one of the RAID groups and unbind a LUN while it is defragmenting. Then the SP will scream silently and reboot. Of course, it’s long been known that you shouldn’t unbind LUNs on RAID groups that are defragging. Common sense, yes? EMC has ETAs that address these on Powerlink – search for emc145619 and emc153522. But there’s only so much they can do to protect you from yourself. So if you don’t want to see the following errors in your event logs, don’t do what I did. We were lucky that we didn’t suffer a data loss or data unavailable event. We do, however, still have an alert showing up that says “LUN (LUN XXXX) is unavailable to its server because of an internal software error. Alert Code: 0x7439 Resolution: Contact your service provider.” I guess we should sort that out at some stage.

Date:2010-04-27
Time:14:06:19

Event Code:0x76008106

Description:The Storage Processor rebooted unexpectedly @ 03:43:57 on 04/27/2010: BugCheck 0, {0000000000000000, 0000000000800022, fffffadf2de7d078, fffffadffb8c3040}, Failing Instruction: 0xfffffadf2e052604 in flaredrv.sys loaded @ 0xfffffadf2de27000 76008106

Subsystem:CKM000XXXXXX60

Device:N/A

SP:N/A

Host:A-IMAGE

Source:K10_DGSSP

Category:NT Application Log

Log:NT Application Log

Sense Key:N/A

Ext Code1:N/A

Ext Code2:N/A

Type:Error

 

Date:2010-04-27

Time:13:57:09

Event Code:0x907

Description:Microcode Panic

Subsystem:CKM000XXXXXX60

Device:SP A

SP:SPA

Host:A-IMAGE

Source:N/A

Category:N/A

Log:Storage Array

Sense Key:0x0

Ext Code1:0x800022

Ext Code2:0x2de7d078

Type:Error

 

Date:2010-04-27

Time:13:55:30

Event Code:0x2183

Description:The computer has rebooted from a bugcheck. The bugcheck was: 0x00000000 (0x0000000000000000, 0x0000000000800022, 0xfffffadf2de7d078, 0xfffffadffb8c3040). A dump was saved in: C:\dumps\crash.dmp.

Subsystem:CKM000XXXXXX60

Device:N/A

SP:N/A

Host:A-IMAGE

Source:Save Dump

Category:NT System Log

Log:NT System Log

Sense Key:N/A

Ext Code1:N/A

Ext Code2:N/A

Type:Error
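
The three records above are listed newest first, which makes the sequence a bit awkward to follow. Here’s a minimal sketch for turning a pasted dump like this into dictionaries so it can be sorted. The function name parse_events is my own, and the code assumes every record starts with a Date: line, which is true of the records shown here but may not hold for other exports.

```python
def parse_events(text):
    """Split 'Key:Value' lines into one dict per event.

    A new event is assumed to start at each 'Date:' line.
    """
    events = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if ":" not in line:
            continue  # skip blank or spacer lines
        # Split once only: values such as times and paths contain colons.
        key, value = line.split(":", 1)
        if key == "Date":
            current = {}
            events.append(current)
        if current is not None:
            current[key] = value
    return events
```

Because the dates are in year-month-day form, sorted(events, key=lambda e: (e["Date"], e["Time"])) puts the records in chronological order without any date parsing.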

 

 

 

Locked vmdk files

Somehow, a colleague of mine put an ESX host in a cluster into maintenance mode while VMs were still running. Or maybe it just happened to crash when she was about to do this. I don’t know how, and I’m not sure I still believe it, but I saw some really weird stuff last week. The end result was that VMs powered off ungracefully, and the host became unresponsive, and things were generally bad. We started adding VMs back to other hosts, but one VM had locked files. Check out this entry at Gabe’s Virtual World on how to address this, but basically you want to ps, grep and kill -9 some stuff.

ps -elf | grep vmname

kill -9 PID

And you’ll find that it’s probably the vmdk files that are locked, not necessarily the vmx file.
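
If you end up doing this for more than one VM, picking the PID out of the ps output by eye gets old. Below is a minimal sketch that pulls candidate PIDs out of captured ps -elf output. The function name find_pids is hypothetical, and it assumes the usual Linux ps -elf column layout where the PID is the fourth column. It only reports PIDs; the kill -9 stays a manual decision.

```python
def find_pids(ps_output, vm_name):
    """Return PIDs from `ps -elf` style output whose line mentions vm_name."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Assumes the Linux `ps -elf` layout, where PID is the fourth column.
        if len(fields) < 4 or not fields[3].isdigit():
            continue  # header or malformed line
        # Ignore the grep process itself if the output was piped through grep.
        if vm_name in line and " grep " not in line:
            pids.append(int(fields[3]))
    return pids
```

Check each reported PID against the full ps line before killing anything, since a substring match on the VM name can catch more than you intended.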

CLARiiON CX700 FLARE Recovery – Part 3

In the previous series of posts, I covered the despair and ultimate joy of getting to a point where I could recover a munted CLARiiON CX700. In this post, I’ll cover the process to recover the array to a working state, and the steps required to get the array functioning at a useful level.

Having successfully performed a Utility Partition Boot, it’s necessary to get the LAN service ports on the array configured in order to be able to ftp the recovery image to the array. Obviously, you’ll need the array and your service laptop plugged into a network-type thing that will enable frank communication between the arrays and you.

===============================================================================
CLARiiON Utility Toolkit Main Menu
===============================================================================
1) About the Utility Toolkit
2) About this Array
3) Reset Storage Processor
4) Image Repository Sub-Menu
5) Plugin Sub-Menu
6) NVRAM Sub-Menu
7) Enable LAN Service Port
8) Enable Engineering Mode
9) Install Images

Enter Option: 7
===============================================================================
Please enter the network settings you wish to use for this SP
===============================================================================
IP Address:  192.168.0.2
Subnet Mask:  255.255.255.0
Default Gateway:  192.168.0.255
Host Name:  spa
Domain Name:  

===============================================================================
Confirm Network Settings
===============================================================================
IP Address:      192.168.0.2  
Subnet Mask:     255.255.255.0
Default Gateway: 192.168.0.255
Host Name:       spa          
Domain Name:                  

Enable LAN Service Port with these settings? y/n [y] 
The LAN Service Port has been enabled

Automatically enable the LAN Port with these settings in the future? y/n [y] n

Press the Enter key to continue… 

Once you’ve enabled the LAN port on the SP you’re connected to, you need to ftp the image to the SP’s repository. The username to use is Clariion, and the password is clariion!. Once you’ve logged in, run a put command to put the file up there. It doesn’t really matter what you call it, but it should be a file of type mif. Here’s a pointless text capture of the ftp login process:

C:\>ftp 192.168.0.2
Connected to 192.168.0.2.
220-FileZilla Server version 0.8.3 beta test release 1
220-written by Tim Kosse (Tim.Kosse@gmx.de)
220 Please visit http://sourceforge.net/projects/filezilla/
User (192.168.0.2:(none)): Clariion
331 Password required for clariion
Password:
230 Logged on
ftp> ls
200 Port command successful
150 Opening data channel for directory list.
FLARE.mif
226 Transfer OK
ftp: 11 bytes received in 0.00Seconds 11000.00Kbytes/sec.
ftp>
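
If you’d rather script the upload than type it, the same thing can be sketched with Python’s standard ftplib. This is a minimal sketch, not something I ran against the array: the function names are my own, and the only details taken from this post are the file name FLARE.mif and the login shown above. The transfer is done in binary mode because the image is not a text file.

```python
from ftplib import FTP


def upload_image(ftp, local_path, remote_name="FLARE.mif"):
    """Upload a recovery image in binary mode over a logged-in FTP session."""
    with open(local_path, "rb") as image:
        ftp.storbinary("STOR " + remote_name, image)
    # List the directory afterwards, like the `ls` in the capture above.
    return ftp.nlst()


def upload_to_sp(host, user, password, local_path):
    """Connect to the SP's FTP service, log in and upload the image."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        return upload_image(ftp, local_path)
```

With the settings used in this post, the call would be upload_to_sp("192.168.0.2", "Clariion", "clariion!", "FLARE.mif"), and the returned listing should include the uploaded file name.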

Once you’ve successfully uploaded the recovery image, you’ll be good to go. It’s also important to note that the FLARE recovery image should be for the release that you intend to run. I didn’t consider uploading a Release 19 image, as I knew that these arrays had run Release 26 previously. In any case, jumping back into the Image menu on the terminal, it’s now time to copy the image from the RAM disk and then load it.

===============================================================================
CLARiiON Utility Toolkit Main Menu
===============================================================================
1) About the Utility Toolkit
2) About this Array
3) Reset Storage Processor
4) Image Repository Sub-Menu
5) Plugin Sub-Menu
6) NVRAM Sub-Menu
7) View LAN Service Port Settings
8) Enable Engineering Mode
9) Install Images

Enter Option: 4

===============================================================================
CLARiiON Utility Toolkit Image Repository Menu
===============================================================================
1) Back to the Main Menu
2) List Image Repository Contents
3) Delete Files from the Image Repository
4) Copy Files from the RAM Disk to the Image Repository
5) Copy Files from the Image Repository to the RAM Disk

Enter Option: 4

===============================================================================
Select files to copy to the Image Repository
===============================================================================
1) FLARE.mif

Enter comma separated list of options: 1

Copying FLARE.mif to the Image Repository… Success

Press the Enter key to continue…
 
===============================================================================
CLARiiON Utility Toolkit Image Repository Menu
===============================================================================
1) Back to the Main Menu
2) List Image Repository Contents
3) Delete Files from the Image Repository
4) Copy Files from the RAM Disk to the Image Repository
5) Copy Files from the Image Repository to the RAM Disk

Enter Option: 1

===============================================================================
CLARiiON Utility Toolkit Main Menu
===============================================================================
1) About the Utility Toolkit
2) About this Array
3) Reset Storage Processor
4) Image Repository Sub-Menu
5) Plugin Sub-Menu
6) NVRAM Sub-Menu
7) View LAN Service Port Settings
8) Enable Engineering Mode
9) Install Images

Enter Option: 9

===============================================================================
Select Images to Install
===============================================================================
1) FLARE.mif

Enter comma separated list of options: 1

===============================================================================
Confirm Image Installation
===============================================================================

  FLARE.mif

You need to install this only on the SP that you have visibility of, as troubleshooting the installation to both SPs is tricky.

Are you sure you want to install these images? y/n [n] y

===============================================================================
Select Storage Processors to install images for
===============================================================================
1) This SP (SP A)
2) Peer SP (SP B)
3) Both SP’s

Enter Option: 1
 
Installing Data Directory Boot Service 02.12
0%..10%..20%..30%..40%..50%..60%..70%..80%..90%..100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
The COPY operation has completed successfully.

Installing BIOS 03.70
0%..10%..20%..30%..40%..50%..60%..70%..80%..90%..100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
The COPY operation has completed successfully.

Installing Extended POST 02.38
0%..10%..20%..30%..40%..50%..60%..70%..80%..90%..100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
The COPY operation has completed successfully.

Installing FLARE Image 02.26.700.5.005
0%..10%..20%..30%..40%..50%..60%..70%..80%..90%..100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
The COPY operation has completed successfully.

Press the Enter key to continue… 

Once the copy has completed successfully, the system needs to be reset, and you’ll see the SP reboot up to three times before it’s useable.

===============================================================================
CLARiiON Utility Toolkit Main Menu
===============================================================================
1) About the Utility Toolkit
2) About this Array
3) Reset Storage Processor
4) Image Repository Sub-Menu
5) Plugin Sub-Menu
6) NVRAM Sub-Menu
7) View LAN Service Port Settings
8) Enable Engineering Mode
9) Install Images

Enter Option: 3

Requesting System Reset         

Once this is complete, you can either load the recovery image to the other SP via Navisphere in Engineering Mode, or you can use the same method as described above. Note that, once the image is copied to the repository, it is not necessary to re-upload it, as both SPs have access to the files.

From here, follow the normal process for initialising an array. In my case I initialised security, set up IP addresses for the SPs, logged in, committed the FLARE (R26.005), enabled Access Logix, configured cache settings, upgraded FLARE to the latest (R26.028), reloaded the latest Utility Partition and Recovery Images, and went about loading the appropriate enablers for the array. And now we have a working lab :)

CLARiiON CX700 FLARE Recovery – Part 2

In Part 1 of this series I talked about what looked like a borked CX700 and some of the options available to me to get it up and running again. I trawled Powerlink looking for solutions and came across a number of articles that talked about ordering Vault packs and getting EMC CEs involved. As the arrays were no longer under maintenance, I didn’t have high hopes that this would be a process we could undertake.

My trawling for solutions, however, did yield a rather interesting nugget of information. For those of you with access to Powerlink, there’s an article entitled “CX700 array unmanaged and fails to display its serial number after changing WWN seed array“. This article also goes by the ID emc119598 and discusses the process to rectify the array’s WWN seed after a conversion from a CX500 to a CX700. The great thing about this article was not so much the solution provided as the alternative method described to access the CLARiiON’s Diagnostics Menu. To wit, using the password “SHIP_it” yields a menu subsystem that is dramatically different from the one provided with “DB_key“. The results are below; the full transcript can be downloaded here:

Copyright (c) EMC Corporation , 2007
Disk Array Subsystem Controller
Model: CX700: SAN GBFCC4
DiagName: Extended POST
DiagRev: Rev. 02.39
Build Date: Fri Jul 13 16:36:03 2007
StartTime: 01/21/2010 22:04:18
SaSerialNo: LKE00051202843

AabcdefgBC

EndTime: 01/21/2010 22:04:19
…. Storage System Failure – Contact your Service Representative …

*******
******  Aborting!!!!  ******

 
Hit ESC to begin running diagnostic menu…

Entering the alternative password, we see the following output:

Diagnostic Menu
1)  Reset Controller          21) BE1 FCC Sub-Menu
2)  Enter Debugger            22) CMI0 FCC Sub-Menu
3)  Display Warnings/Errors   23) CMI1 FCC Sub-Menu
4)  Boot OS                   24) AUX0 FCC Sub-Menu
5)  POST Sub-Menu             25) AUX1 FCC Sub-Menu
6)  Display/Change Privilege  26) FE0 FCC Sub-Menu
7)  Boot UProc Sub-Menu       27) FE1 FCC Sub-Menu
8)  Ap UProc Sub-Menu         28) FE2 FCC Sub-Menu
9)  Real Time Clock Sub-Menu  29) FE3 FCC Sub-Menu
10) Pers. Module Sub-Menu     30) POST ROM Sub-Menu
11) RAM Sub-Menu              31) BIOS ROM Sub-Menu
12) NOVRAM Sub-Menu           32) System Test Sub-Menu
13) Console UART Sub-Menu     33) Image Sub-Menu
14) SPS UART Sub-Menu         34) Disk Sub-Menu
15) LCC 0 UART Sub-Menu       35) Resume PROM Sub-Menu
16) LCC 1 UART Sub-Menu       36) Voltage Margin Sub-Menu
17) LCC 2 UART Sub-Menu       37) Information Display
18) LCC 3 UART Sub-Menu       38) ICA Sub-Menu
19) LAN Service Port Sub-Menu 39) DDBS Service Sub-Menu
20) BE0 FCC Sub-Menu          40) FCC Boot Sub-Menu

Enter Option : 33

Option 33 is what we’re interested in to start with. From here you can perform a Utility Partition Boot.

Image Sub-Menu
1)  Init Loop                 6)  Exit Loop
2)  Serial Download           7)  Relocate/Run Image
3)  Load from disk            8)  Display Sector Protection
4)  Save to disk              9)  Utility Partition Boot
5)  Update Firmware

                    0) Exit

Enter Option :  9
Relocating Data Directory Boot Service (DDBS: Rev. 02.12)…
DDBS: K10_REBOOT_DATA: Count = 0
DDBS: K10_REBOOT_DATA: State = 0
DDBS: K10_REBOOT_DATA: ForceDegradedMode = 0

DDBS: Read default MDDE off disk 1
DDBS: MDDE (Rev 2) on disk 1
DDBS: Read default DDE (0x40000F) off disk 1
DDBS: Read default MDDE off disk 3
DDBS: MDDE (Rev 2) on disk 3
DDBS: Read default DDE (0x400010) off disk 3

DDBS: MDB read from both disks.
DDBS: DD invalid on both disks, continuing…
DDBS: Disk WWN seeds match each other but not chassis WWN seed.
DDBS: First disk is valid for boot.
DDBS: Second disk is valid for boot.

[snip]

int13 – GET DRIVE PARAMETERS (Extended) (1437)
ICA::UtilityFrontEnd
(c) EMC Corporation 2001-2004 All Rights Reserved
DiagName: ICA::UtilityFrontEnd
DiagRev: 02.16.700.5.001
StartTime: 01/21/10 22:08:37

OS Type……………………….WinXP
SMBUS…………………………Running
SPID………………………….Running
ASIDC…………………………Running
ASIRAMDisk…………………….Running
ICA…………………………..Running
FileZilla Server……………….Running
Connecting to ICA………………Success
SP Type……………………….CX700
SP ID…………………………A
SP Signature…………………..0x08291953
Checking Image Repository……….
    ICA::IRFS no valid Volume was found on this system
    ICA::IRFS Creating new Volume
    ICA::IRFS Finished creating new volume
    ICA::IRFS Checking Volume for consistency
Sizing Image Repository…………1024 MB
Sizing RAM Disk………………..2039 MB
Discovering Management LAN Port….ManagementPort0
Checking LAN Port State…………Not Configured
Checking LAN Port Config………..Not Found
Loading Plugins………………..Done

EndTime: 01/21/10 22:09:03

Now that’s what I wanted to see :) – from here we just need to reload the FLARE image with ftp. I’ll cover this in Part 3.

CLARiiON CX700 FLARE Recovery – Part 1

I’ve broken this post into three parts not because I’m trying to be a tease, but rather so it’s a little easier to digest and you can head straight for the money shot if you so desire. Part 1 covers the initial failure and subsequent troubleshooting. Part 2 covers the eventual workaround to the problem. Part 3 covers the work that needed to be done once the problem was resolved.

It’s strange to think of EMC’s CX700 CLARiiON array as a “legacy” array. Yet it’s now two generations behind EMC’s flagship mid-range array – the CX4-960. Our project was given access to two CX700s to use as test arrays for a multi-site data centre project we’re working on. That’s cool, as the CX700 is still a reasonably well-specced array, with multiple back-end loops and a fair bit of useable cache (at least compared to the CX4-120). So after the data centre guys MacGyvered the kit into racks that were too big for the rails, we cabled the lot up and thought it would be a fairly trivial process to get everything up and running.

As usual, I was wrong. The department that had provided these hand-me-down arrays had bought a service from EMC whereby the data was securely erased. For those of you playing at home, this is known as the “Certified Data Erasure Service“, and you can grab the datasheet from here. So basically, these arrays were saved from the scrapheap, but not before they were rendered basically unbootable.

When we powered them up, we got the following output via the terminal:

Disk 2 Read Error 0x00000187
Number Sectors: 1
LBA: 0x0002284B
Buffer: 0x1000A114

DDBS: Can’t read MDB from first disk.
DDBS: Can’t read MDB from second disk.
DDBS: Using first disk for boot – but inaccessible.

FLARE image (0x00400007) located at sector LBA 0x0002284C

Disk Set: 0
ErrorCode: 0x0000018D
ErrorDesc:
Device: BOOT PATH
FRU: STORAGE PROCESSOR
Description: Dual-Mode Fibre Driver Exchange Error!
DualMode Driver Exchange Status: 0x1000000C
Target ID: 0x00
EndError:
ErrorTime: 01/19/2010 05:07:11

(the full boot log can be found here)

So I tried booting to the utility partition via the DDBS submenu. You would be familiar with this submenu if you’ve ever performed an out-of-family array conversion; it’s where you go to run the conversion image over the new array and tell FLARE that its brain is, er, bigger. In any case, this can be accessed by pressing ESC during the initial POST and then typing in “DB_key“. Note that on newer CX3 and CX4 arrays, you don’t press ESC anymore, but rather CTRL-C is used to break the boot. You’ll then be presented with menus that look something like this:

Copyright (c) EMC Corporation , 2007
Disk Array Subsystem Controller
Model: CX700: SAN GBFCC4
DiagName: Extended POST
DiagRev: Rev. 02.39
Build Date: Fri Jul 13 16:36:03 2007
StartTime: 01/19/2010 05:38:05
SaSerialNo: LKE00051202843

AabcdefgBCDEabFabcdGHabIabcJabKabLab

EndTime: 01/19/2010 05:38:20
…. Storage System Failure – Contact your Service Representative …

******
******  Aborting!!!!  ******

 
Hit ESC to begin running diagnostic menu…

                Diagnostic Menu
1)  Reset Controller          3)  DDBS Service Sub-Menu
2)  Display Warnings/Errors   4)  FCC Boot Sub-Menu

Enter Option : 3

So I select option 3, and then attempt the Utility Partition Boot and get the following:

1)  Drive Slot ID Check       2)  Utility Partition Boot

                    0) Exit

Enter Option : 2

DDBS: K10_REBOOT_DATA: Count = 0
DDBS: K10_REBOOT_DATA: State = 0
DDBS: K10_REBOOT_DATA: ForceDegradedMode = 0

DDBS: Read default MDDE off disk 1
DDBS: MDDE (Rev 2) on disk 1
DDBS: Read default DDE (0x40000F) off disk 1
DDBS: Read default MDDE off disk 3
DDBS: MDDE (Rev 2) on disk 3
DDBS: Read default DDE (0x400010) off disk 3

DDBS: Can’t read MDB from first disk.
DDBS: Can’t read MDB from second disk.
DDBS: Using first disk for boot – but inaccessible.

Utility Partition image (0x0040000F) located at sector LBA 0x00BE804C

Disk Set: 1
ErrorCode: 0x00000187
ErrorDesc:
Device: DIAG MENU
FRU: STORAGE PROCESSOR
Description: Disk not logged in Error!
Target ID: 0x01
Targets Found: 0xF000FF53
EndError:
ErrorTime: 01/19/2010 05:39:52

(the full boot log can be found here)

Okay, so that’s not cool. I had hoped that I would be able to boot from the Utility Partition, because the process to load the Recovery Image either from the repository or via ftp is fairly simple. At this point we started to think of a number of whacky alternatives that could be used, including, but not limited to, reconstructing the FLARE disks from another CX700’s hot spares, using the Vault disks from a CX300 and performing an in-place conversion to a CX700, and begging and pleading with our local EMC office for a Vault pack. None of these options really struck us as awesome ideas. Read Part 2 for the solution to the problem.