Hyper-Veeam

Disclaimer: I recently attended VeeamON Forum Sydney 2018. My flights and accommodation were paid for by Veeam. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event. Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

I recently had the opportunity to attend VeeamON Forum in Sydney courtesy of Veeam. I was lucky enough to see Dave Russell’s keynote speech, and also fortunate to spend some time chatting with him in the afternoon. Dave was great to talk to and I thought I’d share some of the key points here.

 

Hyper All of the Things

If you scroll down Veeam’s website you’ll see mention of a number of different “hyper” things, including hyper-availability. Veeam are keen to position themselves as an availability company, with their core focus being on making data you need recoverable, at the time when you need it to be recoverable.

Hyper-critical

Russell mentioned that data has become “hyper-critical” to business, with the likes of:

  • GDPR compliance;
  • PII data retention;
  • PCI compliance requirements;
  • Customer data; and
  • Financial records, etc.

Hyper-growth

Russell also spoke about the hyper-growth of data, with all kinds of data (including structured, unstructured, application, and Internet of Things data) growing at a rapid clip.

Hyper-sprawl

This explosive growth of data has also led to the “hyper-sprawl” of data, with your data now potentially living in any or all of the following locations:

  • SaaS-based solutions
  • Private cloud
  • Public cloud

 

Five Stages of Intelligent Data Management

Russell broke down Intelligent Data Management (IDM) into 5 stages.

Backup

A key part of any data management strategy is the ability to back up all workloads and ensure they are always recoverable in the event of outages, attack, loss or theft.

Aggregation

The ability to cope with data sprawl, as well as growth, means you need to ensure protection and access to data across multiple clouds to drive digital services and ensure continuous business operations.

Visibility

It’s not just about protecting vast chunks of data in multiple places though. You also need to look at the requirement to “improve management of data across multi-clouds with clear, unified visibility and control into usage, performance issues and operations”.

Orchestration

Orchestration, ideally, can then be used to “[s]eamlessly move data to the best location across multi-clouds to ensure business continuity, compliance, security and optimal use of resources for business operations”.

Automation

The final piece of the puzzle is automation. According to Veeam, you can get to a point where the “[d]ata becomes self-managing by learning to backup, migrate to ideal locations based on business needs, secure itself during anomalous activity and recover instantaneously”.

 

Thoughts

Data growth is not a new phenomenon by any stretch, and Veeam obviously aren’t the first to notice that protecting all this stuff can be hard. Sprawl is also becoming a real problem in all types of environments. It’s not just about knowing you have some unstructured data that can impact workflows in a key application. It’s about knowing which cloud platform that data might reside in. If you don’t know where it is, it makes it a lot harder to protect, and your risk profile increases as a result. It’s not just the vendors banging on about data growth through IoT either; it’s a very real phenomenon that is creating all kinds of headaches for CxOs and their operations teams. Much like the push into public cloud by “shadow IT” teams, IoT solutions are popping up in all kinds of unexpected places in the enterprise and making it harder to understand exactly where the important data is being kept and how it’s protected.

Veeam are talking a very good game around intelligent data management. I remember a similar approach being adopted by a three-letter storage company about a decade ago. They lost their way a little under the weight of acquisitions, but the foundation principles seem to still hold water today. Dave Russell obviously saw quite a bit at Gartner in his time there prior to Veeam, so it’s no real surprise that he’s pushing them in this direction.

Backup is just the beginning of the data management problem. There’s a lot else that needs to be done in order to get to the “intelligent” part of the equation. My opinion remains that a lot of enterprises are still some way from being there. I also really like Veeam’s focus on moving from a policy-based through to a behaviour-based approach to data management.

I’ve been aware of Veeam for a number of years now, and have enjoyed watching them grow as a company. They’re working hard to make their way in the enterprise now, but still have a lot to offer the smaller environments. They tell me they’re committed to remaining a software-only solution, which gives them a certain amount of flexibility in terms of where they focus their R&D efforts. There’s a great cloud story there, and the bread and butter capabilities continue to evolve. I’m looking forward to seeing what they have coming over the next 12 months. It’s a relatively crowded market now, and it’s only going to get more competitive. I’ll be doing a few more articles in the next month or two focusing on some of Veeam’s key products, so stay tuned.

Apple – I know too much about iPad recovery after iOS 8

So I now know too much about how to recover old files from iPad backups. I know this isn’t exactly my bread and butter, but I found the process fascinating, and thought it was worth documenting the process here. It all started when I upgraded my wife’s iPad 2 to iOS 8. Bad idea. Basically, it ran like rubbish and was pretty close to unusable. So I rolled it back, using the instructions here. Ok, so that’s cool, but it turns out I can’t restore the data from a backup because that was made with iOS 8 and wasn’t compatible with iOS 7.1.2. Okay, fine, it was probably time to clear out some apps, and all of the photos were saved on the desktop, so no big deal. Fast forward a few days, and we realise that all of her notes were on that device. Now for the fun bit. Note that I’m using a Mac. No idea what you need to do on a Windows machine, but I imagine it’s not too dissimilar.

Step 1. Recover the iPad backup from before the iOS upgrade using Time Machine. Note that you’ll need to be able to see hidden files in Finder, as the backup is stored under HOME/Library/Application Support/MobileSync/Backup and Time Machine uses Finder’s settings for file visibility. I used these instructions. Basically, fire up a terminal and type:

$ defaults write com.apple.finder AppleShowAllFiles TRUE
$ killall Finder

You’ll then see the files you need with Time Machine. When you’re finished, type:

$ defaults write com.apple.finder AppleShowAllFiles FALSE
$ killall Finder

Step 2. Now you can browse to HOME/Library/Application Support/MobileSync/Backup and recover your backup files. If you have more than one iDevice backed up, you might need to dig a bit through the files to recover the correct files. I used these instructions to locate the correct backup files. You’ll want to look for a file called “Info.plist”. In that file, you’ll see something like

<key>Device Name</key>
<string>My iPhone</string>

And from there you can restore the correct files. It will look something like this when recovered:

screen1
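If you have several backup folders and would rather not open each Info.plist by hand, a short Python sketch along these lines could list the device name for each one. This is only an illustration: the function name list_backup_devices is made up for this example, and it assumes each backup folder sits directly under the backup directory with its own Info.plist containing the "Device Name" key shown above.

```python
import plistlib
from pathlib import Path


def list_backup_devices(backup_root):
    """Map each backup folder name to the Device Name in its Info.plist.

    Hypothetical helper: assumes one Info.plist per backup folder,
    directly under backup_root.
    """
    devices = {}
    for info in sorted(Path(backup_root).glob("*/Info.plist")):
        with info.open("rb") as handle:
            data = plistlib.load(handle)
        # Fall back to a placeholder if the key is missing.
        devices[info.parent.name] = data.get("Device Name", "(unknown)")
    return devices


# Example call (path as described in Step 2, relative to your home folder):
# list_backup_devices(
#     Path.home() / "Library/Application Support/MobileSync/Backup")
```

Pointing it at the folder you recovered from Time Machine, rather than the live backup location, should tell you which folder belongs to which device before you copy anything.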

Step 3. Now you’ll want to go to the normal location of your iPad backups and rename your current backup to something else. Then copy the files that you recovered from Time Machine to this location.

screen2

Step 4. At this point, I followed these quite excellent instructions from Chris Taylor and used the pretty neat iPhone Backup Extractor to extract the files I needed. Once you’ve extracted the files, you’ll have something like this. Note the path of the files is iOS Files/Library/Notes.

screen3

Step 5. At this point, fire up MesaSQLite and open up the “notes.sqlite” file as per the instructions on Chris’s post. Fantastic, I’ve got access to the text from the notes. Except they have a bunch of html tags in them and are generally unformatted. Well, I’m pretty lazy, so I used the tool at Web 2.0 Generators to decode the html to formatted text for insertion into Notes.app files. And that’s it.
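If you’d prefer not to paste notes into a web tool, the tag stripping can probably be done locally. The following is a minimal Python sketch using only the standard library. The function name html_to_text and the choice of which tags start a new line are my assumptions, not anything taken from the notes database itself, and you still need to get the raw note text out of notes.sqlite by whatever means you like.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    # Tags assumed to start a new line in a note body.
    BLOCK_TAGS = {"div", "p", "br", "li"}

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into text.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(markup):
    """Return the text of an HTML fragment, one line per block element."""
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()
    lines = [line.strip() for line in "".join(parser.parts).splitlines()]
    # Drop the empty lines left behind by nested or leading tags.
    return "\n".join(line for line in lines if line)
```

It won’t preserve any formatting beyond line breaks, which was all I needed for plain notes.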

Conclusion. As it happens, I’ve now set this iPad up with iCloud synchronisation. *Theoretically* I won’t need to do this again. Nor should I have had to do it in the first place. But I’ve never come across an update that was quite so ugly on particular iDevices. Thanks to Apple for the learning opportunity.

QNAP – How to repair RAID brokenness

I use a QNAP 639 Pro NAS at home to store my movies on. It’s a good unit and overall I’ve found it to be relatively trouble-free. I was recently upgrading the disks in the RAID set from 1TB to 2TB drives and it was going swimmingly. But then I heard a beep while the RAID was rebuilding disk 5 of 6 in the set. And I started to get some concerned e-mails from the NAS.

Server Name: qnap639
 IP Address: 256.256.256.256
 Date/Time: 2011/06/09 16:27:33
 Level:  Error
 [RAID5 Disk Volume: Drive 1 2 3 4 5 6] Error occurred while accessing Drive 3.

Server Name: qnap639
 IP Address: 256.256.256.256
 Date/Time: 2011/06/09 16:27:40
 Level:  Error
 [RAID5 Disk Volume: Drive 1 2 3 4 5 6] Error occurred while accessing the devices of the volume in degraded mode.

Server Name: qnap639
 IP Address: 256.256.256.256
 Date/Time: 2011/06/09 16:29:32
 Level:  Warning
 [RAID5 Disk Volume: Drive 1 2 3 4 5 6] Mount the file system read-only.

Server Name: qnap639
 IP Address: 256.256.256.256
 Date/Time: 2011/06/09 16:31:41
 Level:  Warning
 [RAID5 Disk Volume: Drive 1 2 3 4 5 6] Rebuilding skipped.

Basically, it looks like the NAS thought one of the disks had popped. You can see this thing all over the QNAP forums – here’s a good example – and it’s usually because of incompatibility between the QNAP firmware and various green hard disks. But I’d checked that my disks were on the official QNAP HCL, and, well, that couldn’t be it. So I rebooted a bunch of times and ran S.M.A.R.T. scans on the allegedly failed disk. I pulled it out and erased it on an XP box and put it back in. The NAS wanted no part of it though. So it was time to get dirty with mdadm.

Firstly, make sure there’s nothing going on on the NAS: stop the running services and unmount the RAID device.

/etc/init.d/services.sh stop
umount /dev/md0

Once the volume’s unmounted you can stop the volume.

mdadm -S /dev/md0

Now for the bit where you hold your breath for a while – the reassembly of the volume with the components you want.

mdadm --assemble /dev/md0 /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3 /dev/sde3 /dev/sdf3

To see the progress, you can use a couple of different commands.

mdadm --detail /dev/md0
cat /proc/mdstat
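If you want to keep an eye on the rebuild without squinting at the raw output, the text from /proc/mdstat can be parsed with a few regular expressions. The Python sketch below is an illustration only: parse_mdstat is a hypothetical helper, and SAMPLE_MDSTAT is made-up text in the usual /proc/mdstat layout, not output captured from my NAS.

```python
import re

# Illustrative sample only; the numbers are invented.
SAMPLE_MDSTAT = """\
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdf3[5] sde3[4] sdd3[3] sdc3[2] sdb3[1] sda3[0]
      9759728000 blocks level 5, 64k chunk, algorithm 2 [6/5] [UUUUU_]
      [==>..................]  recovery = 12.6% (246212352/1951945600) finish=127.5min speed=222916K/sec

unused devices: <none>
"""

ARRAY_RE = re.compile(r"^(md\d+)\s*:\s*(\S+)")
MEMBERS_RE = re.compile(r"\[(\d+)/(\d+)\]\s*\[([U_]+)\]")
PROGRESS_RE = re.compile(r"(recovery|resync|reshape)\s*=\s*([\d.]+)%")


def parse_mdstat(text):
    """Return {array: {state, expected, active, progress}} from mdstat text."""
    arrays = {}
    current = None
    for line in text.splitlines():
        match = ARRAY_RE.match(line)
        if match:
            current = {"state": match.group(2), "expected": None,
                       "active": None, "progress": None}
            arrays[match.group(1)] = current
            continue
        if current is None:
            continue
        match = MEMBERS_RE.search(line)
        if match:
            # [6/5] means 6 devices expected, 5 currently active.
            current["expected"] = int(match.group(1))
            current["active"] = int(match.group(2))
        match = PROGRESS_RE.search(line)
        if match:
            current["progress"] = (match.group(1), float(match.group(2)))
    return arrays
```

On a live system you would feed it the contents of /proc/mdstat instead of the sample string; an array where "active" is lower than "expected" is still degraded.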

Once that’s complete, it’s best to run a filesystem check.

e2fsck -f /dev/md0

If there’re no errors, mount the volume and check that your stuff is still there.

mount /dev/md0 /share/MD0_DATA

I then rebooted and confirmed that everything started up correctly and my data was still there. But when I added the 6th drive, I got an error about a missing superblock and there didn’t seem to be any mdadm magic that would solve this problem. So like a good admin I rebooted, and the NAS started rebuilding the volume with the 6th disk. Now if I can only fix the problem where smbd kills the CPU and disconnects guests from the share we’ll be gold.

CLARiiON CX700 FLARE Recovery – Part 3

In the previous series of posts, I covered the despair and ultimate joy of getting to a point where I could recover a munted CLARiiON CX700. In this post, I’ll cover the process to recover the array to a working state, and the steps required to get the array functioning at a useful level.

Having successfully performed a Utility Partition Boot, it’s necessary to get the LAN service ports on the array configured in order to be able to ftp the recovery image to the array. Obviously, you’ll need the array and your service laptop plugged into a network-type thing that will enable frank communication between the array and you.

===============================================================================
CLARiiON Utility Toolkit Main Menu
===============================================================================
1) About the Utility Toolkit
2) About this Array
3) Reset Storage Processor
4) Image Repository Sub-Menu
5) Plugin Sub-Menu
6) NVRAM Sub-Menu
7) Enable LAN Service Port
8) Enable Engineering Mode
9) Install Images

Enter Option: 7
===============================================================================
Please enter the network settings you wish to use for this SP
===============================================================================
IP Address:  192.168.0.2
Subnet Mask:  255.255.255.0
Default Gateway:  192.168.0.255
Host Name:  spa
Domain Name:  

===============================================================================
Confirm Network Settings
===============================================================================
IP Address:      192.168.0.2  
Subnet Mask:     255.255.255.0
Default Gateway: 192.168.0.255
Host Name:       spa          
Domain Name:                  

Enable LAN Service Port with these settings? y/n [y] 
The LAN Service Port has been enabled

Automatically enable the LAN Port with these settings in the future? y/n [y] n

Press the Enter key to continue… 

Once you’ve enabled the LAN port on the SP you’re connected to, you need to ftp the image to the SP’s repository. The username to use is Clariion, and the password is clariion!. Once you’ve logged in, run a put command to put the file up there. It doesn’t really matter what you call it, but it should be a file of type mif. Here’s a pointless text capture of the ftp login process:

C:\>ftp 192.168.0.2
Connected to 192.168.0.2.
220-FileZilla Server version 0.8.3 beta test release 1
220-written by Tim Kosse ([email protected])
220 Please visit http://sourceforge.net/projects/filezilla/
User (192.168.0.2:(none)): Clariion
331 Password required for clariion
Password:
230 Logged on
ftp> ls
200 Port command successful
150 Opening data channel for directory list.
FLARE.mif
226 Transfer OK
ftp: 11 bytes received in 0.00Seconds 11000.00Kbytes/sec.
ftp>
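If you’d rather script the upload than type it into an ftp prompt, Python’s standard ftplib should be able to do the same put. This is a rough sketch, not something I ran against the array: the function names are invented for this example, and the address and credentials in the comment are simply the ones used above.

```python
from ftplib import FTP


def upload_image(ftp, local_path, remote_name="FLARE.mif"):
    """Send local_path in binary mode and return the remote file listing."""
    with open(local_path, "rb") as image:
        ftp.storbinary(f"STOR {remote_name}", image)
    return ftp.nlst()


def upload_to_sp(host, local_path, user, password):
    """Log in to the SP's ftp service and upload the recovery image."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        return upload_image(ftp, local_path)


# Example call, using the settings from this post:
# upload_to_sp("192.168.0.2", "FLARE.mif", "Clariion", "clariion!")
```

The returned listing should include the uploaded file name, much like the ls in the capture above.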

Once you’ve successfully uploaded the recovery image, you’ll be good to go. It’s also important to note that the FLARE recovery image should be for the release that you intend to run. I didn’t consider uploading a Release 19 image, as I knew that these arrays had run Release 26 previously. In any case, jumping back into the Image menu on the terminal, it’s now time to copy the image from the RAM disk and then load it.

===============================================================================
CLARiiON Utility Toolkit Main Menu
===============================================================================
1) About the Utility Toolkit
2) About this Array
3) Reset Storage Processor
4) Image Repository Sub-Menu
5) Plugin Sub-Menu
6) NVRAM Sub-Menu
7) View LAN Service Port Settings
8) Enable Engineering Mode
9) Install Images

Enter Option: 4

===============================================================================
CLARiiON Utility Toolkit Image Repository Menu
===============================================================================
1) Back to the Main Menu
2) List Image Repository Contents
3) Delete Files from the Image Repository
4) Copy Files from the RAM Disk to the Image Repository
5) Copy Files from the Image Repository to the RAM Disk

Enter Option: 4

===============================================================================
Select files to copy to the Image Repository
===============================================================================
1) FLARE.mif

Enter comma separated list of options: 1

Copying FLARE.mif to the Image Repository… Success

Press the Enter key to continue…
 
===============================================================================
CLARiiON Utility Toolkit Image Repository Menu
===============================================================================
1) Back to the Main Menu
2) List Image Repository Contents
3) Delete Files from the Image Repository
4) Copy Files from the RAM Disk to the Image Repository
5) Copy Files from the Image Repository to the RAM Disk

Enter Option: 1

===============================================================================
CLARiiON Utility Toolkit Main Menu
===============================================================================
1) About the Utility Toolkit
2) About this Array
3) Reset Storage Processor
4) Image Repository Sub-Menu
5) Plugin Sub-Menu
6) NVRAM Sub-Menu
7) View LAN Service Port Settings
8) Enable Engineering Mode
9) Install Images

Enter Option: 9

===============================================================================
Select Images to Install
===============================================================================
1) FLARE.mif

Enter comma separated list of options: 1

===============================================================================
Confirm Image Installation
===============================================================================

  FLARE.mif

You need to install this only on the SP that you have visibility of, as troubleshooting the installation to both SPs is tricky.

Are you sure you want to install these images? y/n [n] y

===============================================================================
Select Storage Processors to install images for
===============================================================================
1) This SP (SP A)
2) Peer SP (SP B)
3) Both SP’s

Enter Option: 1
 
Installing Data Directory Boot Service 02.12
0%..10%..20%..30%..40%..50%..60%..70%..80%..90%..100%
|—-|—-|—-|—-|—-|—-|—-|—-|—-|—-|  
***************************************************
The COPY operation has completed successfully.

Installing BIOS 03.70
0%..10%..20%..30%..40%..50%..60%..70%..80%..90%..100%
|—-|—-|—-|—-|—-|—-|—-|—-|—-|—-|  
***************************************************
The COPY operation has completed successfully.

Installing Extended POST 02.38
0%..10%..20%..30%..40%..50%..60%..70%..80%..90%..100%
|—-|—-|—-|—-|—-|—-|—-|—-|—-|—-|  
***************************************************
The COPY operation has completed successfully.

Installing FLARE Image 02.26.700.5.005
0%..10%..20%..30%..40%..50%..60%..70%..80%..90%..100%
|—-|—-|—-|—-|—-|—-|—-|—-|—-|—-|  
***************************************************
The COPY operation has completed successfully.

Press the Enter key to continue… 

Once the copy has completed successfully, the system needs to be reset, and you’ll see the SP reboot up to three times before it’s useable.

===============================================================================
CLARiiON Utility Toolkit Main Menu
===============================================================================
1) About the Utility Toolkit
2) About this Array
3) Reset Storage Processor
4) Image Repository Sub-Menu
5) Plugin Sub-Menu
6) NVRAM Sub-Menu
7) View LAN Service Port Settings
8) Enable Engineering Mode
9) Install Images

Enter Option: 3

Requesting System Reset         

Once this is complete, you can either load the recovery image to the other SP via Navisphere in Engineering Mode, or you can use the same method as described above. Note that, once the image is copied to the repository, it is not necessary to re-upload it, as both SPs have access to the files.

From here, follow the normal process for initialising an array. In my case I initialised security, set up IP addresses for the SPs, logged in, committed the FLARE (R26.005), enabled Access Logix, configured cache settings, upgraded FLARE to the latest (R26.028), reloaded the latest Utility Partition and Recovery Images, and went about loading the appropriate enablers for the array. And now we have a working lab :)

CLARiiON CX700 FLARE Recovery – Part 2

In Part 1 of this series I talked about what looked like a borked CX700 and some of the options available to me to get it up and running again. I trawled Powerlink looking for solutions and came across a number of articles that talked about ordering Vault packs and getting EMC CEs involved. As the arrays were no longer under maintenance, I didn’t have high hopes that this would be a process we could undertake.

My trawling for solutions, however, did yield a rather interesting nugget of information. For those of you with access to Powerlink, there’s an article entitled “CX700 array unmanaged and fails to display its serial number after changing WWN seed array”. This article also goes by the ID emc119598 and discusses the process to rectify the array’s WWN seed after a conversion from a CX500 to a CX700. The great thing about this article was not so much the solution provided as the alternative method described to access the CLARiiON’s Diagnostics Menu. To wit, using the password “SHIP_it” yields a menu subsystem that is dramatically different from the one provided with “DB_key”. The results are below; the full transcript can be downloaded here:

Copyright (c) EMC Corporation , 2007
Disk Array Subsystem Controller
Model: CX700: SAN GBFCC4
DiagName: Extended POST
DiagRev: Rev. 02.39
Build Date: Fri Jul 13 16:36:03 2007
StartTime: 01/21/2010 22:04:18
SaSerialNo: LKE00051202843

AabcdefgBC

EndTime: 01/21/2010 22:04:19
…. Storage System Failure – Contact your Service Representative …

*******
******  Aborting!!!!  ******

 
Hit ESC to begin running diagnostic menu…

Entering the alternative password, we see the following output:

Diagnostic Menu
1)  Reset Controller          21) BE1 FCC Sub-Menu
2)  Enter Debugger            22) CMI0 FCC Sub-Menu
3)  Display Warnings/Errors   23) CMI1 FCC Sub-Menu
4)  Boot OS                   24) AUX0 FCC Sub-Menu
5)  POST Sub-Menu             25) AUX1 FCC Sub-Menu
6)  Display/Change Privilege  26) FE0 FCC Sub-Menu
7)  Boot UProc Sub-Menu       27) FE1 FCC Sub-Menu
8)  Ap UProc Sub-Menu         28) FE2 FCC Sub-Menu
9)  Real Time Clock Sub-Menu  29) FE3 FCC Sub-Menu
10) Pers. Module Sub-Menu     30) POST ROM Sub-Menu
11) RAM Sub-Menu              31) BIOS ROM Sub-Menu
12) NOVRAM Sub-Menu           32) System Test Sub-Menu
13) Console UART Sub-Menu     33) Image Sub-Menu
14) SPS UART Sub-Menu         34) Disk Sub-Menu
15) LCC 0 UART Sub-Menu       35) Resume PROM Sub-Menu
16) LCC 1 UART Sub-Menu       36) Voltage Margin Sub-Menu
17) LCC 2 UART Sub-Menu       37) Information Display
18) LCC 3 UART Sub-Menu       38) ICA Sub-Menu
19) LAN Service Port Sub-Menu 39) DDBS Service Sub-Menu
20) BE0 FCC Sub-Menu          40) FCC Boot Sub-Menu

Enter Option : 33

Option 33 is what we’re interested in to start with. From here you can perform a Utility Partition Boot.

Image Sub-Menu
1)  Init Loop                 6)  Exit Loop
2)  Serial Download           7)  Relocate/Run Image
3)  Load from disk            8)  Display Sector Protection
4)  Save to disk              9)  Utility Partition Boot
5)  Update Firmware

                    0) Exit

Enter Option :  9
Relocating Data Directory Boot Service (DDBS: Rev. 02.12)…
DDBS: K10_REBOOT_DATA: Count = 0
DDBS: K10_REBOOT_DATA: State = 0
DDBS: K10_REBOOT_DATA: ForceDegradedMode = 0

DDBS: Read default MDDE off disk 1
DDBS: MDDE (Rev 2) on disk 1
DDBS: Read default DDE (0x40000F) off disk 1
DDBS: Read default MDDE off disk 3
DDBS: MDDE (Rev 2) on disk 3
DDBS: Read default DDE (0x400010) off disk 3

DDBS: MDB read from both disks.
DDBS: DD invalid on both disks, continuing…
DDBS: Disk WWN seeds match each other but not chassis WWN seed.
DDBS: First disk is valid for boot.
DDBS: Second disk is valid for boot.

[snip]

int13 – GET DRIVE PARAMETERS (Extended) (1437)
ICA::UtilityFrontEnd
(c) EMC Corporation 2001-2004 All Rights Reserved
DiagName: ICA::UtilityFrontEnd
DiagRev: 02.16.700.5.001
StartTime: 01/21/10 22:08:37

OS Type……………………….WinXP
SMBUS…………………………Running
SPID………………………….Running
ASIDC…………………………Running
ASIRAMDisk…………………….Running
ICA…………………………..Running
FileZilla Server……………….Running
Connecting to ICA………………Success
SP Type……………………….CX700
SP ID…………………………A
SP Signature…………………..0x08291953
Checking Image Repository……….
    ICA::IRFS no valid Volume was found on this system
    ICA::IRFS Creating new Volume
    ICA::IRFS Finished creating new volume
    ICA::IRFS Checking Volume for consistency
Sizing Image Repository…………1024 MB
Sizing RAM Disk………………..2039 MB
Discovering Management LAN Port….ManagementPort0
Checking LAN Port State…………Not Configured
Checking LAN Port Config………..Not Found
Loading Plugins………………..Done

EndTime: 01/21/10 22:09:03

Now that’s what I wanted to see :) – from here we just need to reload the FLARE image with ftp. I’ll cover this in Part 3.

CLARiiON CX700 FLARE Recovery – Part 1

I’ve broken this post into three parts not because I’m trying to be a tease, but rather so it’s a little easier to digest and you can head straight for the money shot if you so desire. Part 1 covers the initial failure and subsequent troubleshooting. Part 2 covers the eventual workaround to the problem. Part 3 covers the work that needed to be done once the problem was resolved.

It’s strange to think of EMC’s CX700 CLARiiON array as a “legacy” array. Yet it’s now two generations behind EMC’s flagship mid-range array – the CX4-960. Our project was given access to two CX700s to use as test arrays for a multi-site data centre project we’re working on. That’s cool, as the CX700 is still a reasonably well-specced array, with multiple back-end loops and a fair bit of useable cache (at least compared to the CX4-120). So after the data centre guys MacGyvered the kit into racks that were too big for the rails, we cabled the lot up and thought it would be a fairly trivial process to get everything up and running.

As usual, I was wrong. The department that had provided these hand-me-down arrays had bought a service from EMC whereby the data was securely erased. For those of you playing at home, this is known as the “Certified Data Erasure Service”, and you can grab the datasheet from here. So basically, these arrays were saved from the scrapheap, but not before they were rendered basically unbootable.

When we powered them up, we got the following output via the terminal:

Disk 2 Read Error 0x00000187
Number Sectors: 1
LBA: 0x0002284B
Buffer: 0x1000A114

DDBS: Can’t read MDB from first disk.
DDBS: Can’t read MDB from second disk.
DDBS: Using first disk for boot – but inaccessible.

FLARE image (0x00400007) located at sector LBA 0x0002284C

Disk Set: 0
ErrorCode: 0x0000018D
ErrorDesc:
Device: BOOT PATH
FRU: STORAGE PROCESSOR
Description: Dual-Mode Fibre Driver Exchange Error!
DualMode Driver Exchange Status: 0x1000000C
Target ID: 0x00
EndError:
ErrorTime: 01/19/2010 05:07:11

(the full boot log can be found here)
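If you’ve saved a few of these boot logs and want to compare the failures side by side, the error block is regular enough to pull apart with a few lines of Python. The sketch below is an illustration under my own assumptions: parse_error_records is a hypothetical helper, and it treats everything from “Disk Set:” through “ErrorTime:” as one record, which matches the output shown here but isn’t taken from any EMC documentation.

```python
# Excerpt of the terminal output shown above.
SAMPLE_LOG = """\
FLARE image (0x00400007) located at sector LBA 0x0002284C

Disk Set: 0
ErrorCode: 0x0000018D
ErrorDesc:
Device: BOOT PATH
FRU: STORAGE PROCESSOR
Description: Dual-Mode Fibre Driver Exchange Error!
DualMode Driver Exchange Status: 0x1000000C
Target ID: 0x00
EndError:
ErrorTime: 01/19/2010 05:07:11
"""


def parse_error_records(log_text):
    """Return one dict per error block found in a saved boot log."""
    records = []
    current = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if line.startswith("Disk Set:"):
            # Assumed start of a new error record.
            current = {}
            records.append(current)
        if current is None or ":" not in line:
            continue
        # Split on the first colon only, so timestamps stay intact.
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
        if key.strip() == "ErrorTime":
            # Assumed end of the record.
            current = None
    return records
```

Each record keeps the fields as strings, so something like parse_error_records(text)[0]["ErrorCode"] gives you the code to search for.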

So I tried booting to the utility partition via the DDBS submenu. You would be familiar with this submenu if you’ve ever performed an out-of-family array conversion; it’s where you go to run the conversion image over the new array and tell FLARE that its brain is, er, bigger. In any case, this can be accessed by pressing ESC during the initial POST and then typing in “DB_key”. Note that on newer CX3 and CX4 arrays, you don’t press ESC anymore, but rather CTRL-C is used to break the boot. You’ll then be presented with menus that look something like this:

Copyright (c) EMC Corporation , 2007
Disk Array Subsystem Controller
Model: CX700: SAN GBFCC4
DiagName: Extended POST
DiagRev: Rev. 02.39
Build Date: Fri Jul 13 16:36:03 2007
StartTime: 01/19/2010 05:38:05
SaSerialNo: LKE00051202843

AabcdefgBCDEabFabcdGHabIabcJabKabLab

EndTime: 01/19/2010 05:38:20
…. Storage System Failure – Contact your Service Representative …

******
******  Aborting!!!!  ******

 
Hit ESC to begin running diagnostic menu…

                Diagnostic Menu
1)  Reset Controller          3)  DDBS Service Sub-Menu
2)  Display Warnings/Errors   4)  FCC Boot Sub-Menu

Enter Option : 3

So I select option 3, and then attempt the Utility Partition Boot and get the following:

1)  Drive Slot ID Check       2)  Utility Partition Boot

                    0) Exit

Enter Option : 2

DDBS: K10_REBOOT_DATA: Count = 0
DDBS: K10_REBOOT_DATA: State = 0
DDBS: K10_REBOOT_DATA: ForceDegradedMode = 0

DDBS: Read default MDDE off disk 1
DDBS: MDDE (Rev 2) on disk 1
DDBS: Read default DDE (0x40000F) off disk 1
DDBS: Read default MDDE off disk 3
DDBS: MDDE (Rev 2) on disk 3
DDBS: Read default DDE (0x400010) off disk 3

DDBS: Can’t read MDB from first disk.
DDBS: Can’t read MDB from second disk.
DDBS: Using first disk for boot – but inaccessible.

Utility Partition image (0x0040000F) located at sector LBA 0x00BE804C

Disk Set: 1
ErrorCode: 0x00000187
ErrorDesc:
Device: DIAG MENU
FRU: STORAGE PROCESSOR
Description: Disk not logged in Error!
Target ID: 0x01
Targets Found: 0xF000FF53
EndError:
ErrorTime: 01/19/2010 05:39:52

(the full boot log can be found here)

Okay, so that’s not cool. I had hoped that I would be able to boot from the Utility Partition, because the process to load the Recovery Image either from the repository or via ftp is fairly simple. At this point we started to think of a number of whacky alternatives that could be used, including, but not limited to, reconstructing the FLARE disks from another CX700’s hot spares, using the Vault disks from a CX300 and performing an in-place conversion to a CX700, and begging and pleading with our local EMC office for a Vault pack. None of these options really struck us as awesome ideas. Read Part 2 for the solution to the problem.