OpenMediaVault – Annoying mdadm e-mails after a rebuild

My homebrew NAS running OpenMediaVault (based on Debian) started writing to me recently. I’d had a disk failure and replaced the disk in the RAID set with another one. Everything rebuilt properly, but then this mdadm chap started sending me messages daily.

"This is an automatically generated mail message from mdadm
 running on openmediavault
A SparesMissing event had been detected on md device /dev/md0.
Faithfully yours, etc.
P.S. The /proc/mdstat file currently contains the following:
Personalities : [raid6] [raid5] [raid4] 
 md0 : active raid6 sdi[0] sda[8] sdb[6] sdc[5] sdd[4] sde[3] sdf[2] sdh[1]
 11720297472 blocks super 1.2 level 6, 512k chunk, algorithm 2 [8/8] [UUUUUUUU]
unused devices: <none>"

Which was nice of it to get in touch. But I’d never had spares configured on this md device. The fix is simple, and is outlined here and here. In short, you’ll want to edit /etc/mdadm/mdadm.conf and change spares=1 to spares=0. This assumes you don’t want spares configured and are relying on parity for resilience. If you do want spares configured, then it’s probably best you look into the problem a little more.
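
For reference, here’s roughly what that edit looks like. The ARRAY line below is illustrative (the name and UUID will obviously be different on your box), and you can sanity-check what your array actually reports with mdadm --detail --scan:

# /etc/mdadm/mdadm.conf - before
ARRAY /dev/md0 metadata=1.2 spares=1 name=openmediavault:0 UUID=...
# after - no hot spares expected, so mdadm stops whinging about them
ARRAY /dev/md0 metadata=1.2 spares=0 name=openmediavault:0 UUID=...

Once that’s done, give the mdadm monitor a kick (something like service mdadm restart on a Debian-based install, or just wait for the next reboot) so it re-reads the file.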

OpenMediaVault – A few notes

Following on from my brief look at FreeNAS here, I thought I’d do a quick article on OpenMediaVault as well. While it isn’t quite as mature as FreeNAS, it is based on Debian. I’ve had a soft spot for Debian ever since I was able to get it running on a DECpc AXP 150 I had lying about many moons ago. The Jensen is no longer with us, but the fond memories remain. Anyway …

Firstly, you can download OpenMediaVault here. It’s recommended that you install it on a hard drive (ideally in a RAID 1 configuration) rather than on USB or SD cards. Theoretically you could put it on a stick and redirect the more frequently written stuff to a RAM disk if you really didn’t want to give up the SATA ports on your board. I decided to use an SSD I had lying about, as I couldn’t be bothered with more workarounds and “tweaks”. You can follow this guide to set up some semi-automated backup of the configuration.
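
If you can’t be bothered with the guide either, a dumb cron job will at least give you dated copies of the configuration to fall back on. OMV keeps its configuration in /etc/openmediavault/config.xml; the destination path below is just a made-up example of a shared folder, so adjust to suit:

#!/bin/sh
# /etc/cron.daily/omv-config-backup - minimal sketch, not the guide's method
DEST=/media/backups/omv-config          # hypothetical path to a shared folder
mkdir -p "$DEST"
# keep a dated copy of the OMV database, plus a tarball of /etc for good measure
cp /etc/openmediavault/config.xml "$DEST/config.xml.$(date +%Y%m%d)"
tar czf "$DEST/etc-$(date +%Y%m%d).tar.gz" /etc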

Secondly, here’s a list of the hardware I used for this build:

  • Mainboard – ASRock N3700-ITX
  • CPU – Intel Quad-Core Pentium Processor N3700 (on-board)
  • RAM – 2 * Kingston 8GB 1600MHz DDR3 Non-ECC CL11 SODIMM
  • HDDs – 1 * SSD, 8 * Seagate Constellation ES 2TB drives
  • SATA Controller – PCIe x1 4-port SATA III controller (non-RAID), using a Marvell 88SE9215 chipset
  • SATA Controller – IO Crest Mini PCIe 2-port SATA III controller (RAID capable), using a Syba (?) chipset
  • Case – Fractal Design Node 804
  • PSU – Silverstone Strider Essential 400W


You’ll notice the lack of ECC RAM, and the board is limited in SATA ports, hence the requirement for a single-lane, 4-port SATA card. I’m really not the best at choosing the right hardware for the job. The case is nice and roomy, but there’s no hot-swap for the disks. A better choice would have been a workstation-class board with support for ECC RAM, a decent CPU and a bunch of SATA ports in a micro-ATX form-factor. I mean, it works, but it could have been better. I’d like to think it’s because the market is a bit more limited in Australia, but it’s more because I’m not very good at this stuff.

Thirdly, if you do end up with the ASRock board, you’ll need to make a change to your grub configuration so that the board will boot headless. To do this, ssh or console onto the machine and edit /etc/default/grub. Uncomment GRUB_TERMINAL=console (by removing the #). You’ll then need to run update-grub and you should be right to boot the machine without a monitor connected.
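
For reference, the change amounts to this (the file already contains the GRUB_TERMINAL line, you’re just un-commenting it):

# /etc/default/grub
GRUB_TERMINAL=console

# then regenerate the grub config and reboot
update-grub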

Finally, the OMV experience has been pretty good thus far. None of these roll-your-own options are as pretty as their QNAP or Synology brethren from a UX perspective, but they do the job in a functional, if somewhat sparse, fashion. That said, having been a QNAP user for about 7 years now, I remember that it wasn’t always the eye candy that it is nowadays. Also of note, OMV has a pretty reasonable plugin ecosystem you can leverage, with Plex and a bunch of extras being fairly simple to install and configure. I’m looking forward to putting this thing through its paces and posting the performance and usability results.


FreeNAS – A few notes

Mat and I have been talking about FreeNAS a lot recently. My QNAP TS-639 Pro is approaching 7 years old and I’m reluctant to invest further money in drives for it. So we’ve been doing a bit of research on what might be good hardware and so forth. I thought I’d put together a few links that I found useful and share some commentary.

Firstly, FreeNAS has been around for a while now, and there is a plethora of useful information available via the official documentation, forums and blog posts. While digging through the comments on a post I noticed someone saying that the FreeNAS crowd like to patronise people a lot. That might be a little unfair, although they do sometimes come across as a bit dickish, so be prepared. It’s like anything on the internet really.

Secondly, most of the angst comes about through the choices people make for their DIY hardware builds. There’s a lot of talk about ECC RAM and why it’s critical to a decent build. I have a strong dislike of the word “noobs” and variants, but there are some good points made in this thread. Brian Moses has an interesting counter here, which I found insightful as well. So, your mileage might vary. For what it’s worth, I’m not using ECC RAM in my current build, but I am by no means a shining light when it comes to best practice for IT in the home. If I was going to store data on it that I couldn’t afford to reload from another source (I’m using it to stream mkv files around the house) I would look at ECC.

Thirdly, one of the “folk of the forum”, as I’ll now call them, has a handy primer on FreeNAS that you can view in a few different ways here. It hasn’t been updated in a little while, but it covers off a lot of the salient points when looking at doing your own build and getting started with FreeNAS. If you want a few alternative approaches to what may or may not work for you, have a look at Brian’s post here, as well as this one and this one. Also, if you’re still on the fence about FreeNAS, take a look at Brian’s DIY NAS Software Roundup – it’s well written and covers a number of the important points. The key takeaways are as follows:

  • Do your research before you buy stuff;
  • Don’t go cheap on RAM (ECC if you can);
  • Think about the real requirement for ZIL or L2ARC; and
  • Not everyone on the internet is a prick, but sometimes it will seem like that.

Finally, my experience with FreeNAS itself has been pretty good. I admit that I haven’t used FreeBSD or its variants in quite a few years, but the web interface is pretty easy to navigate. I’ve mucked about a bit with the different zpool configurations, and how to configure the ZIL and L2ARC on a different drive (that post is coming shortly). The installation is straightforward, and once I got my head around the concept of jails it was easy to set up Plex and give it a spin too. Performance was good given the hardware I tested on (when the drives weren’t overheating due to the lack of airflow and an Aussie summer). I’m hoping to do the real build this week or next, so I’ll see how it goes then and report back. I might give NexentaStor Community Edition a crack as well. I have a soft spot for them because they gave me some shoes once. In the meantime, if anyone at iXsystems wants to send me a FreeNAS Mini, just let me know.
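
If you’re wondering what the ZIL / L2ARC part looks like under the covers, it boils down to a couple of zpool commands. The pool and device names below are made up, and on FreeNAS you’d normally do this through the GUI’s volume manager rather than at the shell:

# add a separate log device (ZIL / SLOG) and a cache device (L2ARC) to an existing pool
zpool add tank log gpt/slog0
zpool add tank cache gpt/l2arc0
# confirm the pool now shows "logs" and "cache" sections
zpool status tank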

QNAP – Upgrading Firmware via the CLI

For some reason, I keep persisting with my QNAP TS-639 II, despite the fact that every time something goes wrong with it I spend hours trying to revive it. In any case, I recently had an issue with a disk showing SMART warnings. I figured it would be a good idea to replace it before it became a big problem. I had some disks on the shelf from the last upgrade. When I popped one in, however, it sent me this e-mail.

Server Name: qnap639
IP Address: 192.168.0.110
Date/Time: 28/05/2015 06:27:00
Level: Warning
The firmware versions of the system built-in flash (4.1.3 Build 20150408) and the hard drive (4.1.2 Build 20150126) are not consistent. It is recommended to update the firmware again for higher system stability.

Not such a great result. I ignored the warning and manually rebuilt the /dev/md0 device. When I rebooted, however, I still had the warning. And a missing disk from the md0 device (but that’s a story for later). To get around this problem, it is recommended that you reinstall the firmware via the shell. I took my instructions from here. In short, you copy the image file to a share, copy that to an update directory, run a script, and reboot. It fixed that particular warning, but I’m still having issues getting a drive to join the RAID device. I’m currently clearing the array again and will put in a new drive next week. Here’s what it looks like when you upgrade the firmware this way.

[/etc/config] # cd /
[/] # mkdir /mnt/HDA_ROOT/update
mkdir: Cannot create directory `/mnt/HDA_ROOT/update': File exists
[/] # cd /mnt/HDA_ROOT/update
[/mnt/HDA_ROOT/update] # ls
[/mnt/HDA_ROOT/update] # cd /
[/] # cp /share/Public/TS-639_20150408-4.1.3.img /mnt/HDA_ROOT/update/
[/] # ln -sf /mnt/HDA_ROOT/update /mnt/update
[/] # /etc/init.d/update.sh /mnt/HDA_ROOT/update/TS-639_20150408-4.1.3.img 
cksum=238546404
Check RAM space available for FW update: OK.
Using 120-bit encryption - (QNAPNASVERSION4)
len=1048576
model name = TS-639
version = 4.1.3
boot/
bzImage
bzImage.cksum
config/
fw_info
initrd.boot
initrd.boot.cksum
libcrypto.so.1.0.0
libssl.so.1.0.0
qpkg.tar
qpkg.tar.cksum
rootfs2.bz
rootfs2.bz.cksum
rootfs_ext.tgz
rootfs_ext.tgz.cksum
update/
update_img.sh
4.1.3 20150408 
OLD MODEL NAME = TS-639
Allow upgrade
Allow upgrade
/mnt/HDA_ROOT/update
1+0 records in
1+0 records out
tune2fs 1.41.4 (27-Jan-2009)
Setting maximal mount count to -1
Setting interval between checks to 0 seconds
Update image using HDD ...
bzImage cksum ... Pass
initrd.boot cksum ... Pass
rootfs2.bz cksum ... Pass
rootfs_ext.tgz cksum ... Pass
rootfs_ext.tgz cksum ... Pass
qpkg.tar cksum ... Pass
Update RFS1...
mke2fs 1.41.4 (27-Jan-2009)
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
13832 inodes, 55296 blocks
0 blocks (0.00%) reserved for the super user
First data block=1
Maximum filesystem blocks=56623104
7 block groups
8192 blocks per group, 8192 fragments per group
1976 inodes per group
Superblock backups stored on blocks: 
8193, 24577, 40961
Writing inode tables: done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 21 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
Checking bzImage ... ok
Checking initrd.boot ... ok
Checking rootfs2.bz ... ok
Checking rootfs_ext.tgz ... ok
Update RFS2...
mke2fs 1.41.4 (27-Jan-2009)
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
13832 inodes, 55296 blocks
0 blocks (0.00%) reserved for the super user
First data block=1
Maximum filesystem blocks=56623104
7 block groups
8192 blocks per group, 8192 fragments per group
1976 inodes per group
Superblock backups stored on blocks: 
8193, 24577, 40961
Writing inode tables: done                            
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 31 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
1+0 records in
1+0 records out
Update Finished.
Make a Backup
/share/MD0_DATA
qpkg.tar cksum ... Pass
set cksum [238546404]
[/] # reboot
[/] #


Storage Field Day 7 – Day 3 – Exablox

Disclaimer: I recently attended Storage Field Day 7.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

For each of the presentations I attended at SFD7, there are a few things I want to include in the post. Firstly, you can see video footage of the Exablox presentation here. You can also download my raw notes from the presentation here. Finally, here’s a link to the Exablox website that covers some of what they presented.

Brief Overview

Exablox was founded in 2010 and launched publicly in April 2013. There are two key elements to their solution:

  • OneBlox – scale-out storage for the enterprise, offering converged storage for primary and backup / archival data; and
  • OneSystem – manage on-premises storage exclusively from anywhere, providing visibility, control, and security without the cost / complexity of traditional management

Here’s a photo of Tad Hunt (CTO and Co-founder) showing us the internals of the Exablox appliance.


Architecture

Exablox started the presentation by talking about what we want from storage re-imagined (my words, not theirs):

  • Scale out;
  • Deduplication;
  • Snapshots;
  • Replication;
  • Be simple yet powerful; and
  • Be managed from everywhere.

The Exablox approach is not your father’s standard storage presentation play. Rather than serving up file storage via SMB / NFS from a traditional block back-end, or object storage via APIs, it presents file protocols on the front-end and services them with object storage on the back-end.

(Exablox architecture diagram)

Technology Vision

Exablox’s approach revolves around software-defined storage (SDS) and storage management, with the following goals:

  • Manage the policy, not the technology;
  • SDS “wrapped in tin” for the mid market;
  • Eliminate complexity;
  • Plug-and-play; and
  • Next generation features.

They deliver NAS features atop object storage:

  • Without metadata servers;
  • Without bolt-on NAS gateways;
  • Without separate data and metadata servers; and
  • To scale capacity, performance, or resilience: just add a node.

 

Technology Benefits

Exablox say they can create scale-out NAS and object clusters atop mixed media – HDD, SSD, Shingled drives. This approach delivers the benefits of object storage technology to traditional applications:

  • By using standard file protocols; and
  • By eliminating forklift upgrades – a single namespace across the scale of the cluster.

They also use “RAID-free” data protection:

  • Self-healing from multiple drive and node failures;
  • Rebalancing time proportional to the quantity of objects on the failed drive;
  • Mix and match drive types, capacities, technologies; and
  • Introduce next generation drives without long validation cycles.

This provides the ability to scale capacity from TB to PB easily, whilst also offering:

  • Zero configuration expansion; and
  • Manage from anywhere capability.

Exablox say they are able to support all NAS workloads well. Whereas other object stores are designed primarily for large files, a OneBlox 3308 can handle a billion objects. All nodes perform all functions (storage, control, NAS interface), with each node being a single failure domain.

 

Hardware Notes and Thoughts

For the purposes of this post, I wanted to focus on the OneBlox appliance. While the OneSystem architecture is super neat, I still get a bit of a nerd tingle when I see some nice hardware. (BTW, if Exablox want me to test one long-term I’d be happy to oblige.)

Exablox claims to be the sole provider of the following features in a single storage solution:

  • Scale-out deduplication;
  • Scale-out, continuous snapshots;
  • Scale-out, RAID-less capacity;
  • Scale-out, site-to-site disaster recovery; and
  • Bring any drive – one at a time at retail pricing.

They also support auto-clustering, with each node adding:

  • Capacity;
  • Performance; and
  • Resiliency.

The Exablox 3308 appliance:

  • Is seriously bloody quiet;
  • Uses 100W under peak load;
  • Has 8 * 3.5” drive bays, supporting up to 48 raw TB; and
  • Can use a mix of SATA & SAS drives.

Here is a picture of some appliances on a rack.


Further Reading

I was impressed with the strategy presented to me by Exablox, and the apparent ease of deployment and overall design of the appliance seemed great on the surface. I’d like to be clear that I haven’t used these in the wild, nor have I had any view of any benchmark data, so I can’t comment as to the effective performance of these devices. Like most things in storage, your mileage might vary. But I will say they seem quite inexpensive for what they do, and I recommend taking a more detailed look at them.

I also recommend you check out Keith’s preview post on Exablox.  For a different perspective on the hardware, have a look at Storage Review’s take on things as well.

EMC announces Isilon enhancements

I sat in on a recent EMC briefing regarding some Isilon enhancements and I thought my three loyal readers might like to read through my notes. As I’ve stated before, I am literally one of the worst tech journalists on the internet, so if you’re after insight and deep analysis, you’re probably better off looking elsewhere. Let’s focus on skimming the surface instead, yeah? As always, if you want to know more about these announcements, the best place to start would be your local EMC account team.

Firstly, EMC have improved what I like to call the “Protocol Spider”, with support for the following new protocols:

  • SMB 3.0
  • HDFS 2.3*
  • OpenStack SWIFT*

* Note that these will be available by the end of the year.

Here’s a picture that says pretty much the same thing as the words above.

In addition to the OneFS updates, two new hardware models have also been announced.

S210


  • Up to 13.8TB globally coherent cache in a single cluster (96GB RAM per node);
  • Dual Quad-Core Intel 2.4GHz Westmere Processors;
  • 24 * 2.5” 300GB or 600GB 10Krpm Serial Attached SCSI (SAS) 6Gb/s Drives; and
  • 10GbE (Copper & Fiber) Front-end Networking Interface.

 

Out with the old and in with the new.


X410


  • Up to 6.9TB globally coherent cache in a single cluster (48GB RAM per node);
  • Quad-Core Intel Nehalem E5504 Processor;
  • 12 * 3.5” 500GB, 1TB, 2TB, 3TB 7.2Krpm Serial ATA (SATA) Drives; and
  • 10GbE (Copper & Fiber) Front-end Networking Interface.

Some of the key features include:

  • 50% more DRAM in baseline configuration than current 2U X-series platform;
  • Configurable memory (6GB to 48GB) per node to suit specific application & workflow needs;
  • 3x increase in density per RU thus lowering power, cooling and footprint expenses;
  • Enterprise SSD support for latency sensitive namespace acceleration or file storage apps; and
  • Redesigned chassis that delivers superior cooling and vibration control.

 

Here’s a picture that does a mighty job of comparing the new model to the old one.


 

Isilon SmartFlash

EMC also announced SmartFlash for Isilon, which uses SSDs to extend the cache beyond DRAM. The upshot is that you can have 1PB of flash versus 37TB of DRAM. It’s also globally coherent, unlike some of my tweets.

Here’s a picture.


What the Dell just happened? – Dell Storage Forum Sydney 2012 – Part 2

Disclaimer: I recently attended the Dell Storage Forum Sydney 2012.  My flights and accommodation were covered by Dell, however there is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Part 2

In this post I’d like to touch briefly on some of the sessions I went to and point you in the direction of some further reading. I’m working on some more content for the near future.

 

Dell AppAssure Physical, Virtual and Cloud Recovery

If you’re unfamiliar with AppAssure, head over to their website for a fairly comprehensive look at what they can do. Version 5 was recently released. Dan Moz has been banging on about this product to me for a while, and it actually looks pretty good. Andrew Diamond presented a large chunk of the content while battling some time constraints thanks to the keynote running over time, while Dan was demo boy. Here’s a picture with words (a diagram, if you will) that gives an idea of what AppAssure can do.

(Image source – http://www.appassure.com/downloads/Transform_Data_Protection_with_Dell_AppAssure.pdf)

Live Recovery is one of my favourite features. With this it’s “not even necessary to wait for a complete restore to be able to access and use the data”. This is really handy when you’re trying to recover 100s of GB of file data but don’t know exactly what the users will want to access first.

Recovery Assure “detects the presence of Microsoft Exchange and SQL and its respective databases and log files and automatically groups the volumes with dependency for comprehensive protection and rapid recovery”. The cool thing here is that you’re going to be told if there’s going to be a SNAFU when you recover, before you recover. It’s not going to save your bacon every time, but it’s going to help with avoiding awkward conversations with the GM.

In the next few weeks I’m hoping to put together a more detailed brief on what AppAssure can and can’t do.

 

A Day in the Life of a Dell Compellent Page: How Dynamic Capacity, Data Instant Replay and Data Progression Work Together

Compellent brought serious tiering tech to Dell upon acquisition, and has really driven the Fluid Data play that’s going on at the moment. This session was all about “closely following a page from first write to demotion to low-cost disk”. Sound dry? I must admit it was a little. It was also, however, a great introduction to how pages move about the Compellent and what that means for storage workloads and efficiency. You can read some more about the Compellent architecture here.

The second half of the session comprised a customer testimonial (an Australian on-line betting company) and a brief Q & A with the customer. It was good to see that the customer was happy to tell the truth when pushed about some of the features of the Compellent stack and how it had helped and hurt in his environment. Kudos to my Dell AE for bringing up the question of how FastTrack has helped, only to watch the customer reluctantly admit it was one of the few problems he’d had since deploying the solution.

 

Media Lunch ‘Fluid Data and the Storage Evolution’

When I was first approached about attending this event, the idea was that there’d be a blogger roundtable. For a number of reasons, including the availability of key people, that had to be canned and I was invited to attend the media lunch instead. Topics covered during the lunch were basically the same as the keynote, but in a “lite” format. There were also two customers providing testimonials about Dell and how happy they were with their Compellent environments. It wasn’t quite the event that Dell had intended, at least from a blogger perspective, but I think they’re very keen to get more of this stuff happening in the future, with some more focus on the tech rather than the financials. At least, I hope that’s the case.

 

On the Floor

In the exhibition hall I got to look at some bright shinies and talk to some bright folks about new products that have been released. FluidFS (registration required) is available across the Equallogic, Compellent and PowerVault range now. “With FluidFS, our unified storage systems can manage up to 1PB of file data in a single namespace”. Some people were quite excited about this. I had to check out the FS8600, which is the new Compellent Unified offering.

I also had a quick look at the Dell EqualLogic PS-M4110 Blade Array which is basically a PS4000 running in a blade chassis. You can have up to 4 of these things in a single M1000e chassis, and they support 14 2.5″ drives in a variety of combinations. Interestingly you can only have 2 of these in a single group, so you would need 2 groups per chassis if you fully populated it.

Finally I took a brief gander at a PS6500 Series machine. These are 4RU EQL boxes that take up to 48 spindles and basically can give you a bunch of tiering in a big box with a fairly small footprint.

 

Swag

As an attendee at the event I was given a backpack, water bottle, some pens, a SNIA Dictionary and a CommVault yo-yo. I’ll let you know if I won a laptop.

I may or may not have had some problems filling out my registration properly though.


Thanks, etc

For an inaugural event, I thought the Dell Storage Forum was great, and I’m stoked that vendors are starting to see the value in getting like-minded folk in the same place to get into useful tech stuff, rather than marketing fluff. Thanks to @DanMoz for getting me down there as a blogger in the first place and for making sure I had everything I needed while I was there. Thanks also to the Dell PR and Events people and the other Dell folks who took the time to say hi and check that everything was cool. It was also nice to meet Simon Sharwood in real life, after reading his articles on The Register and stalking him on twitter.

What the Dell just happened? – Dell Storage Forum Sydney 2012 – Part 1

Disclaimer: I recently attended the Dell Storage Forum Sydney 2012.  My flights and accommodation were covered by Dell, however there is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.


Rather than give you an edited transcript of the sessions I attended, I thought it would be easier if I pointed out some of the highlights. In the next few weeks I’m going to do some more detailed posts, particularly on AppAssure and some of the new Compellent stuff. This is the first time I’ve paid attention to what was going on on stage in terms of things to blog about, so it might be a bit rough around the edges. If it comes across as a bit of propaganda from Dell, well, it was their show. There was a metric shedload of good information presented on the day and I don’t think I could do it justice in one post. And if I hear one more person mention “fluid architecture” I’ll probably lose it.

Part 1 

Keynote

Dell is big on the Dell Fluid Data Architecture and they’re starting to execute on that strategy. Introducing the keynote speakers was Jamie Humphrey, Director of Storage and Data Management for Australia & New Zealand. The first speaker introduced was Joe Kremer, Vice President and Managing Director, Dell Australia & New Zealand. He spent some time on the global Dell transformation which involved intellectual property (acquisition and development), progressing Dell’s strategy, and offering solution completeness to customers. He’s also keen to see increased efficiency in the enterprise through standards adoption rather than the use of proprietary systems. Dell are big on simplicity and automation.

Dell is now all about shifting its orientation towards solutions with outcomes rather than the short-term wins they’d previously been focussed on. There have been 24 acquisitions since 2008 (18 since 2010). Perot Systems has apparently contributed significantly in terms of services and reference architectures. There have been 6 storage acquisitions in the last 3 years. Joe also went on to talk about why they went for Equallogic, Compellent, Ocarina, Insite One (a public medical cloud), RNA Networks, AppAssure, Wyse, Force10, and Quest. The mantra seems to be “What do you need? We’ll make it or buy it”. Services people make up the biggest part of the team in Australia now, which is a refreshing change from a few years ago. Dell have also been doing some “on-shoring” of various support teams in Australia, presumably so we’ll feel warm and fuzzy about being that little bit closer to a throat we can choke when we need to.

When Joe was finished, it was time for the expert panel. First up was Brett Roscoe, General Manager and Executive Director, PowerVault and Data Management. He discussed Dell’s opportunity to sell a better “together” story through servers and storage. Nowadays you can buy a closed stack, build it yourself, or do it Dell’s way. Dell wants to put together open storage, server and network to keep costs down and drive automation, ease of use and integration across the product line. The fluid thing is all about everything finding its own level, fitting into whatever container you put it in. Brett also raised the point that enterprise features from a few years ago are now available in today’s midrange arrays, with midrange prices to match. Dell is keen to keep up the strategy using the following steps: Acquire, Integrate and Innovate. They also see themselves as the biggest storage start-up in the world, which is a novel concept but makes some sense when you consider the nature of their acquisitions. Dedupe and compression in the filesystem is “coming”. Integration will be the key to Dell successfully executing its strategy. Brett also made some product availability announcements (see On the Floor in Part 2). He also had one of the funnier lines of the day – “Before I bring up the smart architect guys, I want to bring up one of our local guys” – when introducing Phil Davis, Vice President, Enterprise Solutions Group, Dell Asia Pacific & Japan to the stage.

They then launched into a series of video-linked whiteboard sessions with a number of “Enterprise Technologists”, with a whiteboard they had setup in front of them being filmed and projected onto the screens in the auditorium so we could see it clearly in the audience. It was a nice way to do the presentation, and a little more engaging than the standard videos and slide deck we normally see with keynotes.

The first discussion was on flash, with a focus on the RNA Networks acquisition. Tim Plaud, Principal Storage Architect at Dell, talked about the move of SSD into the server from the array to avoid the latency. The problem with this? It’s not shared. So why not use it as cache (Fluid Cache)? Devices can communicate with each other over a low latency network using Remote DMA to create a cache pool. Take a 15,000 IOPS device in the array, remove the latency (network, controller, SAS), put it out on the PCI bus, and you can get yourself 250,000 IOPS per device. Now put 4 per server (for Dell 12G servers). How do you protect the write cache? Use cache partners in a physically different server, de-staging in the background in “near real-time”. You can also pick your interface for the cache network, and I’m assuming that Force10 and 40Gb would help here. Servers without the devices can still participate in the cache pool through the use of the software. Cache is de-staged before Replays (snapshots) happen, so the Replays are application- or crash-consistent. Tim also talked about working replication – “Asynchronously, semi-synchronously or truly synchronously”. I’m not sure I want to guess what semi-synchronous is. Upward tiering (to the host) and tiering down / out (to the cloud) is another strategy they’re working on.

The second discussion was around how data protection is changing – with RPOs and RTOs getting more insane – driving the adoption of snapshots and replication as protection mechanisms. Mike Davis – Director of Marketing, Storage – was called up on stage to talk about AppAssure. He talked about how quickly the application can be back on-line after a failure being the primary driver in a number of businesses. AppAssure promises to protect not only the data but the application state as well, while providing flexible recovery options. AppAssure also promises efficiency through the use of incremental-forever backups, dedupe and compression. AppAssure uses a “Core” server as the primary component – just set one up wherever you might want to recover to, be that a Disaster Recovery site, the cloud, or another environment within the same data centre. You can also use AppAssure to replicate from CMP to EQL to Cloud, etc.

The final topic – software architecture to run in a cloud environment on Equallogic – was delivered by Mark Keating, Director of Storage QA at Dell. He talked about how the array traditionally comprises a Management layer, a Virtualisation (abstraction) layer, and a Platform layer (controllers, drives, RAID, fans). Dell want to de-couple these layers in the future. With Host Virtualized Storage (HVS) they’ll be able to do this, and it’s expected sometime next year. Take the management and virtualisation layers and put them in the cloud as a virtual workload. Use any hardware you want, but keep the application integration and scalability of Equallogic (because they love the software on the Equallogic; the rest is just tin). Use cases? Tie it to a virtual application. Make a SAN for Exchange, make one for SQL. Temporary expansion of EQL capacity in the cloud is possible. Use it as a replication target. Run multiple “SANs” on the same infrastructure as a means of providing simple multi-tenancy. It’s an interesting concept, and something I’d like to explore further. It also raises a lot of questions about the underlying hardware platform, and just how much you can do with software before being limited by, presumably, the cheap, commodity hardware that it sits on.

QNAP – How to repair RAID brokenness – Redux

I did a post a little while ago (you can see it here) that covered using mdadm to repair a munted RAID config on a QNAP NAS. So I popped another disk recently, and took the opportunity to get some proper output. Ideally you’ll want to use the web interface on the QNAP to do this type of thing but sometimes it no worky. So here you go.

Stop everything on the box.

[~] # /etc/init.d/services.sh stop
Stop service: recycled.sh mysqld.sh atalk.sh ftp.sh bt_scheduler.sh btd.sh ImRd.sh init_iTune.sh twonkymedia.sh Qthttpd.sh crond.sh nfs smb.sh lunportman.sh iscsitrgt.sh nvrd.sh snmp rsyslog.sh qsyncman.sh iso_mount.sh antivirus.sh .
Stop qpkg service: Disable Optware/ipkg
Shutting down SlimServer...
Stopping SqueezeboxServer 7.5.1-30836 (please wait) .... OK.
Stopping thttpd-ssods .. OK.
/etc/rcK.d/QK107Symform: line 48: /share/MD0_DATA/.qpkg/Symform/Symform.sh: No such file or directory

(By the way it really annoys me when I’ve asked software to remove itself and it doesn’t cleanly uninstall – I’m looking at you Symform plugin)

Unmount the volume

[~] # umount /dev/md0

Stop the array

[~] # mdadm -S /dev/md0
mdadm: stopped /dev/md0

Reassemble the volume

[~] # mdadm --assemble /dev/md0 /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3 /dev/sde3 /dev/sdf3
mdadm: /dev/md0 has been started with 5 drives (out of 6).

Wait, wha? What about that other disk that I think is okay?

[~] # mdadm --detail /dev/md0
/dev/md0:
Version : 00.90.03
Creation Time : Fri May 22 21:05:28 2009
Raid Level : raid5
Array Size : 9759728000 (9307.60 GiB 9993.96 GB)
Used Dev Size : 1951945600 (1861.52 GiB 1998.79 GB)
Raid Devices : 6
Total Devices : 5
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Wed Dec 14 19:09:25 2011
State : clean, degraded
Active Devices : 5
Working Devices : 5
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
UUID : 7c440c84:4b9110fe:dd7a3127:178f0e97
Events : 0.4311172
Number Major Minor RaidDevice State
0 8 3 0 active sync /dev/sda3
1 0 0 1 removed
2 8 35 2 active sync /dev/sdc3
3 8 51 3 active sync /dev/sdd3
4 8 67 4 active sync /dev/sde3
5 8 83 5 active sync /dev/sdf3

Or in other words

[~] # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
md0 : active raid5 sda3[0] sdf3[5] sde3[4] sdd3[3] sdc3[2]
9759728000 blocks level 5, 64k chunk, algorithm 2 [6/5] [U_UUUU]
md6 : active raid1 sdf2[2](S) sde2[3](S) sdd2[4](S) sdc2[1] sda2[0]
530048 blocks [2/2] [UU]
md13 : active raid1 sdb4[2] sdc4[0] sdf4[5] sde4[4] sdd4[3] sda4[1]
458880 blocks [6/6] [UUUUUU]
bitmap: 0/57 pages [0KB], 4KB chunk
md9 : active raid1 sdf1[1] sda1[0] sdc1[4] sdd1[3] sde1[2]
530048 blocks [6/5] [UUUUU_]
bitmap: 34/65 pages [136KB], 4KB chunk
unused devices: <none>

So, when you see [U_UUUU] you’ve got a disk missing, but you knew that already. You can add it back into the array thusly.

[~] # mdadm --add /dev/md0 /dev/sdb3
mdadm: re-added /dev/sdb3

So let’s check on the progress.

[~] # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
md0 : active raid5 sdb3[6] sda3[0] sdf3[5] sde3[4] sdd3[3] sdc3[2]
9759728000 blocks level 5, 64k chunk, algorithm 2 [6/5] [U_UUUU]
[>....................] recovery = 0.0% (355744/1951945600) finish=731.4min speed=44468K/sec
md6 : active raid1 sdf2[2](S) sde2[3](S) sdd2[4](S) sdc2[1] sda2[0]
530048 blocks [2/2] [UU]
md13 : active raid1 sdb4[2] sdc4[0] sdf4[5] sde4[4] sdd4[3] sda4[1]
458880 blocks [6/6] [UUUUUU]
bitmap: 0/57 pages [0KB], 4KB chunk
md9 : active raid1 sdf1[1] sda1[0] sdc1[4] sdd1[3] sde1[2]
530048 blocks [6/5] [UUUUU_]
bitmap: 34/65 pages [136KB], 4KB chunk
unused devices: <none>
[~] #

And it will rebuild. Hopefully. Unless the disk is really truly dead. You should probably order yourself a spare in any case.
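
If you want to keep an eye on the rebuild without mashing the up arrow, something like this does the trick (assuming your firmware’s busybox includes watch):

# refresh mdstat every 30 seconds
watch -n 30 cat /proc/mdstat
# or pull the rebuild status straight from mdadm
mdadm --detail /dev/md0 | grep -i rebuild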

New article added to articles page

I’ve added a new article to the articles page. While I agree that a bunch of screenshots do not a great technical document make, I think this is a useful visual guide for the first-timer. Oh yeah, it covers the basic initialisation process used to deploy Dell | EqualLogic PS5xx0 Series arrays using 3.x firmware. Sure, it might be a little dated. Sure, I started writing it last year some time and then left it flailing about for a while. Sure, I probably could have left it in my drafts queue forever. But I thought it would be nice to have something to refer back to that didn’t require logging in to the Dell website. You might find some of it useful too.