OpenMediaVault – Good Times With mdadm

Happy 2019. I’ve been on holidays for three full weeks and it was amazing. I’ll get back to writing about boring stuff soon, but I thought I’d post a quick summary of some issues I’ve had with my home-built NAS recently and what I did to fix them.

Where Are The Disks Gone?

I got an email one evening with the following message.

I do enjoy the “Faithfully yours, etc” and the postscript is the most enlightening bit. See where it says [UU____UU]? Yeah, that’s not good. There are 8 disks that make up that device (/dev/md0), so it should look more like [UUUUUUUU]. But why would 4 out of 8 disks just up and disappear? I thought it was a little odd myself. I had a look at the ITX board everything was attached to and realised that those 4 drives were plugged in to a PCI SATA-II card. It seems that either the slot on the board or the card is now failing intermittently. I say “seems” because that’s all I can think of, as the S.M.A.R.T. status of the drives is fine.
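
If you’d rather not decode that status string by eye, something like the following could do it for you. This is a minimal sketch, not anything mdadm ships with: the function names are made up for illustration, and it assumes the usual /proc/mdstat layout where each array reports a status string such as [UU____UU].

```shell
# Count the "_" slots in an mdstat status string such as [UU____UU].
# Each "U" is a member that is up; each "_" is a member that is missing.
count_missing_members() {
    printf '%s' "$1" | tr -cd '_' | wc -c | tr -d ' '
}

# Print the status string of every array found in an mdstat-formatted file
# (defaults to /proc/mdstat).
mdstat_status_strings() {
    grep -o '\[[U_][U_]*\]' "${1:-/proc/mdstat}"
}

# Example (not run here):
#   for s in $(mdstat_status_strings); do count_missing_members "$s"; done
```

Fed [UU____UU], count_missing_members reports 4, which lines up with the four drives hanging off the add-in card.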

Resolution, Baby

The short-term fix to get the filesystem back online and useable was the classic “assemble” switch with mdadm. Long-time readers of this blog may have witnessed me doing something similar with my QNAP devices from time to time. After panic rebooting the box a number of times (a silly thing to do, really), it finally responded to pings. Checking out /proc/mdstat wasn’t good though.

$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
unused devices: <none>

Notice the lack of, erm, devices there? That’s non-optimal. The fix requires a forced assembly of the devices comprising /dev/md0.

$ sudo mdadm --assemble --force --verbose /dev/md0 /dev/sd[abcdefhi]
[sudo] password for dan:
mdadm: looking for devices for /dev/md0
mdadm: /dev/sda is identified as a member of /dev/md0, slot 0.
mdadm: /dev/sdb is identified as a member of /dev/md0, slot 1.
mdadm: /dev/sdc is identified as a member of /dev/md0, slot 3.
mdadm: /dev/sdd is identified as a member of /dev/md0, slot 2.
mdadm: /dev/sde is identified as a member of /dev/md0, slot 5.
mdadm: /dev/sdf is identified as a member of /dev/md0, slot 4.
mdadm: /dev/sdh is identified as a member of /dev/md0, slot 7.
mdadm: /dev/sdi is identified as a member of /dev/md0, slot 6.
mdadm: forcing event count in /dev/sdd(2) from 40639 upto 40647
mdadm: forcing event count in /dev/sdc(3) from 40639 upto 40647
mdadm: forcing event count in /dev/sdf(4) from 40639 upto 40647
mdadm: forcing event count in /dev/sde(5) from 40639 upto 40647
mdadm: clearing FAULTY flag for device 3 in /dev/md0 for /dev/sdd
mdadm: clearing FAULTY flag for device 2 in /dev/md0 for /dev/sdc
mdadm: clearing FAULTY flag for device 5 in /dev/md0 for /dev/sdf
mdadm: clearing FAULTY flag for device 4 in /dev/md0 for /dev/sde
mdadm: Marking array /dev/md0 as 'clean'
mdadm: added /dev/sdb to /dev/md0 as 1
mdadm: added /dev/sdd to /dev/md0 as 2
mdadm: added /dev/sdc to /dev/md0 as 3
mdadm: added /dev/sdf to /dev/md0 as 4
mdadm: added /dev/sde to /dev/md0 as 5
mdadm: added /dev/sdi to /dev/md0 as 6
mdadm: added /dev/sdh to /dev/md0 as 7
mdadm: added /dev/sda to /dev/md0 as 0
mdadm: /dev/md0 has been started with 8 drives.
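
The “forcing event count” lines are worth a second look. Each member keeps an event counter in its superblock, and here four of them were sitting at 40639 while the rest had moved on to 40647. A rough way to compare the counters before reaching for --force might look like this. The function names are hypothetical, and the sketch assumes mdadm --examine prints an “Events : N” line for each member.

```shell
# Pull the event counter out of `mdadm --examine` output read from stdin.
extract_events() {
    awk '$1 == "Events" && $2 == ":" {print $3}'
}

# Print "device events" for each member passed as an argument (needs root).
show_event_counts() {
    for dev in "$@"; do
        printf '%s %s\n' "$dev" "$(mdadm --examine "$dev" | extract_events)"
    done
}

# Example (not run here): show_event_counts /dev/sd[abcdefhi]
```

A small gap like this one suggests the members missed only a few updates. The larger the gap, the more stale data a forced assembly may pull back in.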

In this example you’ll see that /dev/sdg isn’t included in my command. That device is the SSD I use to boot the system. Sometimes Linux device conventions confuse me too. If you’re in this situation and you think this is just a one-off thing, then you should be okay to unmount the filesystem, run fsck over it, and re-mount it. In my case, this has happened twice already, so I’m in the process of moving data off the NAS onto some scratch space and have procured a cheap little QNAP box to fill its role.
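
The unmount, check and re-mount steps can be sketched as a small function. The function name and the mount point in the example are assumptions (OMV picks its own mount paths, so check yours with mount or df first), and it needs to be run as root.

```shell
# Unmount a filesystem, force a check, and mount it again if the check is
# clean. $1 is the block device, $2 is the mount point.
check_and_remount() {
    dev=$1
    mnt=$2
    umount "$mnt" || return 1
    rc=0
    fsck -f "$dev" || rc=$?
    # fsck exits 0 when clean and 1 when it corrected errors;
    # anything higher needs a human to look at it.
    if [ "$rc" -gt 1 ]; then
        echo "fsck exited with status $rc; not remounting" >&2
        return "$rc"
    fi
    mount "$dev" "$mnt"
}

# Example (not run here; the mount point is a placeholder):
#   check_and_remount /dev/md0 /srv/example-mount-point
```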

 

Conclusion

My rush to replace the homebrew device with a QNAP isn’t a knock on the OpenMediaVault project by any stretch. OMV itself has been very reliable and has done everything I needed it to do. Rather, my ability to build semi-resilient devices on a budget has simply proven quite poor. I’ve seen some nasty stuff happen with QNAP devices too, but at least any issues will be covered by some kind of manufacturer’s support team and warranty. My NAS is only covered by me, and I’m just not that interested in working out what could be going wrong here. If I’d built something decent I’d get some alerting back from the box telling me what’s happened to the card that keeps failing. But then I would have spent a lot more on this box than I would have wanted to.

I’ve been lucky thus far in that I haven’t lost any data of real import (the NAS devices are used to store media that I have on DVD or Blu-Ray – the important documents are backed up using Time Machine and Backblaze). It is nice, however, that a tool like mdadm can bring you back from the brink of disaster in a pretty efficient fashion.

Incidentally, if you’re a macOS user, you might have a bunch of .DS_Store files on your filesystem. Or stuff like [email protected] or some such. These things are fine, but macOS doesn’t seem to like them when you’re trying to move folders around. This post provides some handy guidance on how to get rid of those files in a jiffy.

As always, if the data you’re storing on your NAS device (be it home-built or off the shelf) is important, please make sure you back it up. Preferably in a number of places. Don’t get yourself in a position where this blog post is your only hope of getting your one copy of your firstborn’s pictures from the first day of school back.

OpenMediaVault – Expanding the Filesystem

I recently had the opportunity to replace a bunch of ageing 2TB drives in my OpenMediaVault NAS with some 3TB drives. I run it in a 6+2 RAID-6 configuration (yes, I know, RAID is dead). I was a bit cheeky and replaced 2 drives at a time and let it rebuild. This isn’t something I recommend you do in the real world. Everything came up clean after the drives were replaced. I even got to modify the mdadm.conf file again to tell it I had 0 spares. The problem was that the size of the filesystem in OpenMediaVault was the same as it was before. When you click on Grow it expects you to be adding drives. So you can grow the filesystem, but first you need to expand the md device to fill the new drives. I recommend taking a backup before you do this. And I unmounted my shares before I did this too.
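
The reason the size doesn’t change by itself is that md keeps using the per-device size it already had, even when the new drives are bigger. The arithmetic for what to expect after growing is simple enough: RAID-6 gives up two drives’ worth of space to parity. A quick sketch (the function name is made up for illustration):

```shell
# Usable capacity of a RAID-6 set: (number of drives - 2) * drive size.
raid6_usable_tb() {
    echo $(( ($1 - 2) * $2 ))
}

raid6_usable_tb 8 2   # eight 2TB drives -> prints 12
raid6_usable_tb 8 3   # eight 3TB drives -> prints 18
```

So the 6+2 set should go from roughly 12TB to roughly 18TB usable, before filesystem overhead.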

If you’re using a bitmap, you’ll need to remove it first.

mdadm --grow /dev/md0 --bitmap none
mdadm --grow /dev/md0 --size max
mdadm --wait /dev/md0
mdadm --grow /dev/md0 --bitmap internal

In this example, /dev/md0 is the device you want to grow, and it’s likely that yours has the same name (check /proc/mdstat if you’re not sure). Note, also, that this will take some time to complete. The next step is to expand the filesystem to fit the RAID device. It’s a good idea to run a filesystem check before you do this.
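
Before touching the filesystem, it doesn’t hurt to confirm that the array really did get bigger. One way, assuming mdadm --detail reports an “Array Size” line in KiB (the helper name below is hypothetical):

```shell
# Read `mdadm --detail` output on stdin and print the array size in KiB.
array_size_kib() {
    awk '$1 == "Array" && $2 == "Size" {print $4}'
}

# Example (not run here, needs root):
#   mdadm --detail /dev/md0 | array_size_kib
```

Comparing the number from before and after the grow is a cheap sanity check. /proc/mdstat shows the same figure in its “blocks” count.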

fsck -f /dev/md0

Then it’s time to resize (assuming you had no problems in the last step).

resize2fs /dev/md0

You should then be able to remount the device and see the additional capacity. Big thanks to kernel.org for having some useful instructions here.

OpenMediaVault – Updating from 2.2.x to 3.0.x

I recently upgraded my home-brew NAS from OpenMediaVault 2.2.14 (Stone burner) to openmediavault 3.0.86 (Erasmus). It’s recommended that you do a fresh install but I thought I’d give the upgrade a shot as it was only a 10TB recovery if it went pear-shaped (!). They also recommend you disable all your plugins before you upgrade.

 

Apt-get all of the things

It’s an OS upgrade as well as an application upgrade. In an ssh session I ran

apt-get update && apt-get dist-upgrade && omv-update

This gets you up to date, then upgrades your distro (Debian), and then gets the necessary packages for omv. I then ran the omv upgrade.

omv-release-upgrade

This seemed to go well. I rebooted the box and could still access the shared data. Happy days. When I tried to access the web UI, however, I could enter my credentials but I couldn’t get in. I then ran

omv-firstaid

And tried to reconfigure the web interface. It kept complaining about a file not being found. So I ran

dpkg -l | grep openmediavault

This told me that there was still a legacy plugin (openmediavault-dnsmasq) installed. I’d read on the forums that this might cause some problems. So I used apt-get to remove it.

apt-get remove openmediavault-dnsmasq
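
A slightly more targeted version of that dpkg query, which could be run before the upgrade to spot leftover plugins, might look like this. The function name is hypothetical, and it simply matches installed packages whose names start with openmediavault-, so any non-plugin packages with that prefix will show up as well.

```shell
# Read `dpkg -l` output on stdin and print the names of installed ("ii")
# packages whose name starts with "openmediavault-".
list_omv_packages() {
    awk '$1 == "ii" && $2 ~ /^openmediavault-/ {print $2}'
}

# Example (not run here):
#   dpkg -l | list_omv_packages
```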

The next time I ran apt-get it told me there were some legacy packages present that I could remove. So I did.

apt-get autoremove dnsmasq dnsmasq-base libnetfilter-conntrack3

After that, I was able to log in to the web UI with no problems and everything now seems to be in order. When my new NAS arrives I’ll evacuate this one and rebuild it from scratch. There are a fair few changes in version 3 and it’s worth checking out. You can download the ISO image from here.

 

DNS Matters

The reason I had the dnsmasq plugin installed in the first place was that I’d been using the NAS as a DHCP / DNS server. This had been going reasonably well, but I’d heard about Pi-hole and wanted to give that a shot. That’s a story for another time, but I did notice that my OMV box hadn’t updated its /etc/resolv.conf file correctly, despite the fact that I’d reconfigured DNS via the web GUI. If you run into this issue, just run

dpkg-reconfigure resolvconf

And you’ll find that resolv.conf is correctly updated. Incidentally, if you’re a bit old-fashioned and don’t like to run everything through DHCP reservations, you can add a second set of static host entries to dnsmasq on your Pi-hole machine by following these instructions.
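
To check whether the change actually landed, you could list the nameservers the box is using. A minimal sketch (the function name is made up, and it only looks at nameserver lines):

```shell
# Print the address on every "nameserver" line of a resolv.conf-style file
# (defaults to /etc/resolv.conf).
list_nameservers() {
    awk '$1 == "nameserver" {print $2}' "${1:-/etc/resolv.conf}"
}

# Example (not run here): list_nameservers
```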

OpenMediaVault – Modifying Monit Parameters

You can file this article under “not terribly useful but something I may refer to again in the future”. I’ve been migrating a bunch of data from one of my QNAP NAS devices at home to my OpenMediaVault NAS. Monit, my “faithful employee”, sent me an email to let me know I was filling up the filesystem on the OMV NAS.

By default OMV alerts at 80% full. You can change this though. Just jump on a terminal and run the following:

nano /etc/default/openmediavault

Add this line to the file

OMV_MONIT_SERVICE_FILESYSTEM_SPACEUSAGE=95

Then run the following commands to update the configuration

omv-mkconf collectd
monit restart collectd

Of course, you need to determine what level of filesystem usage you’re comfortable with. In this example, I’ve set it to 95% as it’s a fairly static environment. If, however, you’re capable of putting a lot of data on the device quickly, then a 5% buffer may be insufficient. I’d also like to clarify that I’m not unhappy with QNAP, but the device I’m migrating off is 8 years old now and it would be a pain to have to recover if something went wrong. If you’re interested in reading more about Monit you can find documentation here.
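
If you want to see how close a filesystem is to whatever threshold you’ve picked, df can tell you. A rough sketch, assuming POSIX df -P output; the function names and the mount point in the example are hypothetical, and this is separate from what Monit does internally.

```shell
# Read `df -P` output on stdin and print the Capacity column of the first
# filesystem listed, without the trailing % sign.
parse_df_usage() {
    awk 'NR == 2 {sub(/%/, "", $5); print $5}'
}

# Succeeds when usage ($1) has reached the threshold ($2), both in percent.
over_threshold() {
    [ "$1" -ge "$2" ]
}

# Example (not run here; the mount point is a placeholder):
#   used=$(df -P /srv/example-mount-point | parse_df_usage)
#   if over_threshold "$used" 95; then echo "filesystem is at ${used}%"; fi
```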

OpenMediaVault – Annoying mdadm e-mails after a rebuild

My homebrew NAS running OpenMediaVault (based on Debian) started writing to me recently. I’d had a disk failure and replaced the disk in the RAID set with another one. Everything rebuilt properly, but then this mdadm chap started sending me messages daily.

"This is an automatically generated mail message from mdadm
 running on openmediavault
A SparesMissing event had been detected on md device /dev/md0.
Faithfully yours, etc.
P.S. The /proc/mdstat file currently contains the following:
Personalities : [raid6] [raid5] [raid4] 
 md0 : active raid6 sdi[0] sda[8] sdb[6] sdc[5] sdd[4] sde[3] sdf[2] sdh[1]
 11720297472 blocks super 1.2 level 6, 512k chunk, algorithm 2 [8/8] [UUUUUUUU]
unused devices: <none>"

Which was nice of it to get in touch. But I’d never had spares configured on this md device. The fix is simple, and is outlined here and here. In short, you’ll want to edit /etc/mdadm/mdadm.conf and change spares=1 to spares=0. This is assuming you don’t want spares configured and are relying on parity for resilience. If you do want spares configured then it’s probably best you look into the problem a little more.
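
If you’d rather script the change than open an editor, something along these lines should work. It’s a sketch only: the function name is hypothetical, it sets every spares= value it finds to 0, and it leaves a .bak copy next to the original.

```shell
# Rewrite spares=N to spares=0 in the given mdadm.conf, keeping a backup
# of the original as <file>.bak.
set_spares_zero() {
    cp "$1" "$1.bak" &&
        sed 's/spares=[0-9][0-9]*/spares=0/' "$1.bak" > "$1"
}

# Example (not run here, needs root):
#   set_spares_zero /etc/mdadm/mdadm.conf
```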

OpenMediaVault – A few notes

Following on from my brief look at FreeNAS here, I thought I’d do a quick article on OpenMediaVault as well. While it isn’t quite as mature as FreeNAS, it is based on Debian. I’ve had a soft spot for Debian ever since I was able to get it running on a DECpc AXP 150 I had lying about many moons ago. The Jensen is no longer with us, but the fond memories remain. Anyway …

Firstly, you can download OpenMediaVault here. It’s recommended that you install it on a hard drive (ideally in a RAID 1 configuration) rather than on USB or SD cards. Theoretically you could put it on a stick and redirect the more frequently written stuff to a RAM disk if you really didn’t want to give up the SATA ports on your board. I decided to use an SSD I had lying about as I couldn’t be bothered with more workarounds and “tweaks”. You can follow this guide to set up some semi-automated backup of the configuration.

Secondly, here’s a list of the hardware I used for this build:

  • Mainboard – ASRock N3700-ITX
  • CPU – Intel Quad-Core Pentium Processor N3700 (on-board)
  • RAM – 2 * Kingston 8GB 1600MHz DDR3 Non-ECC CL11 SODIMM
  • HDDs – 1 * SSD, 8 * Seagate Constellation ES 2TB drives
  • SATA Controller – PCIe x1 4-port SATA III controller (non-RAID), using a Marvell 88SE9215 chipset
  • SATA Controller – IO Crest Mini PCIe 2-port SATA III controller (RAID capable), using a Syba (?) chipset
  • Case – Fractal Design Node 804
  • PSU – Silverstone Strider Essential 400W

You’ll notice the lack of ECC RAM, and the board is limited in SATA ports, hence the requirement for a single-lane, 4-port SATA card. I’m really not the best at choosing the right hardware for the job. The case is nice and roomy, but there’s no hot-swap for the disks. A better choice would have been a workstation-class board with support for ECC RAM, a decent CPU and a bunch of SATA ports in a micro-ATX form-factor. I mean, it works, but it could have been better. I’d like to think it’s because the market is a bit more limited in Australia, but it’s more because I’m not very good at this stuff.

Thirdly, if you do end up with the ASRock board, you’ll need to make a change to your grub configuration so that the board will boot headless. To do this, ssh or console onto the machine and edit /etc/default/grub. Uncomment GRUB_TERMINAL=console (by removing the #). You’ll then need to run update-grub and you should be right to boot the machine without a monitor connected.
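
The grub edit can be scripted too. This is a minimal sketch under a couple of assumptions: the function name is made up, and it expects the stock commented-out #GRUB_TERMINAL=console line to be present in the file. It keeps a .bak copy of the original.

```shell
# Uncomment GRUB_TERMINAL=console in the given grub defaults file, keeping
# a backup of the original as <file>.bak.
enable_grub_console() {
    cp "$1" "$1.bak" &&
        sed 's/^#[[:space:]]*GRUB_TERMINAL=console/GRUB_TERMINAL=console/' "$1.bak" > "$1"
}

# Example (not run here, needs root):
#   enable_grub_console /etc/default/grub && update-grub
```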

Finally, the OMV experience has been pretty good thus far. None of these roll-your-own options are as pretty as their QNAP or Synology brethren from a UX perspective, but they do the job in a functional, if somewhat sparse, fashion. That said, having been a QNAP user for about 7 years now, I remember that it wasn’t always the eye candy that it is nowadays. Also of note, OMV has a pretty reasonable plugin ecosystem you can leverage, with Plex and a bunch of extras being fairly simple to install and configure. I’m looking forward to running this thing through its paces and posting the performance and useability results.