Random Short Take #23

Want some news? In a shorter format? And a little bit random? This listicle might be for you.

  • Remember Retrospect? They were acquired by StorCentric recently. I hadn’t thought about them in some time, but they’re still around, and celebrating their 30th anniversary. Read a little more about the history of the brand here.
  • Sometimes size does matter. This article around deduplication and block / segment size from Preston was particularly enlightening.
  • This article from Russ had some great insights into why it’s not wise to entirely rule out doing things the way service providers do just because you’re working in enterprise. I’ve had experience in both SPs and enterprise and I agree that there are things that can be learnt on both sides.
  • This is a great article from Chris Evans about the difficulties associated with managing legacy backup infrastructure.
  • The Pure Storage VM Analytics Collector is now available as an OVA.
  • If you’re thinking of updating your Mac’s operating environment, this is a fairly comprehensive review of what macOS Catalina has to offer, along with some caveats.
  • Anthony has been doing a bunch of cool stuff with Terraform recently, including using variable maps to deploy vSphere VMs. You can read more about that here.
  • Speaking of people who work at Veeam, Hal has put together a great article on orchestrating Veeam recovery activities to Azure.
  • Finally, the Brisbane VMUG meeting originally planned for Tuesday 8th has been moved to the 15th. Details here.

Random Short Take #17

Here are some links to some random news items and other content that I recently found interesting. You might find them interesting too. Episode 17 – am I over-sharing? There’s so much I want you to know about.

  • I seem to always be including a link from the Backblaze blog. That’s mainly because they write about things I’m interested in. In this case, they’ve posted an article discussing the differences between availability and durability that I think is worth your time.
  • Speaking of interesting topics, Preston posted an article on NetWorker Pools with Data Domain that’s worth looking at if you’re into that kind of thing.
  • Maintaining the data protection theme, Alastair wrote an interesting article titled “The Best Automation Is One You Don’t Write” (you know, like the best IO is one you don’t need to do?) as part of his work with Cohesity. It’s a good article, and not just because he mentions my name in it.
  • I recently wanted to change the edition of Microsoft Office I was using on my MacBook Pro and couldn’t really work out how to do it. In the end, the answer is simple. Download a Microsoft utility to remove your Office licenses, and then fire up an Office product and it will prompt you to re-enter your information at that point.
  • This is an old article, but it answered my question about validating MD5 checksums on macOS.
  • Excelero have been doing some cool stuff with Imperial College London – you can read more about that here.
  • Oh hey, Flixster Video is closing down. I received this in my inbox recently: “[f]ollowing the announcement by UltraViolet that it will be discontinuing its service on July 31, 2019, we are writing to provide you notice that Flixster Video is planning to shut down its website, applications and operations on October 31, 2019”. It makes sense, obviously, given UltraViolet’s demise, but it still drives me nuts. The ephemeral nature of digital media is why I still have a house full of various sized discs with various kinds of media stored on them. I think the answer is to give yourself over to the streaming lifestyle, and understand that you’ll never “own” media like you used to think you did. But I can’t help but feel like people outside of the US are getting shafted in that scenario.
  • In keeping up with the “random” theme of these posts, it was only last week that I learned that “Television, the Drug of the Nation” from the very excellent album “Hypocrisy Is the Greatest Luxury” by The Disposable Heroes of Hiphoprisy was originally released by Michael Franti and Rono Tse when they were members of The Beatnigs. If you’re unfamiliar with any of this I recommend you check them out.
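
On the MD5 item above: macOS ships an `md5` command for this, but if you'd rather script the comparison, here's a minimal sketch using Python's standard `hashlib`. The function names are mine, and this isn't taken from the linked article.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks so large ISO images don't need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """Compare a file's MD5 with a published value, ignoring case and whitespace."""
    return md5_of_file(path) == published_hex.strip().lower()
```

MD5 is fine for spotting a corrupted download, but it shouldn't be relied on to detect deliberate tampering.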

Random Short Take #16

Here are a few links to some random news items and other content that I recently found interesting. You might find them interesting too. Episode 16 – please enjoy these semi-irregular updates.

  • Scale Computing has been doing a bit in the healthcare sector lately – you can read news about that here.
  • This was a nice roundup of the news from Apple’s recent WWDC from Six Colors. Hat tip to Stephen Foskett for the link. Speaking of WWDC news, you may have been wondering what happened to all of your purchased content with the imminent demise of iTunes on macOS. It’s still a little fuzzy, but this article attempts to shed some light on things. Spoiler: you should be okay (for the moment).
  • There’s a great post on the Dropbox Tech Blog from James Cowling discussing the mission versus the system.
  • The more things change, the more they remain the same. For years I had a Windows PC running Media Center and recording TV. I used IceTV as the XMLTV-based program guide provider. I then started to mess about with some HDHomeRun devices and the PC died and I went back to a traditional DVR arrangement. Plex now has DVR capabilities and it has been doing a reasonable job with guide data (and recording in general), but they’ve decided it’s all a bit too hard to curate guides and want users (at least in Australia) to use XMLTV-based guides instead. So I’m back to using IceTV with Plex. They’re offering a free trial at the moment for Plex users, and setup instructions are here. No, I don’t get paid if you click on the links.
  • Speaking of axe-throwing, the Cohesity team in Queensland is organising a social event for Friday 21st June from 2 – 4 pm at Maniax Axe Throwing in Newstead. You can get in contact with Casey if you’d like to register.
  • VeeamON Forum Australia is coming up soon. It will be held at the Hyatt Regency Hotel in Sydney on July 24th and should be a great event. You can find out more information and register for it here. The Vanguards are also planning something cool, so hopefully we’ll see you there.
  • Speaking of Veeam, Anthony Spiteri recently published his longest title in the Virtualization is Life! catalogue – Orchestration Of NSX By Terraform For Cloud Connect Replication With vCloud Director. It’s a great article, and worth checking out.
  • There’s a lot of talk and slideware devoted to digital transformation, and a lot of it is rubbish. But I found this article from Chin-Fah to be particularly insightful.

Liqid Are Dynamic In The DC

Disclaimer: I recently attended Dell Technologies World 2019.  My flights, accommodation and conference pass were paid for by Dell Technologies via the Media, Analysts and Influencers program. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

As part of my attendance at Dell Technologies World 2019 I had the opportunity to attend Tech Field Day Extra sessions. You can view the videos from the session here, and download my rough notes from here.

Liqid

One of the presenters at Tech Field Day Extra was Liqid, a company that specialises in composable infrastructure. So what does that mean then? Liqid “enables Composable Infrastructure with a PCIe fabric and software that orchestrates and manages bare-metal servers – storage, GPU, FPGA / TPU, Compute, Networking”. They say they’re not disaggregating DRAM as the industry’s not ready for that yet. Interestingly, Liqid have made sure they can do all of this with bare metal, as “[c]omposability without bare metal, with disaggregation, that’s just hyper-convergence”.

 

[image courtesy of Liqid]

The whole show is driven through Liqid Command Center, and there’s a switching PCIe fabric as well. You then combine this with various hardware elements, such as:

  • JBoF – Flash;
  • JBoN – Network;
  • JBoG – GPU; and
  • Compute nodes.

There are various expansion chassis options (network, storage, and graphics) and you can add in standard x86 servers. You can read about Liqid’s announcement around Dell EMC PowerEdge servers here.

Other Interesting Use Cases

Some of the more interesting use cases discussed by Liqid included “brownfield” deployments where customers don’t want to disaggregate everything. If they just want to disaggregate GPUs, for example, they can add a GPU pool to a fabric. This can be done with storage as well. Why would you want to do this kind of thing with networking? There are apparently a few service providers that like the composable networking use case. You can also have multiple fabric types with Liqid managing cross composability.

[image courtesy of Liqid]

Customers?

Liqid have customers across a variety of workload types, including:

  • AI & Deep Learning
    • GPU Scale out
    • Enable GPU Peer-2-Peer at scale
    • GPU Dynamic Reallocation/Sharing
  • Dynamic Cloud
    • CSP, ISP, Private Cloud
    • Flexibility, Resource Utilisation, TCO
    • Bare Metal Cloud Product Offering
  • HPC & Clustering
    • High Performance Computing
    • Lowest Latency Interconnect
    • Enables Massive Scale Out
  • 5G Edge
    • Utilisation & Reduced Foot Print
    • High Performance Edge Compute
    • Flexibility and Ease of Scale Out

Thoughts and Further Reading

I’ve written enthusiastically about composable infrastructure in the past, and it’s an approach to infrastructure that continues to fascinate me. I love the idea of being able to move pools of resources around the DC based on workload requirements. This isn’t just moving VMs to machines that are bigger as required (although I’ve always thought that was cool). This is moving resources to where they need to be. We have the kind of interconnectivity technology available now that means we don’t need to be beholden to “traditional” x86 server architectures. Of course, the success of this approach is in no small part dependent on the maturity of the organisation. There are some workloads that aren’t going to be a good fit with composable infrastructure. And there are going to be some people that aren’t going to be a good fit either. And that’s fine. I don’t think we’re going to see traditional rack mount servers and centralised storage disappear off into the horizon any time soon. But the possibilities that composable infrastructure presents to organisations that have possibly struggled in the past with getting the right resources to the right workload at the right time are really interesting.

There is still only a small number of companies offering composable infrastructure solutions. I think this is in part because it’s viewed as a niche requirement that only certain workloads can benefit from. But as companies like Liqid are demonstrating, the technology is maturing at a rapid pace and, much like our approach to on-premises infrastructure versus the public cloud, I think it’s time that we take a serious look at how this kind of technology can help businesses worry more about their business and less about the resources needed to drive their infrastructure. My friend Max wrote about Liqid last year, and I think it’s worth reading his take if you’re in any way interested in what Liqid are doing.

Random Short Take #11

Here are a few links to some random news items and other content that I found interesting. You might find it interesting too. Maybe. Happy New Year too. I hope everyone’s feeling fresh and ready to tackle 2019.

  • I’m catching up with the good folks from Scale Computing in the next little while, but in the meantime, here’s what they got up to last year.
  • I’m a fan of the fruit company nowadays, but if I had to build a PC, this would be it (hat tip to Stephen Foskett for the link).
  • QNAP announced the TR-004 over the weekend and I had one delivered on Tuesday. It’s unusual that I have cutting edge consumer hardware in my house, so I’ll be interested to see how it goes.
  • It’s not too late to register for Cohesity’s upcoming Helios webinar. I’m looking forward to running through some demos with Jon Hildebrand and talking about how Helios helps me manage my Cohesity environment on a daily basis.
  • Chris Evans has published NVMe in the Data Centre 2.0 and I recommend checking it out.
  • I went through a basketball card phase in my teens. This article sums up my somewhat confused feelings about the card market (or lack thereof).
  • Elastifile Cloud File System is now available on the AWS Marketplace – you can read more about that here.
  • WekaIO have posted some impressive numbers over at spec.org if you’re into that kind of thing.
  • Applications are still open for vExpert 2019. If you haven’t already applied, I recommend it. The program is invaluable in terms of vendor and community engagement.

OpenMediaVault – Good Times With mdadm

Happy 2019. I’ve been on holidays for three full weeks and it was amazing. I’ll get back to writing about boring stuff soon, but I thought I’d post a quick summary of some issues I’ve had with my home-built NAS recently and what I did to fix it.

Where Are The Disks Gone?

I got an email one evening with the following message.

I do enjoy the “Faithfully yours, etc” and the post script is the most enlightening bit. See where it says [UU____UU]? Yeah, that’s not good. There are 8 disks that make up that device (/dev/md0), so it should look more like [UUUUUUUU]. But why would 4 out of 8 disks just up and disappear? I thought it was a little odd myself. I had a look at the ITX board everything was attached to and realised that those 4 drives were plugged in to a PCI SATA-II card. It seems that either the slot on the board or the card is now failing intermittently. I say “seems” because that’s all I can think of, as the S.M.A.R.T. status of the drives is fine.
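
If you want to check that status map in a script rather than by eye, here's a rough sketch that pulls out the `[UU____UU]` pattern and reports which slots are down. The sample line is hypothetical, built around the map quoted above, and the function names are my own.

```python
import re


def member_states(mdstat_line):
    """Extract the [UU__] style member map from a /proc/mdstat status line.

    Returns a list of booleans, True where the slot shows 'U' (up).
    """
    match = re.search(r"\[([U_]+)\]", mdstat_line)
    if match is None:
        raise ValueError("no member status map found")
    return [flag == "U" for flag in match.group(1)]


def missing_slots(mdstat_line):
    """Return the zero-based slot numbers that are not up."""
    return [slot for slot, up in enumerate(member_states(mdstat_line)) if not up]


# Hypothetical status line; only the [UU____UU] map comes from the alert email.
sample = "1953260544 blocks level 6, 512k chunk, algorithm 2 [8/4] [UU____UU]"
print(missing_slots(sample))  # [2, 3, 4, 5]
```

Slots 2 to 5 line up with the four members that mdadm has to force back into the array further down.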

Resolution, Baby

The short-term fix to get the filesystem back online and usable was the classic “assemble” switch with mdadm. Long time readers of this blog may have witnessed me doing something similar with my QNAP devices from time to time. After panic rebooting the box a number of times (a silly thing to do, really), it finally responded to pings. Checking out /proc/mdstat wasn’t good though.

dan@openmediavault:~$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
unused devices: <none>

Notice the lack of, erm, devices there? That’s non-optimal. The fix requires a forced assembly of the devices comprising /dev/md0.

dan@openmediavault:~$ sudo mdadm --assemble --force --verbose /dev/md0 /dev/sd[abcdefhi]
[sudo] password for dan:
mdadm: looking for devices for /dev/md0
mdadm: /dev/sda is identified as a member of /dev/md0, slot 0.
mdadm: /dev/sdb is identified as a member of /dev/md0, slot 1.
mdadm: /dev/sdc is identified as a member of /dev/md0, slot 3.
mdadm: /dev/sdd is identified as a member of /dev/md0, slot 2.
mdadm: /dev/sde is identified as a member of /dev/md0, slot 5.
mdadm: /dev/sdf is identified as a member of /dev/md0, slot 4.
mdadm: /dev/sdh is identified as a member of /dev/md0, slot 7.
mdadm: /dev/sdi is identified as a member of /dev/md0, slot 6.
mdadm: forcing event count in /dev/sdd(2) from 40639 upto 40647
mdadm: forcing event count in /dev/sdc(3) from 40639 upto 40647
mdadm: forcing event count in /dev/sdf(4) from 40639 upto 40647
mdadm: forcing event count in /dev/sde(5) from 40639 upto 40647
mdadm: clearing FAULTY flag for device 3 in /dev/md0 for /dev/sdd
mdadm: clearing FAULTY flag for device 2 in /dev/md0 for /dev/sdc
mdadm: clearing FAULTY flag for device 5 in /dev/md0 for /dev/sdf
mdadm: clearing FAULTY flag for device 4 in /dev/md0 for /dev/sde
mdadm: Marking array /dev/md0 as 'clean'
mdadm: added /dev/sdb to /dev/md0 as 1
mdadm: added /dev/sdd to /dev/md0 as 2
mdadm: added /dev/sdc to /dev/md0 as 3
mdadm: added /dev/sdf to /dev/md0 as 4
mdadm: added /dev/sde to /dev/md0 as 5
mdadm: added /dev/sdi to /dev/md0 as 6
mdadm: added /dev/sdh to /dev/md0 as 7
mdadm: added /dev/sda to /dev/md0 as 0
mdadm: /dev/md0 has been started with 8 drives.

In this example you’ll see that /dev/sdg isn’t included in my command. That device is the SSD I use to boot the system. Sometimes Linux device conventions confuse me too. If you’re in this situation and you think this is just a one-off thing, then you should be okay to unmount the filesystem, run fsck over it, and re-mount it. In my case, this has happened twice already, so I’m in the process of moving data off the NAS onto some scratch space and have procured a cheap little QNAP box to fill its role.
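
The verbose output above is worth a second look, because the "forcing event count" lines tell you which members had fallen behind and by how much. Here's a small sketch that extracts that from a saved copy of the output. The regular expression is written against the lines shown in this post only, so treat it as an assumption rather than a stable mdadm format.

```python
import re

# Matches lines like:
#   mdadm: forcing event count in /dev/sdd(2) from 40639 upto 40647
FORCED = re.compile(
    r"forcing event count in (?P<dev>\S+)\((?P<slot>\d+)\) "
    r"from (?P<old>\d+) upto (?P<new>\d+)"
)


def forced_members(log_text):
    """Return (device, slot, events_behind) for each member mdadm had to force."""
    found = []
    for line in log_text.splitlines():
        match = FORCED.search(line)
        if match:
            behind = int(match["new"]) - int(match["old"])
            found.append((match["dev"], int(match["slot"]), behind))
    return found


log = """\
mdadm: forcing event count in /dev/sdd(2) from 40639 upto 40647
mdadm: forcing event count in /dev/sdc(3) from 40639 upto 40647
mdadm: added /dev/sda to /dev/md0 as 0
"""
for device, slot, behind in forced_members(log):
    print(f"{device} (slot {slot}) was {behind} events behind")
```

All four forced members in my output stopped at the same event count, which is at least consistent with them dropping out together when the card misbehaved.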

Conclusion

My rush to replace the homebrew device with a QNAP isn’t a knock on the OpenMediaVault project by any stretch. OMV itself has been very reliable and has done everything I needed it to do. Rather, my ability to build semi-resilient devices on a budget has simply proven quite poor. I’ve seen some nasty stuff happen with QNAP devices too, but at least any issues will be covered by some kind of manufacturer’s support team and warranty. My NAS is only covered by me, and I’m just not that interested in working out what could be going wrong here. If I’d built something decent I’d get some alerting back from the box telling me what’s happened to the card that keeps failing. But then I would have spent a lot more on this box than I would have wanted to.

I’ve been lucky thus far in that I haven’t lost any data of real import (the NAS devices are used to store media that I have on DVD or Blu-Ray – the important documents are backed up using Time Machine and Backblaze). It is nice, however, that a tool like mdadm can bring you back from the brink of disaster in a pretty efficient fashion.

Incidentally, if you’re a macOS user, you might have a bunch of .DS_Store files on your filesystem. Or stuff like .@Thumb or some such. These things are fine, but macOS doesn’t seem to like them when you’re trying to move folders around. This post provides some handy guidance on how to get rid of those files in a jiffy.
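
If you'd rather script the clean-up, here's a minimal sketch. It isn't the method from the linked post. It assumes you only want to match on file name, and it defaults to a dry run so nothing gets deleted until you've looked at the list.

```python
import os


def find_clutter(root, names=(".DS_Store",)):
    """Yield paths under root whose file name matches one of names (case-insensitive)."""
    wanted = {name.lower() for name in names}
    for folder, _subdirs, files in os.walk(root):
        for filename in files:
            if filename.lower() in wanted:
                yield os.path.join(folder, filename)


def remove_clutter(root, names=(".DS_Store",), dry_run=True):
    """Return the matching files, deleting them only when dry_run is False."""
    matches = list(find_clutter(root, names))
    if not dry_run:
        for path in matches:
            os.remove(path)
    return matches
```

Call `remove_clutter("/path/to/share")` first to see what would go, then repeat with `dry_run=False` once you're happy. The path here is a placeholder.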

As always, if the data you’re storing on your NAS device (be it home-built or off the shelf) is important, please make sure you back it up. Preferably in a number of places. Don’t get yourself in a position where this blog post is your only hope of getting your one copy of your firstborn’s pictures from the first day of school back.

Google WiFi – A Few Notes

Like a lot of people who work in IT as their day job, the IT situation at my house is a bit of a mess. I think the real reason for this is that, once the working day is done, I don’t want to put any thought into doing this kind of stuff. As a result, like a lot of tech folk, I have way more devices and blinking lights in my house than I really need. And I’m always sure to pile on a good helping of technical debt any time I make any changes at home. It wouldn’t be any fun without random issues to deal with from time to time.

Some Background – Apple Airport

I’ve been running an Apple Airport Extreme and a number of Airport Express devices in my house for a while in a mesh network configuration. Our house is 2 storeys and it was too hard to wire up properly with Ethernet after we bought it. I liked the Apple devices primarily because of the easy to use interface (via browser or phone), and Airplay, in my mind at least, was a killer feature. So I’ve stuck with these things for some time, despite the frequent flakiness I experienced with the mesh network (I’d often end up connected to an isolated access point with no network access – a reboot of the base station seemed to fix this) and the sometimes frustrating lack of visibility into what was going on in the network. 

Enter Google Wifi

I had some Frequent Flier points available that meant I could get a 3-pack of Google access points for under $200 AU (I think that’s about $15 in US currency). I’d already put up the Christmas tree, so I figured I could waste a few hours on re-doing the home network. I’m not going to do a full review of the Google Wifi solution, but if you’re interested in that kind of thing, Josh Odgers does a great job of that here. In short, it took me about an hour to place the three access points in the house and get everything connected. I have about 30 – 40 devices running, some of which are hardwired to a switch connected to my ISP’s NBN gateway, and most of which connect wirelessly. 

So What’s The Problem?

The problem was that I’d kind of just jammed the primary Google Wifi point into the network (attached to a dumb switch downstream of the modem). As a result, everything connecting wirelessly via the Google network had an IP range of 192.168.86.x, and all of my other devices were in the existing 10.x.x.x range. This wasn’t a massive problem, as the Google solution does a great job of routing stuff between the “wan” and “lan” subnets, but I started to notice that my pi-hole device wasn’t picking up hostnames properly, and some devices were getting confused about which DNS to use. Oh, and my port mapping for Plex was a bit messed up too. I also had wired devices (i.e. my desktop machine) that couldn’t see Airplay devices on the wireless network without turning on Wifi.
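
The Airplay symptom makes sense when you consider that Airplay discovery relies on multicast DNS, which normally stays within a single subnet. As a rough illustration, here's a sketch using Python's `ipaddress` module to check whether two devices sit on the same side of the split. The /24 and /8 prefix lengths and both device addresses are assumptions for the example, as I've only given the ranges as 192.168.86.x and 10.x.x.x above.

```python
import ipaddress

# Assumed prefix lengths; only the 192.168.86.x and 10.x.x.x ranges are known.
GOOGLE_LAN = ipaddress.ip_network("192.168.86.0/24")
LEGACY_LAN = ipaddress.ip_network("10.0.0.0/8")


def segment_of(address):
    """Name the segment an address belongs to, or None if it is in neither."""
    ip = ipaddress.ip_address(address)
    for name, network in (("google-wifi", GOOGLE_LAN), ("legacy", LEGACY_LAN)):
        if ip in network:
            return name
    return None


def same_segment(first, second):
    """True only when both addresses fall inside the same known segment."""
    segment = segment_of(first)
    return segment is not None and segment == segment_of(second)


# Hypothetical addresses for a wired desktop and a wireless Airplay speaker.
desktop = "10.0.0.20"
speaker = "192.168.86.31"
print(same_segment(desktop, speaker))  # False
```

Unicast traffic gets routed between the two ranges, but discovery that depends on link-local multicast generally won't make the hop without some kind of reflector in between.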

The Solution?

After a lot of Googling, I found part of the solution via this Reddit thread. Basically, what I needed to do was follow a more structured topology, with my primary Google device hanging off my ISP’s switch (and connected via the “wan” port on the Google Wifi device). I then connected the “lan” port on the Google device to my downstream switch (the one with the pi-hole, NAS devices, and other stuff connected to it). 

Now the pi-hole could play nicely on the network, and I could point my devices to it as the DNS server via the Google interface. I also added a few more reservations into my existing list of hostnames on the pi-hole (instructions here) so that it could correctly identify any non-DHCP clients. I also changed the DHCP range on the Google Wifi to a single IP address (the one used by the pi-hole) and made sure that there was a reservation set for the pi-hole on the Google side of things. The reason for this (I think) is that you can’t disable DHCP on the Google Wifi device. To solve the Plex port mapping issue, I set a manual port mapping on my ISP modem and pointed it to the static IP address of the primary Google Wifi device. I then created a port mapping on the Google side of things to point to my Plex Media Server. It took a little while, but eventually everything started to work. 
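
The Plex fix is really just two port forwards chained together, one on the modem and one on the Google Wifi. Here's a toy model of that chain, to show why both hops need a rule. Every address in it is made up for illustration, and 32400 is simply Plex Media Server's default port.

```python
# Hypothetical forwarding rules: (device or address, port) -> (next address, port).
FORWARDS = {
    ("isp-modem", 32400): ("192.168.1.2", 32400),      # modem -> Google Wifi "wan" address
    ("192.168.1.2", 32400): ("192.168.86.50", 32400),  # Google Wifi -> Plex Media Server
}


def resolve(entry, forwards, max_hops=8):
    """Follow port-forward rules from entry until no further rule matches."""
    path = [entry]
    current = entry
    for _ in range(max_hops):
        nxt = forwards.get(current)
        if nxt is None:
            return path
        path.append(nxt)
        current = nxt
    raise RuntimeError("forwarding loop or chain too long")


for hop in resolve(("isp-modem", 32400), FORWARDS):
    print(hop)
```

Remove either rule and the path stops one hop short of the Plex server, which is roughly what my broken remote access looked like.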

It’s also worth noting that I was able to reconfigure the Airport Express devices connected to speakers to join the new Wifi network and I can still use Airplay around the house as I did before.

Conclusion 

This seems like a lot of mucking about for what is meant to be a plug and play wireless solution. In Google’s defence though, my home network topology is a bit more fiddly than the average punter’s would be. If I wasn’t so in love with pi-hole, and didn’t have devices that I wanted to use static IP addresses and DNS, then I wouldn’t have had as many problems as I did with the setup. From a performance and usability standpoint, I think the Google solution is excellent. Of course, this might all go to hell in a hand basket when I ramp up IPv6 in the house, but for now it’s been working well. Couple that with the fact that my networking skills are pretty subpar, and we should all just be happy I was able to post this article on the Internet from my house.

Random Short Take #9

Here are a few links to some random news items and other content that I found interesting. You might find it interesting too. Maybe.

Random Short Take #7

Here are a few links to some random things that I think might be useful, to someone. Maybe.

Random Short Take #6

Welcome to the sixth edition of the Random Short Take. Here are a few links to a few things that I think might be useful, to someone.