Data protection is a funny thing. Much like insurance, most folks understand that it’s important, normally dread having to use it, and dislike the fact that it costs money “just in case something goes wrong”. But then they get hit by ransomware, or Judy in Accounting absolutely destroys a critical spreadsheet, and they realise it’s probably not such a bad thing to have this “data protection”. Books are weird too. Not the idea that we’ll put a whole bunch of information in a file and make it accessible to people. Rather, that sometimes that information is given context and then printed out, sold, read, and stuck on a shelf somewhere for future reference. Indeed, I was a voracious consumer of technical books early in my career, particularly when many vendors were insisting that this was the way to share knowledge with end users. YouTube wasn’t a thing, and access to manuals and reference guides was limited to partners or the vendors themselves. The problem with technical books, however, is that if they cover a specific version of software (or hardware or whatever), they very quickly become outdated in potentially significant ways. As enjoyable as some of those books about Windows NT 4.0 might have been for us all, they quickly became monitor stands when Windows 2000 was released. The more useful books were the ones that shared more of the how, what, when, and why of the topic, rather than digging in to specific guidance on how to do an activity with a particular solution. Particularly when that solution was re-written by the vendor between major versions.
Early on in my career I got involved in my employer’s backup and recovery solution. At the time it was all about GFS backup schemes and DDS-2 drives and per-server protection schemes that mostly worked. It was viewed as an unnecessary expense and given to junior staff to look after. There was a feeling, at least with some of the Windows stuff, that if anything went wrong it would likely go wrong in a big way. I generally felt ill at ease when recovery requests would hit the service desk queue. As a result of this, my interest in being able to bring data back from human error, disaster, or other kinds of failure was piqued, and I went out and bought a copy of Unix Backup and Recovery. As a system administrator, it was a great book to have at hand. There was a nice combination of understandable examples and practical application of backup and recovery principles covered throughout that book. I used to joke that it even had a happy ending, and everyone got their data back. As I moved through my career, I maintained an interest in data protection (it seemed, at one stage, to go hand in hand with storage for whatever reason), and I’ve often wondered what people do when they aren’t given the appropriate guidance on how to best do data protection to meet their needs.
All of this is an extremely long-winded way of saying that my friend W. Curtis Preston has released his fourth book, the snappily titled “Modern Data Protection“, and it makes for some excellent reading. If you listen to him talk about why he wrote another book on his podcast, you’ll appreciate that this thing was over 10 years in the making, had an extensive outline developed for it, and really took a lot of effort to get done. As Curtis points out, he goes out of his way not to name vendors or solutions in the book (he works for Druva). Instead, he spends time on the basics (why backup?), what you should backup, how to backup, and even when you should be backing up things.
This one doesn’t just cover off the traditional centralised server / tape library combo so common for many years in enterprise shops. It also goes into more modern on-premises solutions (I think the kids call them hyper-converged) and cloud-native solutions of all different shapes and sizes. He talks about how to protect a wide variety of workloads and solution architectures, drills in on the importance of recovery testing, and even covers off the difference between backup and archive. Yes, they are different, and I’m not just saying that because I contributed that particular chapter. There’s talk of traditional data sources, deduplication technologies, and more fashionable stuff like Docker and Kubernetes.
The book comes in at a svelte 350ish pages, and you know that each chapter could have almost been a book on its own (or at least a very long whitepaper). That said, Preston does a great job of sticking to the topic at hand, and breaking down potentially complex scenarios in a concise and simple to digest fashion. As I like to say to anyone who’ll listen, this stuff can be hard to get right, and you want to get it right, so it helps if the book you’re using gets it right too.
Should you read this book? Yes. Particularly if you have data or know someone who has data. You may be a seasoned industry veteran or new to the game. It doesn’t matter. You might be a consultant, an architect, or an end user. You might even work at a data protection vendor. There’s something in this for everyone. I was one of the technical editors on this book, fancy myself as knowing about about data protection, and I learnt a lot of stuff. Even if you’re not directly in charge of data protection for your own data or your organisation’s data, this is an extremely useful guide that covers off the things you should be looking at with your existing solution or with a new solution. You can buy it directly from O’Reilly, or from big book sellers. It comes in electronic and physical versions and is well worth checking out. If you don’t believe me, ask Mellor, or Leib – they’ll tell you the same thing.
- Publisher: O’Reilly
- ISBN: 9781492094050
Finally, thanks to Preston for getting me involved in this project, for putting up with my English (AU) spelling, and for signing my copy of Unix Backup and Recovery.