StorONE Announces S1:Backup

StorONE recently announced details of its S1:Backup product. I had the opportunity to talk about the announcement with Gal Naor and George Crump, and thought I’d share some brief thoughts here.

 

The Problem

Talk to people in the tech sector today, and you’ll possibly hear a fair bit about how ransomware is a real problem for them, and a scary one at that. Most of the data protection solution vendors are talking about how they can help customers quickly recover from ransomware events, and some are particularly excited about how they can let you know you’ve been hit in a timely fashion. Which is great. A good data protection solution is definitely important to an organisation’s ability to rapidly recover when things go pop. But what about those software-based solutions that themselves have become targets of the ransomware gangs? What do you do when someone goes after both your primary and secondary storage solution? It costs a lot of money to deliver immutable solutions that are resilient to the nastiness associated with ransomware. Unfortunately, most organisations continue to treat data protection as an overpriced insurance policy and are reluctant to spend more than the bare minimum to keep these types of solutions going. It’s alarming the number of times I’ve spoken to customers using software-based data protection solutions that are out of support with the vendor just to save a few thousand dollars a year in maintenance costs.

 

The StorONE Solution

So what do you get with S1:Backup? Quite a bit, as it happens.

[image courtesy of StorONE]

You get Flash-based data ingestion in an immutable format, with snapshots being taken every 30 seconds.

[image courtesy of StorONE]

You also get fast consolidation of multiple incremental backup jobs (think synthetic fulls, etc.), thanks to the high performance of the StorONE platform. Speaking of performance, you also get quick recovery capabilities, and the other benefits of the StorONE platform (namely high availability and high performance).

And if you’re looking for long term retention that’s affordable, you can take advantage of StorONE’s ability to cope well with 90% capacity utilisation, rapid RAID rebuild times, and the ability to start small and grow.

 

Thoughts and Further Reading

Ransomware is a big problem, particularly when it hits you across both primary and secondary storage platforms. Storage immutability has become a super important piece of the puzzle that vendors are trying to solve. Like many things though, it does require some level of co-operation to make sure non-integrated systems are functioning across the stack in an integrated fashion. There are all kinds of ways to attack this issue, with some hardware vendors insisting that their particular interpretation of immutability is the only way to go, while some software vendors are quite keen on architecting air gaps into solutions to get around the problem. And I’m sure there’s a tape guy sitting up the back muttering about how tape is the ultimate air gap. Whichever way you want to look at it, I don’t think any one vendor has the solution that is 100% guaranteed to keep you safe from the folks in hoodies intent on trashing your data. So I’m pleased that StorONE is looking at this problem and wanting to work with the major vendors to develop a cost-effective solution to the issue. It may not be right for everyone, and that’s fine. But on the face of it, it certainly looks like a compelling solution when compared to rolling your own storage platforms and hoping that you don’t get hit.

Doing data protection well is hard, and made harder by virtue of the fact that many organisations treat it as a necessary evil. Sadly, it seems that CxOs only really start to listen after they’ve been rolled, not beforehand. Sometimes the best you can do is be prepared for when disaster strikes. If something like the StorONE solution is going to be the difference between losing the whole lot, or coming back from an attack quickly, it seems like it’s worth checking out. I can assure you that ignoring the problem will only end in tears. It’s also important to remember that a robust data protection solution is just another piece of the puzzle. You still need to look at your overall security posture, including securing your assets and teaching your staff good habits. Finally, if it seems like I’m taking aim at software-based solutions, I’m not. I’m the first to acknowledge that any system is susceptible if it isn’t architected and deployed in a secure fashion – regardless of whether it’s integrated or not. Anyway, if you’d like another take on the announcement, Mellor covered it here.

Datadobi, DobiProtect, and Forward Progress

I recently had the opportunity to speak with Carl D’Halluin from Datadobi about DobiProtect, and thought I’d share some thoughts here. I wrote about DobiProtect in the past, particularly in relation to disaster recovery and air gaps. Things have progressed since then, as they invariably do, and there’s a bit more to the DobiProtect story now.

 

Ransomware Bad, Data Protection Good

If you’re paying attention to any data protection solution vendors at the moment, you’re no doubt hearing about ransomware attacks. These are considered to be Very Bad Things (™).

What Happens

  • Ransomware comes in through zero-day exploit or email attachments
  • Local drive content encrypted
  • Network shares encrypted – might be fast, might be slow
  • Encrypted file accessed and ransom message appears

How It Happens

Ransomware attacks are executed via many means, including social engineering, software exploits, and “malvertising” (my second favourite non-word next to performant). The timing of these attacks is important to note as well, as some ransomware will lie dormant and launch during a specific time period (a public holiday, for example). Sometimes ransomware will slowly and periodically encrypt content, but generally speaking it will begin encrypting files as quickly as possible. It might not encrypt everything either, but you can bet that it will be a pain regardless.

Defense In Depth

Ransomware protection isn’t just about data protection though. There are many layers you need to consider (and protect), including:

  • Human – hard to control, not very good at doing what they’re told.
  • Physical – securing the locations where data is stored is important.
  • End Points – BYOD can be a pain to manage effectively, and keeping stuff up to date seems to be challenging for the most mature organisations.
  • Networks – there’s a lot of work that needs to go into making sure workloads are both secure and accessible.
  • Application – sometimes they’re just slapped in there and we’re happy they run.
  • Data – It’s everything, but super exposed if you don’t get the rest of this right.

 

DobiProtect Then?

The folks at Datadobi tell me DobiProtect is the ideal solution for protecting the data layer as part of your defence in depth strategy as it is:

  • Software defined
  • Designed for the scale and complexity of file and / or object datasets
  • A solution that complements existing capabilities such as storage system snapshots
  • Easy to deploy and does not impact existing configurations
  • A solution that is cost effective and flexible

 

Where Does It Fit?

DobiProtect plays to the strength of Datadobi – file and object storage. As such, it’s not designed to handle your traditional VM and DB protection; this remains the domain of the usual suspects.

[image courtesy of Datadobi]

Simple Deployment

The software-only nature of the solution, and the flexibility of going between file and object, means that it’s pretty easy to deploy as well.

[image courtesy of Datadobi]

Architecture

From an architecture perspective, it’s pretty straightforward as well, with the Core handling the orchestration and monitoring, and software proxies used for data movement.

[image courtesy of Datadobi]

 

Thoughts

I’ve been involved in the data protection business in some form or another for over two decades now. As you can imagine, I’ve seen a whole bunch of different ways to solve problems. In my day job I generally promote modern approaches to solving the challenge of protecting data in an efficient and cost-effective fashion. It can be hard to do this well, at scale, across the variety of workloads that you find in the modern enterprise nowadays. It’s not just some home directories, a file server, and one database that you have to protect. Now there’s SaaS workloads, 5000 different database options, containers, endpoints, and all kinds of other crazy stuff. The thing linking that all together is data, and the requirement to protect that data in order for the business to do its business – whether that’s selling widgets or providing services to the general public.

Protecting file and object workloads can be a pain. But why not just use a vendor that can roughly do the job rather than using a very specific solution like DobiProtect? I asked D’Halluin the same question, and his response was along the following lines. The kind of customers Datadobi is working with on a regular basis have petabytes of unstructured data they need to protect, and they absolutely need to be sure that it’s being protected properly. Not just from a quality of recovered data perspective, but also from a defensible compliance position. It’s not just about pointing out to the auditors that the data protection solution “should” be working. There’s a lot of legislation and stuff in place to ensure that it needs to be more than that. So it’s oftentimes worth investing in a solution that can reliably deliver against that compliance requirement.

Ransomware attacks can be the stuff of nightmares, particularly if you aren’t prepared. Any solution that is helping you to protect yourself (and, more importantly, recover) from attacks is a Very Good Thing™. Just be sure to check that the solution you’re looking at does what you think it will do. And then check again, because it’s not a matter of if, but when.

Random Short Take #60

Welcome to Random Short Take #60.

  • VMware Cloud Director 10.3 went GA recently, and this post will point you in the right direction when it comes to planning the upgrade process.
  • Speaking of VMware products hitting GA, VMware Cloud Foundation 4.3 became available about a week ago. You can read more about that here.
  • My friend Tony knows a bit about NSX-T, and certificates, so when he bumped into an issue with NSX-T and certificates in his lab, it was no big deal to come up with the fix.
  • Here’s everything you wanted to know about creating an external bootable disk for use with macOS 11 and 12 but were too afraid to ask.
  • I haven’t talked to the good folks at StarWind in a while (I miss you Max!), but this article on the new All-NVMe StarWind Backup Appliance by Paolo made for some interesting reading.
  • I loved this article from Chin-Fah on storage fear, uncertainty, and doubt (FUD). I’ve seen a fair bit of it slung about having been a customer and partner of some big storage vendors over the years.
  • This whitepaper from Preston on some of the challenges with data protection and long-term retention is brilliant and well worth the read.
  • Finally, I don’t know how I came across this article on hacking Playstation 2 machines, but here you go. Worth a read if only for the labels on some of the discs.

Infrascale Puts The Customer First

Disclaimer: I recently attended Storage Field Day 22.  Some expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Infrascale recently presented at Storage Field Day 22. You can see videos of the presentation here, and download my rough notes from here.

 

Infrascale and Customer Experience

Founded in 2011, Infrascale is headquartered in Reston, Virginia, with around 170 employees and offices in Ukraine and India as well. As COO Brian Kuhn points out in the presentation, the company is “[a]ll about customers and their data”. Infrascale’s vision is “to be the most trusted data protection provider”.

Build Trust via Four Ps

Predictable

  • Reliable connections, response time, product
  • Work side by side like a dependable friend

Personal

  • People powered – partners, not numbers
  • Your success is our success

Proficient

  • Support and product experts with the right tools
  • Own the issue from beginning to end

Proactive

  • Onboarding, outreach to proactively help you
  • Identify issues before they impact your business

“Human beings dealing with human beings”

 

Product Portfolio

Infrascale Cloud Application Backup (ICAB)

SaaS Backup

  • Backup Microsoft 365, Google Workspace, Salesforce, Box, and Dropbox
  • Recover individual items (mail, file, or record) or entire mailboxes, folders, or databases
  • Close the retention gap between the SaaS provider and corporate, legal, and / or regulatory policy

Infrascale Cloud Backup (ICB)

Endpoint Backup

  • Backup desktop, laptop, or mobile devices directly to the cloud – wherever you work
  • Recover data in seconds – and with ease
  • Optimised for branch office and remote / home workers
  • Provides ransomware detection and remediation

Infrascale Backup and Disaster Recovery (IBDR)

Backup and DR / DRaaS for Servers

  • Backup mission-critical servers to both an on-premises and bootable cloud appliance
  • Boot ready in ~2 minutes (locally or in the cloud)
  • Restore system images or files / folders
  • Optimised for VMware and Hyper-V VMs and Windows bare metal

 

Digging Deeper with IBDR

What Is It?

Infrascale describes IBDR as a hybrid-cloud solution, with hardware and software on-premises, and service infrastructure in the cloud. In terms of DR as a service, Infrascale provides the ability to back up and replicate your data to a secondary location. In the event of a disaster, customers have the option to restore individual files and folders, or the entire infrastructure if required. Restore locations are flexible as well, with a choice of on-premises or in the cloud. Importantly, you also have the ability to fail back when everything’s sorted out.

One of the nice features of the service is unlimited DR and failover testing, and there are no fees attached to testing, recovery, or disaster failover.

Range

The IBDR solution also comes in a few different versions, as the table below shows.

[image courtesy of Infrascale]

The appliances are also available in a range of shapes and sizes.

[image courtesy of Infrascale]

Replication Options

In terms of replication, there are multiple destinations available, and you can fairly easily fire up workloads in the Infrascale cloud if need be.

[image courtesy of Infrascale]

 

Thoughts and Further Reading

Anyone who’s worked with data protection solutions will understand that it can be difficult to put together a combination of hardware and software that meets the needs of the business from a commercial, technical, and process perspective – particularly when you’re starting at a small scale and moving up from there. Putting together a managed service for data protection and disaster recovery is possibly harder still, given that you’re trying to accommodate a wide variety of use cases and workloads. And doing this using commercial off-the-shelf offerings can be a real pain. You’re invariably tied to the roadmap of the vendor in terms of features, and your timeframes aren’t normally the same as your vendor’s (unless you’re really big). So there’s a lot to be said for doing it yourself. If you can get the software stack right, understand what your target market wants, and get everything working in a cost-effective manner, you’re onto a winner.

I commend Infrascale for the level of thought the company has given to this solution, its willingness to work with partners, and the fact that it’s striving to be the best it can in the market segment it’s targeting. My favourite part of the presentation was hearing the phrase “we treat [data] like it’s our own”. Data protection, as I’ve no doubt rambled on about before, is hard, and your customers are trusting you with getting them out of a pickle when something goes wrong. I think it’s great that the folks at Infrascale have this at the centre of everything they’re doing. I get the impression that it’s “all care, all responsibility” when it comes to the approach taken with this offering. I think this counts for a lot when it comes to data protection and DR as a service offerings. I’ll be interested to see how support for additional workloads gets added to the platform, but what they’re doing now seems to be enough for many organisations. If you want to know more about the solution, the resource library has some handy datasheets, and you can get an idea of some elements of the recommended retail pricing from this document.

Cohesity DataProtect Delivered As A Service – SaaS Connector

I recently wrote about my experience with Cohesity DataProtect Delivered as a Service. One thing I didn’t really go into in that article was the networking and resource requirements for the SaaS Connector deployment. It’s nothing earth-shattering, but I thought it was worthwhile noting nonetheless.

In terms of the VM that you deploy for each SaaS Connector, it has the following system requirements:

  • 4 CPUs
  • 10 GB RAM
  • 20 GB disk space (100 MB throughput, 100 IOPs)
  • Outbound Internet connection

In terms of scalability, the advice from Cohesity at the time of writing is to deploy “one SaaS Connector for each 160 VMs or 16 TB of source data. If you have more data, we recommend that you stagger their first full backups”. Note that this is subject to change. The outbound Internet connectivity is important. You’ll (hopefully) have some kind of firewall in place, so the following ports need to be open.

Port     | Protocol  | Target                         | Direction (from Connector) | Purpose
443      | TCP       | helios.cohesity.com            | Outgoing                   | Connection used for control path
443      | TCP       | helios-data.cohesity.com       | Outgoing                   | Used to send telemetry data
22, 443  | TCP       | rt.cohesity.com                | Outgoing                   | Support channel
11117    | TCP       | *.dmaas.helios.cohesity.com    | Outgoing                   | Connection used for data path
29991    | TCP       | *.dmaas.helios.cohesity.com    | Outgoing                   | Connection used for data path
443      | TCP       | *.cloudfront.net               | Outgoing                   | To download upgrade packages
443      | TCP       | *.amazonaws.com                | Outgoing                   | For S3 data traffic
123, 323 | UDP       | ntp.google.com or internal NTP | Outgoing                   | Clock sync
53       | TCP & UDP | 8.8.8.8 or internal DNS        | Bidirectional              | Host resolution
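If you want a quick sanity check that your firewall rules line up with the port requirements above, something like the following Python sketch can help. It only covers the TCP entries with concrete hostnames (the wildcard targets can’t be probed without knowing a real hostname, and the UDP entries need a different kind of test). The function names and the ENDPOINTS list are my own and not anything Cohesity ships.

```python
import socket

# Concrete (non-wildcard) TCP endpoints from the port requirements above.
# Wildcard targets such as *.dmaas.helios.cohesity.com are left out on purpose.
ENDPOINTS = [
    ("helios.cohesity.com", 443),
    ("helios-data.cohesity.com", 443),
    ("rt.cohesity.com", 22),
    ("rt.cohesity.com", 443),
]


def can_connect(host, port, timeout=5.0):
    """Return True if an outbound TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def check_all(endpoints=ENDPOINTS, timeout=5.0):
    """Map each (host, port) pair to True (reachable) or False."""
    return {(host, port): can_connect(host, port, timeout) for host, port in endpoints}
```

Run check_all() from a machine on the same network segment as the SaaS Connector VM and look for any False results. A successful TCP connection only proves the port is open, not that the service behind it is healthy.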

Cohesity recommends that you deploy more than one SaaS Connector, and you can scale them out depending on the number of VMs / how much data you’re protecting with the service.
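As a rough planning aid, the sizing guidance quoted earlier (one SaaS Connector for each 160 VMs or 16 TB of source data) can be turned into a small calculation. This sketch assumes the two limits are applied independently and the larger result wins, and that “more than one” means a floor of two. Both of those are my reading rather than anything Cohesity has confirmed, and the guidance itself is subject to change.

```python
import math

# Figures quoted from Cohesity's guidance at the time of writing.
VMS_PER_CONNECTOR = 160
TB_PER_CONNECTOR = 16


def connectors_needed(vm_count, source_tb, minimum=2):
    """Estimate the SaaS Connector count for a given environment.

    Assumption: whichever limit (VMs or TB) demands more connectors wins,
    and we never go below `minimum` (two, i.e. "more than one").
    """
    by_vms = math.ceil(vm_count / VMS_PER_CONNECTOR)
    by_data = math.ceil(source_tb / TB_PER_CONNECTOR)
    return max(by_vms, by_data, minimum)
```

For example, 400 VMs holding 20 TB works out to three connectors (driven by VM count), while 50 VMs holding 100 TB works out to seven (driven by capacity).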

If you’re having concerns with bandwidth, you can configure the bandwidth used by the SaaS Connector via Helios.

Navigate to Settings -> SaaS Connections and click on Bandwidth Usage Options. You can then add a rule.

You then schedule bandwidth usage, potentially for quiet times (particularly useful in small environments where Internet connections may be shared with end users). There’s support for upload and download traffic, and multiple schedules as well.
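Before you throttle anything, it’s worth doing the arithmetic on how long a first full backup will take over your link. The sketch below is just that arithmetic in Python. It uses decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits per second), and the efficiency parameter is a hypothetical fudge factor for protocol overhead and contention, not a Cohesity setting.

```python
def transfer_days(data_tb, link_mbps, efficiency=1.0):
    """Days needed to move data_tb terabytes over a link_mbps link.

    efficiency is the fraction of the link actually usable (0 < e <= 1).
    """
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_mbps * 1e6 * efficiency)
    return seconds / 86400
```

A 16 TB source over a dedicated 100 Mbps link comes out at a little under 15 days, and that’s before any throttling schedule is applied. It gives you some idea of why staggering first full backups is recommended.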

And that’s pretty much it. Once you have your SaaS Connectors deployed you can monitor everything from Helios.

 

Random Short Take #58

Welcome to Random Short Take #58.

  • One of the many reasons I like Chin-Fah is that he isn’t afraid to voice his opinion on various things. This article on what enterprise storage is (and isn’t) made for some insightful reading.
  • VMware Cloud Director 10.3 is now GA – you can read more about it here.
  • Feeling good about yourself? That’ll be quite enough of that thanks. This article from Tom on Value Added Resellers (VARs) and technical debt goes in a direction you might not expect. (Spoiler: staff are the technical debt). I don’t miss that part of the industry at all.
  • Speaking of work, this article from Preston on being busy was spot on. I’ve worked in many places in my time where it’s simply alarming how much effort gets expended in not achieving anything. It’s funny how people deal with it in different ways too.
  • I’m not done with articles by Preston though. This one on configuring a NetWorker AFTD target with S3 was enlightening. It’s been a long time since I worked with NetWorker, but this definitely wasn’t an option back then.  Most importantly, as Preston points out, “we backup to recover”, and he does a great job of demonstrating the process end to end.
  • I don’t think I talk about data protection nearly enough on this weblog, so here’s another article from a home user’s perspective on backing up data with macOS.
  • Do you have a few Rubrik environments lying around that you need to report on? Frederic has you covered.
  • Finally, the good folks at Backblaze are changing the way they do storage pods. You can read more about that here.

*Bonus Round*

I think this is the 1000th post I’ve published here. Thanks to everyone who continues to read it. I’ll be having a morning tea soon.

Cohesity DataProtect Delivered As A Service – A Few Notes

As part of a recent vExpert giveaway the folks at Cohesity gave me a 30-day trial of the Cohesity DataProtect Delivered as a Service offering. This is a component of Cohesity’s Data Management as a Service (DMaaS) offering and, despite the slightly unwieldy name, it’s a pretty neat solution. I want to be clear that it’s been a little while since I had any real stick time with Cohesity’s DataProtect offering, and I’m looking at this in a friend’s home lab, so I’m making no comments or assertions regarding the performance of the service. I’d also like to be clear that I’m not making any recommendation one way or another with regards to the suitability of this service for your organisation. Every organisation has its own requirements and it’s up to you to determine whether this is the right thing for you.

 

Overview

I’ve added a longer article here that explains the setup process in more depth, but here’s the upshot of what you need to do to get up and running. In short, you sign up, select the region you want to backup workloads to, configure your SaaS Connectors for the particular workloads you’d like to protect, and then go nuts. It’s really pretty simple.

Workloads

The following workloads are currently supported:

  • Hypervisors (VMware and Hyper-V);
  • NAS (generic SMB and NFS, Isilon, and NetApp);
  • Microsoft SQL Server;
  • Oracle;
  • Microsoft 365;
  • Amazon AWS; and
  • Physical hosts.

This list will obviously grow as some of the support for particular workloads with DataProtect and Helios improves over time.

Regions

The service is currently available in seven AWS Regions:

  • US East (Ohio)
  • US East (N. Virginia)
  • US West (Oregon)
  • US West (N. California)
  • Canada (Central)
  • Asia Pacific (Sydney)
  • Europe (Frankfurt)

You’ve got some flexibility in terms of where you store your data, but it’s my understanding that the telemetry data (i.e. Helios) goes to one of the US East Regions. It’s also important to note that once you’ve put data in a particular Region, you can’t then move that data to another Region.

Encryption

Data is encrypted in-flight and at rest, and you have a choice of KMS solutions (Cohesity-managed or DIY AWS KMS). Note that once you choose a KMS, you cannot change your mind. Well, you can, but you can’t do anything about it.

 

Thoughts

Data protection as a service offerings are proving increasingly popular with customers, data protection vendors, and service providers. The appeal for the punters is that they can apply some of the same thinking to protecting their investment in their cloud as they did to standing it up in the first place. The appeal for the vendors and SPs is that they can deliver service across a range of platforms without shipping tin anywhere, and build up annuity business as well.

With regards to this particular solution, it still has some rough edges, but it’s great to see just how much can already be achieved. As I mentioned, it’s been a while since I had some time with DataProtect, and some of the usability and functionality of both it and Helios has really come along in leaps and bounds. And the beauty of this being a vendor-delivered as a Service offering is that features can be rolled out on a frequent basis, rather than waiting for quarterly improvements to arrive via regularly scheduled software maintenance releases. Once you get your head around the workload, things tend to work as expected, and it was fairly simple to get everything set up and working in a short period of time.

This isn’t for everyone, obviously. If you’re not a fan of doing things in AWS, then you’re really not going to like how this works. And if you don’t operate near one of the currently supported Regions, then the tyranny of bandwidth (i.e. physics) may prevent reasonable recovery times from being achievable for you. It might seem a bit silly, but these are nonetheless things you need to consider when looking at adopting a service like this. It’s also important to think of the security posture of these kinds of services. Sure, things are encrypted, and you can use MFA with Helios, but folks outside the US sometimes don’t really dig the idea of any of their telemetry data living in the US. Sure, it’s a little bit tinfoil hat but you’d be surprised how much it comes up. And it should be noted that this is the same for on-premises Cohesity solutions using Helios. Then again, Cohesity is by no means alone in sending telemetry data back for support and analysis purposes. It’s fairly common, and something your infosec team will likely already know how to deal with.

If you’re fine with that (and you probably should be), and looking to move away from protecting your data with on-premises solutions, or looking for something that gives you some flexible deployment and management options, this could be of interest. As I mentioned, the beauty of SaaS-based solutions is that they’re more frequently updated by the vendor with fixes and features. Plus you don’t need to do a lot of the heavy lifting in terms of care and feeding of the environment. You’ll also notice that this is the DataProtect component, and I imagine that Cohesity has plans to fill out the Data Management part of the solution more thoroughly in the future. If you’d like to try it for yourself, I believe there’s a trial you can sign up for. Finally, thanks to the Cohesity TAG folks for the vExpert giveaway and making this available to people like me.

Ransomware? More Like Ransom Everywhere …

Stupid title, but ransomware has been in the news quite a bit recently. I’ve had some tabs open in my browser for over twelve months with articles about ransomware that I found interesting. I thought it was time to share them and get this post out there. This isn’t comprehensive by any stretch, but rather it’s a list of a few things to look at when looking into anti-ransomware solutions, particularly for NAS environments.

 

It Kicked Him Right In The NAS

The way I see it (and I’m really not the world’s strongest security person), there are (at least) three approaches to NAS and ransomware concerns.

The Endpoint

This seems to be where most companies operate – addressing ransomware as it enters the organisation via the end users. There are a bunch of solutions out there that are designed to protect humans from themselves. But this approach doesn’t always help with alternative attack vectors and it’s only as good as the update processes you have in place to keep those endpoints updated. I’ve worked in a few shops where endpoint protection solutions were deployed and then inadvertently clobbered by system updates or users with too many privileges. The end result was that the systems didn’t do what they were meant to and there was much angst.

The NAS Itself

There are things you can do with NetApp solutions, for example, that are kind of interesting. Something like Stealthbits looks neat, and Varonis also uses FPolicy to get a similar result. Your mileage will vary with some of these solutions, and, again, it comes down to the ability to effectively ensure that these systems are doing what they say they will, when they will.

Data Protection

A number of the data protection vendors are talking about their ability to recover quickly from ransomware attacks. The capabilities vary, as they always do, but most of them have a solid handle on quick recovery once an infection is discovered. They can even help you discover that infection by analysing patterns in your data protection activities. For example, if a whole bunch of data changes overnight, it’s likely that you have a bit of a problem. But some of the effectiveness of these solutions is limited by the frequency of data protection activity, and whether anyone is reading the alerts. The challenge here is that it’s a reactive approach, rather than something preventative. That said, companies like Rubrik are working hard to enhance capabilities such as Radar into something a whole lot more interesting.
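To make the “whole bunch of data changes overnight” idea a little more concrete, here’s a deliberately naive Python sketch that flags a backup whose change rate sits well outside what’s been seen recently. It is not how Rubrik Radar or any other vendor’s detection works. The function name, the three-sigma threshold, and the 5% floor are all assumptions I’ve picked for illustration.

```python
from statistics import mean, stdev


def is_suspicious(history, latest, sigmas=3.0, floor=0.05):
    """Flag a backup whose changed-data fraction looks abnormal.

    history: recent daily change rates as fractions (e.g. 0.02 for 2%).
    latest:  the change rate of the backup being checked.
    floor:   minimum threshold, so very quiet systems don't alert on noise.
    """
    if len(history) < 2:
        # Not enough data for a standard deviation; fall back to the floor.
        threshold = floor
    else:
        threshold = max(mean(history) + sigmas * stdev(history), floor)
    return latest > threshold
```

With a history hovering between 1% and 2% daily change, a night where 60% of the data changes gets flagged, while a 2% night doesn’t. As noted above, it only helps if the backups run often enough and someone reads the alert.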

Other Things

Other things that can help limit your exposure to ransomware include adopting generally robust security practices across the board, monitoring all of your systems, and talking to your users about not clicking on unknown links in emails. Some of these things are easier to do than others.

 

Thoughts

I don’t think any of these solutions provide everything you need in isolation, but the challenge is going to be coming up with something that is supportable and, potentially, affordable. It would also be great if it works too. Ransomware is a problem, and becoming a bigger problem every day. I don’t want to sound like I’m selling you insurance, but it’s almost not a question of if, but when. But paying attention to some of the above points will help you on your way. Of course, sometimes Sod’s Law applies, and things will go badly for you no matter how well you think you’ve designed your systems. At that point, it’s going to be really important that you’ve set up your data protection systems correctly, otherwise you’re in for a tough time. Remember, it’s always worth thinking about what your data is worth to you when you’re evaluating the relative value of security and data protection solutions. This article from Chin-Fah had some interesting insights into the problem. And this article from Cohesity outlined a comprehensive approach to holistic cyber security. This article from Andrew over at Pure Storage did a great job of outlining some of the challenges faced by organisations when rolling out these systems. This list of NIST ransomware resources from Melissa is great. And if you’re looking for a useful resource on ransomware from VMware’s perspective, check out this site.

ComplyTrust And The Right To Be Forgotten

I came across a solution from ComplyTrust a little while ago and thought it was worth mentioning here. I am by no means any kind of authority with this kind of stuff so this is very much a high-level view.

 

The Problem

Over the last little while (decades even?), a number of countries and local authorities have tightened up privacy regulations in the hope that citizens would have some level of protection from big corporations mercilessly exploiting their personal information for commercial gain. A number of these regulations (General Data Protection Regulation, California Consumer Privacy Act, etc.) include the idea of “the right to be forgotten”. This gives citizens the right to request, in particular circumstances, that data about them is not kept by particular organisations. Why is this important? We have pretty good privacy protection in Australia, but I still get recruiters moving from one organisation to another and taking contacts with them.

How Does This Happen? 

Think of all the backups of data that organisations make. Now think of how long some of those get kept for. For every one restore you do, you might have made 100 backups. Depending on what an organisation is doing for data protection, there are potentially thousands of copies of records relating to you stored on their infrastructure. And then, when a company gets acquired, all that data gets passed on to the acquiring company. Suddenly it becomes that much more difficult to keep track of which company has your data on file.
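
To put a rough number on how quickly those copies add up, here's a back-of-the-envelope sketch. The retention figures, the number of systems, and the number of replicas are all made up for illustration, and it assumes every retained backup contains the record and that restore points aren't double-counted across tiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetentionPolicy:
    """Hypothetical retention schedule: how many restore points are kept per tier."""
    dailies: int
    weeklies: int
    monthlies: int
    yearlies: int

    def retained_copies(self) -> int:
        # Assumes tiers don't overlap (a monthly isn't also counted as a daily).
        return self.dailies + self.weeklies + self.monthlies + self.yearlies


def copies_of_record(policy: RetentionPolicy, systems: int, replicas: int) -> int:
    """Estimate how many copies of one person's record are sitting in backups.

    systems  -- number of applications or databases holding the record
    replicas -- number of places each backup is stored (on-site, off-site, cloud)
    """
    return policy.retained_copies() * systems * replicas


if __name__ == "__main__":
    # Illustrative numbers only, not taken from any real organisation.
    policy = RetentionPolicy(dailies=30, weeklies=12, monthlies=12, yearlies=7)
    print(copies_of_record(policy, systems=10, replicas=3))  # prints 1830
```

Even with fairly modest assumptions you land in the thousands, which is why "just delete the record" is harder than it sounds.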

Not a week goes by where I don’t get an offer to buy contact details of VMware users or people interested in cloud products. There is a whole world of B2B marketing going on where your details are being sold for a very low price. Granted, some of this is illegitimate in the first place, so regulations aren’t really going to help you. But the right to be removed from various databases around the place is still important, and something that governments are starting to pay more attention to.

The challenge for these organisations is that they can’t exactly keep a database of people they’re meant to forget – it defeats the purpose of the exercise.

 

The Solution?

So what’s one possible solution? Forget-Me-Yes (FMY) is a “Software-as-a-Service API Platform specifically manages both organizational and individual Right-to-be-Forgotten (RtbF) and Right-of-Erase (RoE) compliance of structured data for Brazil’s LGPD, Europe’s GDPR, California Consumer Privacy Act (CCPA), Virginia CDPA, Nevada SB220, and Washington Privacy Act (WPA)”.

It’s a SaaS offering going for US $39.99 per month. To get started, you authenticate the service with one or more databases you want to manage. In version 1 of the software, it only supports Salesforce. I understand that ComplyTrust is looking to expand support to get the solution working with Shopify, Marketo, and a generic SQL plugin. It stores just enough information to uniquely identify the person, and no more than that.
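
I don't know how ComplyTrust implements this under the covers, so the following is not their method. It's just a minimal sketch of one common way to remember who to forget without keeping their details: store a keyed hash of a normalised identifier and check records against it, for example after restoring an old backup. The `SuppressionList` class, the `email` field, and the use of HMAC-SHA256 are all my assumptions for illustration.

```python
import hashlib
import hmac


def fingerprint(email: str, secret_key: bytes) -> str:
    """Return a keyed hash of a normalised email address (assumed identifier)."""
    normalised = email.strip().lower()
    return hmac.new(secret_key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


class SuppressionList:
    """Hypothetical store that keeps only fingerprints, never the raw identifier."""

    def __init__(self, secret_key: bytes) -> None:
        self._secret_key = secret_key
        self._fingerprints = set()

    def forget(self, email: str) -> None:
        # Record that this person asked to be forgotten.
        self._fingerprints.add(fingerprint(email, self._secret_key))

    def is_forgotten(self, email: str) -> bool:
        return fingerprint(email, self._secret_key) in self._fingerprints

    def filter_records(self, records: list) -> list:
        # Drop any record whose email matches a forgotten fingerprint,
        # e.g. when re-importing data from a restored backup.
        return [r for r in records if not self.is_forgotten(r["email"])]


if __name__ == "__main__":
    suppression = SuppressionList(secret_key=b"replace-with-a-real-secret")
    suppression.forget(" Jane@Example.com ")

    restored = [{"email": "jane@example.com"}, {"email": "bob@example.com"}]
    print(suppression.filter_records(restored))
```

The point of the keyed hash is that the list on its own doesn't reveal who is on it, but you can still recognise a forgotten person's record when it turns up again. Whether a hashed identifier counts as personal data is a question for your lawyers rather than for me.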

 

Thoughts and Further Reading

Some of us want to be remembered forever, but most of us place more value on the choice not to be remembered forever. As I said at the start, I have very little real understanding of the depth and breadth of some of the privacy issues facing both citizens and corporations alike. That said, working closely with data protection offerings on a daily basis, and being focused on data retention for fun and profit, I can see how this is going to become something of a hot topic as the world gets back to spending time trying to understand the implications of keeping scads of data on folks without their consent. Clearly, a solution like this from ComplyTrust isn’t the final word in addressing the issue, but it’s nice to see that folks are taking this problem seriously. I’m looking forward to hearing more about this product as it evolves in the next little while.

MDP – Yeah You Know Me

Data protection is a funny thing. Much like insurance, most folks understand that it’s important, normally dread having to use it, and dislike the fact that it costs money “just in case something goes wrong”. But then they get hit by ransomware, or Judy in Accounting absolutely destroys a critical spreadsheet, and they realise it’s probably not such a bad thing to have this “data protection”.

Books are weird too. Not the idea that we’ll put a whole bunch of information in a file and make it accessible to people. Rather, that sometimes that information is given context and then printed out, sold, read, and stuck on a shelf somewhere for future reference. Indeed, I was a voracious consumer of technical books early in my career, particularly when many vendors were insisting that this was the way to share knowledge with end users. YouTube wasn’t a thing, and access to manuals and reference guides was limited to partners or the vendors themselves. The problem with technical books, however, is that if they cover a specific version of software (or hardware or whatever), they very quickly become outdated in potentially significant ways. As enjoyable as some of those books about Windows NT 4.0 might have been for us all, they quickly became monitor stands when Windows 2000 was released. The more useful books were the ones that shared more of the how, what, when, and why of the topic, rather than digging into specific guidance on how to do an activity with a particular solution. Particularly when that solution was re-written by the vendor between major versions.

Early on in my career I got involved in my employer’s backup and recovery solution. At the time it was all about GFS backup schemes and DDS-2 drives and per-server protection schemes that mostly worked. It was viewed as an unnecessary expense and given to junior staff to look after. There was a feeling, at least with some of the Windows stuff, that if anything went wrong it would likely go wrong in a big way. I generally felt ill at ease when recovery requests would hit the service desk queue. As a result of this, my interest in being able to bring data back from human error, disaster, or other kinds of failure was piqued, and I went out and bought a copy of Unix Backup and Recovery. As a system administrator, it was a great book to have at hand. There was a nice combination of understandable examples and practical application of backup and recovery principles covered throughout that book. I used to joke that it even had a happy ending, and everyone got their data back. As I moved through my career, I maintained an interest in data protection (it seemed, at one stage, to go hand in hand with storage for whatever reason), and I’ve often wondered what people do when they aren’t given the appropriate guidance on how to best do data protection to meet their needs.

All of this is an extremely long-winded way of saying that my friend W. Curtis Preston has released his fourth book, the snappily titled “Modern Data Protection”, and it makes for some excellent reading. If you listen to him talk about why he wrote another book on his podcast, you’ll appreciate that this thing was over 10 years in the making, had an extensive outline developed for it, and really took a lot of effort to get done. As Curtis points out, he goes out of his way not to name vendors or solutions in the book (he works for Druva). Instead, he spends time on the basics (why backup?), what you should back up, how to back up, and even when you should be backing up things.

This one doesn’t just cover off the traditional centralised server / tape library combo so common for many years in enterprise shops. It also goes into more modern on-premises solutions (I think the kids call them hyper-converged) and cloud-native solutions of all different shapes and sizes. He talks about how to protect a wide variety of workloads and solution architectures, drills in on the importance of recovery testing, and even covers off the difference between backup and archive. Yes, they are different, and I’m not just saying that because I contributed that particular chapter. There’s talk of traditional data sources, deduplication technologies, and more fashionable stuff like Docker and Kubernetes.

The book comes in at a svelte 350ish pages, and you know that each chapter could have almost been a book on its own (or at least a very long whitepaper). That said, Preston does a great job of sticking to the topic at hand, and breaking down potentially complex scenarios in a concise and simple to digest fashion. As I like to say to anyone who’ll listen, this stuff can be hard to get right, and you want to get it right, so it helps if the book you’re using gets it right too.

Should you read this book? Yes. Particularly if you have data or know someone who has data. You may be a seasoned industry veteran or new to the game. It doesn’t matter. You might be a consultant, an architect, or an end user. You might even work at a data protection vendor. There’s something in this for everyone. I was one of the technical editors on this book, fancy myself as knowing about data protection, and I learnt a lot of stuff. Even if you’re not directly in charge of data protection for your own data or your organisation’s data, this is an extremely useful guide that covers off the things you should be looking at with your existing solution or with a new solution. You can buy it directly from O’Reilly, or from big book sellers. It comes in electronic and physical versions and is well worth checking out. If you don’t believe me, ask Mellor, or Leib – they’ll tell you the same thing.

  • Publisher: O’Reilly
  • ISBN: 9781492094050

Finally, thanks to Preston for getting me involved in this project, for putting up with my English (AU) spelling, and for signing my copy of Unix Backup and Recovery.