Random Short Take #77

Welcome to Random Short Take #77. Spring has sprung. Let’s get random.

Finally, the blog turned 15 years old recently (about a month ago). I’ve been so busy with the day job that I forgot to appropriately mark the occasion. But I thought we should do something. So if you’d like some stickers (I have some small ones for laptops, and some big ones because I can’t measure things properly), send me your address via this contact form and I’ll send you something as a thank you for reading along.

Random Short Take #75

Welcome to Random Short Take #75. Half the year has passed us by already. Let’s get random.

  • I talk about GiB all the time when sizing up VMware Cloud on AWS for customers, but I should take the time to check in with folks about whether they know what I’m blithering on about. If you don’t know, this explainer from my friend Vincent is easy to follow along with – A little bit about Gigabyte (GB) and Gibibyte (GiB) in computer storage. There’s also a quick worked example after this list.
  • MinIO has been in the news a bit recently, but this article from my friend Chin-Fah is much more interesting than all of that drama – Beyond the WORM with MinIO object storage.
  • Jeff Geerling seems to do a lot of projects that I either can’t afford to do, or don’t have the time to do. Either way, thanks Jeff. This latest one – Building a fast all-SSD NAS (on a budget) – looked like fun.
  • You like ransomware? What if I told you you can have it cross-platform? Excited yet? Read Melissa’s article on Multiplatform Ransomware for a more thorough view of what’s going on out there.
  • Speaking of storage and clouds, Chris M. Evans recently published a series of videos over at Architecting IT where he talks to NetApp’s Matt Watt about the company’s hybrid cloud strategy. You can see it here.
  • Speaking of traditional infrastructure companies doing things with hyperscalers, here’s the July 2022 edition of What’s New in VMware Cloud on AWS.
  • In press release news, Aparavi and Backblaze have joined forces. You can read more about that here.
  • I’ve spent a lot of money over the years trying to find the perfect media streaming device for home. I currently favour the Apple TV 4K, but only because my Boxee Box can’t keep up with more modern codecs. This article on the Best Device for Streaming for Any User – 2022 seems to line up well with my experiences to date, although I admit I haven’t tried the NVIDIA device yet. I do miss playing ISOs over the network with the HD Mediabox 100, but those were simpler times I guess.
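For anyone who’d rather see the arithmetic than read the whole explainer, here’s a quick worked example (plain Python, nothing vendor-specific) of why the GB/GiB distinction matters when sizing: the same raw capacity reads roughly 7% smaller when you count in GiB.

```python
# The same raw capacity expressed in GB (decimal) and GiB (binary).
GB = 10**9        # gigabyte: 1,000,000,000 bytes
GiB = 2**30       # gibibyte: 1,073,741,824 bytes

raw_bytes = 2 * 10**12                # a "2 TB" drive, as marketed (decimal)

print(f"{raw_bytes / GB:.2f} GB")     # 2000.00 GB
print(f"{raw_bytes / GiB:.2f} GiB")   # 1862.65 GiB - the figure your OS tends to report
```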

Random Short Take #72

This one is a little behind thanks to some work travel, but whatever. Let’s get random.

Random Short Take #70

Welcome to Random Short Take #70. Let’s get random.

Aparavi Announces File Protect & Insight – Helps With Third Drawer Down

I recently had the opportunity to speak to Victoria Grey (CMO), Darryl Richardson (Chief Product Evangelist), and Jonathan Calmes (VP Business Development) from Aparavi regarding their File Protect and Insight solution. If you’re a regular reader, you may remember I’m quite a fan of Aparavi’s approach and have written about them a few times. I thought I’d share some of my thoughts on the announcement here.


FPI?

The title is a little messy, but think of your unstructured data in the same way you might look at the third drawer down in your kitchen. There’s a bunch of stuff in there and no-one knows what it all does, but you know it has some value. Aparavi describes File Protect and Insight (FPI) as “[f]ile by file data protection and archive for servers, endpoints and storage devices featuring data classification, content level search, and hybrid cloud retention and versioning”. It takes the data you’re not necessarily sure about, and makes it useful. Potentially.

It comes with a range of features out of the box, including:

  • Data Awareness
    • Data classification
    • Metadata aggregation
    • Policy driven workflows
  • Global Security
    • Role-based permissions
    • Encryption (in-flight and at rest)
    • File versioning
  • Data Search and Access
    • Anywhere / anytime file access
    • Seamless cloud integration
    • Full-content search


How Does It Work?

The solution is fairly simple to deploy. There’s a software appliance installed on-premises (this is known as the aggregator). There’s a web-accessible management console, and you configure your sources to be protected via network access.

[image courtesy of Aparavi]

You get the ability to mount backup data from any point in time, and you can provide a path that can be shared via the network to give users access to that data. Regardless of where you end up storing the data, the index stays on-premises, and searches run against the index, not the source. This keeps searches fast and avoids pulling data back from the storage target just to find something. There’s also a good story to be had in terms of cloud provider compatibility. And if you’re looking to work with an on-premises / generic S3 provider, chances are high that the solution won’t have too many issues with that either.
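To make that a little more concrete, here’s a rough sketch of the idea (my own illustration, not Aparavi’s implementation; the entry fields and locations are made up): the search runs entirely against a local index, and the protected copy is only touched once a specific file is selected for retrieval.

```python
# Hypothetical on-premises index in front of cloud-resident backup data.
# Searching only reads the local index; the cloud copy is touched only on retrieval.
from dataclasses import dataclass

@dataclass
class IndexEntry:
    path: str            # original file path
    point_in_time: str   # backup version this entry belongs to
    location: str        # where the protected copy actually lives (e.g. an object store URI)

local_index = [
    IndexEntry("/finance/q3-report.xlsx", "2020-06-01", "s3://backups/objects/ab12"),
    IndexEntry("/hr/policies.docx", "2020-06-01", "s3://backups/objects/cd34"),
]

def search(term: str) -> list:
    """Runs entirely against the local index - no cloud I/O, no egress."""
    return [e for e in local_index if term.lower() in e.path.lower()]

for hit in search("report"):
    print(f"{hit.path} ({hit.point_in_time}) -> fetch from {hit.location} only if selected")
```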


Thoughts

Data protection is hard to do well at the best of times, and data management is even harder to get right. Enterprises are busy generating terabytes of data and are struggling to a) protect it successfully, and b) make use of that protected data in an intelligent fashion. It seems that it’s no longer enough to have a good story around periodic data protection – most of the vendors have proven themselves capable in this regard. What differentiates companies is the ability to make use of that protected data in new and innovative ways that can increase the value of that data to the business that’s generating it.

Companies like Aparavi are doing a pretty good job of taking the madness that is your third drawer down and providing you with some semblance of order in the chaos. This can be a real advantage in the enterprise, not only for day-to-day data protection activities, but also for extended retention and compliance challenges, as well as storage optimisation challenges that you may face. You still need to understand what the data is, but something like FPI can help you declutter that data, making it easier to understand.

I also like some of the ransomware detection capabilities being built into the product. It’s relatively rudimentary for the moment, but keeping a close eye on the percentage of changed data is a good indicator of whether or not something is going badly wrong with the data sources you’re trying to protect. And if you find yourself the victim of a ransomware attack, the theory is that Aparavi has been storing a secondary, immutable copy of your data that you can recover from.
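As a rough illustration of that change-rate idea (a generic sketch, not how Aparavi actually implements its detection; the threshold is entirely an assumption), you can flag a protection run when the proportion of files whose content has changed since the last run jumps well above normal churn:

```python
# Generic sketch of change-rate monitoring as a ransomware tell: an unusually
# high proportion of changed files between runs is worth an alert.
import hashlib

def fingerprints(files: dict) -> dict:
    """Map each path to a SHA-256 digest of its content."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}

def changed_fraction(previous: dict, current: dict) -> float:
    """Fraction of previously-seen files whose content differs this run."""
    if not previous:
        return 0.0
    changed = sum(1 for path, digest in previous.items() if current.get(path) != digest)
    return changed / len(previous)

# Toy example: two of three files have changed (encrypted?) since the last run.
last_run = fingerprints({"a.doc": b"hello", "b.doc": b"world", "c.doc": b"unchanged"})
this_run = fingerprints({"a.doc": b"XK91...", "b.doc": b"QP42...", "c.doc": b"unchanged"})

ALERT_THRESHOLD = 0.30   # assumed "normal" churn for this data set
rate = changed_fraction(last_run, this_run)
print(f"{rate:.0%} of files changed" + (" - investigate!" if rate > ALERT_THRESHOLD else ""))
```

In practice you’d tune the threshold per workload and combine it with other signals, but the percentage-of-changed-data metric described above is the heart of it.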

People want a lot of different things from their data protection solutions, and sometimes it’s easy to expect more than is reasonable from these products without really considering some of the complexity that can arise from that increased level of expectation. That said, it’s not unreasonable that your data protection vendors should be talking to you about data management challenges and deriving extra value from your secondary data. A number of people have a number of ways to do this, and not every way will be right for you. But if you’ve started noticing a data sprawl problem, or you’re looking to get a bit more from your data protection solution, particularly for unstructured data, Aparavi might be of some interest. You can read the announcement here.

Aparavi Announces Enhancements, Makes A Good Thing Better

I recently had the opportunity to speak to Victoria Grey (CMO) and Jonathan Calmes (VP Business Development) from Aparavi regarding some updates to their Active Archive solution. If you’re a regular reader, you may remember I’m quite a fan of Aparavi’s approach. I thought I’d share some of my thoughts on the announcement here.


Aparavi?

According to Aparavi, Active Archive delivers “SaaS-based Intelligent, Multi-Cloud Data Management”. The idea is that:

  • Data is archived to cloud or on-premises based on policies for long-term lifecycle management;
  • Data is organised for easy access and retrieval; and
  • Data is accessible via Contextual Search.

Sounds pretty neat. So what’s new?


What’s New?

Direct-to-cloud

Direct-to-cloud provides the ability to archive data directly from source systems to the cloud destination of choice, with minimal local storage requirements. Instead of having to store archive data locally, you can now send bits of it straight to cloud, minimising your on-premises footprint.

  • Now supporting AWS, Backblaze B2, Caringo, Google, IBM Cloud, Microsoft Azure, Oracle Cloud, Scality, and Wasabi;
  • Trickle or bulk data migration – Adding bulk migration of data from one storage destination to another; and
  • Dynamic translation from cloud to cloud.

[image courtesy of Aparavi]

Data Classification

The Active Archive solution can now index, classify, and tag archived data. This makes it simple to classify data based on individual words, phrases, dates, file types, and patterns. Users can easily identify and tag data for future retrieval purposes such as compliance, reference, or analysis.

  • Customisable taxonomy using specific words, phrases, patterns, or metadata
  • Pre-set classifications of “legal”, “confidential”, and PII
  • Easy to add new ad hoc classifications at any time
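As a minimal sketch of how word/phrase/pattern-based classification works in general (my own illustration, not Aparavi’s classifier; the label names mirror the pre-set classifications above, but the patterns are assumptions):

```python
# Minimal sketch of tagging content against word/phrase/pattern rules.
# Label names mirror the pre-set classifications above; the patterns are illustrative only.
import re

CLASSIFICATION_RULES = {
    "legal":        [r"\battorney\b", r"\bnon-disclosure\b", r"\bcontract\b"],
    "confidential": [r"\bconfidential\b", r"\binternal use only\b"],
    "pii":          [r"\b\d{3}-\d{2}-\d{4}\b",          # SSN-style number
                     r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"],   # email address
}

def classify(text: str) -> set:
    """Return every classification whose patterns match the supplied content."""
    return {
        label
        for label, patterns in CLASSIFICATION_RULES.items()
        if any(re.search(p, text, re.IGNORECASE) for p in patterns)
    }

print(classify("This contract is confidential. Contact legal@example.com."))
# {'legal', 'confidential', 'pii'} (a set, so order may vary)
```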

Advanced Archive Search

Intuitive query interface

  • Search by metadata including classifications, tags, dates, file name, file type, optionally with wildcards
  • Search within document content using words, phrases, patterns, and complex queries
  • Searches across all locations
  • Contextual Search: produces results of the match within context
  • No retrieval until file is selected; no egress fees until retrieved


Conclusion

I was pretty enthusiastic about Aparavi when they came out of stealth, and I’m excited about some of the new features they’ve added to the solution. Data management is a hard nut to crack. Primarily because a lot of different organisations have a lot of different requirements for storing data long term. And there are a lot of different types of data that need to be stored. Aparavi isn’t a silver bullet for data management by any stretch, but it certainly seems to meet a lot of the foundational requirements for a solid archive strategy. There are some excellent options in terms of storage by location, search, and organisation.

The cool thing isn’t just that they’ve developed a solid multi-cloud story. Rather, it’s that there are options when it comes to the type of data mobility the user might require. They can choose to do bulk migrations, or take it slower by trickling data to the destination. This provides for some neat flexibility in terms of infrastructure requirements and windows of opportunity. It strikes me that it’s the sort of solution that can be tailored to work with a business’s requirements, rather than pushing it in a certain direction.

I’m also a big fan of Aparavi’s “Open Data” access approach, with an open API that “enables access to archived data for use outside of Aparavi”, along with a published data format for independent data access. It’s a nice change from platforms that feel the need to lock data into proprietary formats in order to store it long term. There’s a good chance the type of data you want to archive in the long term will be around longer than some of these archive solutions, so it’s nice to know you’ve got a chance of getting the data back if something doesn’t work out for the archive software vendor. I think it’s worth keeping an eye on Aparavi; they seem to be taking a fresh approach to what has become a vexing problem for many.

Aparavi Comes Out Of Stealth. Dazzles.

Santa Monica-based (I love that place) SaaS data protection outfit, Aparavi, recently came out of stealth, and I thought it was worthwhile covering their initial offering.


So Latin Then?

What’s an Aparavi? It’s apparently Latin and means “[t]o prepare, make ready, and equip”. The way we consume infrastructure has changed, but a lot of data protection products haven’t changed to accommodate this. Aparavi are keen to change that, and tell me that their product is “designed to work seamlessly alongside your business continuity plan to ease the burden of compliance and data protection for mid market companies”. Sounds pretty neat, so how does it work?


Architecture

Aparavi uses a three-tiered architecture written in Node.js and C++. It consists of:

  • The Aparavi hosted platform;
  • An on-premises software appliance; and
  • A source client.

[image courtesy of Aparavi]

The platform is available as a separate module if required, otherwise it’s hosted on Aparavi’s infrastructure. The software appliance is the relationship manager in the solution. It performs in-line deduplication and compression. The source client can be used as a temporary recovery location if required. AES-256 encryption is done at the source, and the metadata is also encrypted. Key storage is all handled via keyring-style encryption mechanisms. There is communication between the web platform and the appliance, but the appliance can operate when the platform is off-line if required.
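To illustrate what “AES-256 encryption at the source” generally looks like, here’s a small generic sketch (my own, using Python’s zlib and the cryptography library; it isn’t Aparavi’s code, and the compress-then-encrypt ordering is simply the conventional one, since ciphertext doesn’t compress):

```python
# Generic sketch of source-side protection: compress, then AES-256-GCM encrypt,
# so only ciphertext ever leaves the client. Not Aparavi's implementation.
import os
import zlib
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def protect(plaintext: bytes, key: bytes):
    """Compress and encrypt a chunk of source data; returns (nonce, ciphertext)."""
    compressed = zlib.compress(plaintext)
    nonce = os.urandom(12)                 # unique per encryption operation
    return nonce, AESGCM(key).encrypt(nonce, compressed, None)

def restore(nonce: bytes, ciphertext: bytes, key: bytes) -> bytes:
    """Decrypt and decompress on recovery."""
    return zlib.decompress(AESGCM(key).decrypt(nonce, ciphertext, None))

key = AESGCM.generate_key(bit_length=256)  # in a real product this would live in a key store
nonce, ct = protect(b"payroll data " * 100, key)
assert restore(nonce, ct, key) == b"payroll data " * 100
```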


Cool Features

There are a number of cool features of the Aparavi solution, including:

  • Patented point-in-time recovery – you can recover data from any combination of local and cloud storage (you don’t need the backup set to live in one place);
  • Cloud active data pruning – will automatically remove files, and portions of files, that are no longer needed from cloud locations;
  • Multi-cloud agile retention (this is my favourite) – you can use multiple cloud locations without the need to move data from one to the other;
  • Open data format – open source published, with Aparavi providing a reader so data can be read by any tool; and
  • Multi-tier, multi-tenancy – Aparavi are very focused on delivering a multi-tier and multi-tenant environment for service providers and folks who like to scale.


Retention Simplified

  • Policy Engine – uses file exclusion and inclusion lists
  • Comprehensive Search – search by user name and appliance name as well as file name
  • Storage Analytics – how much you’re saving by pruning, data growth / shrinkage over time, % change monitor
  • Auditing and Reporting Tools
  • RESTful API – anything in the UI can be automated


What Does It Run On?

Aparavi runs on all Microsoft-supported Windows platforms as well as most major Linux distributions (including Ubuntu and Red Hat). They use the Amazon S3 API, support GCP, and are working on OpenStack and Azure. They’ve also got some good working relationships with Cloudian and Scality, amongst others.

[image courtesy of Aparavi]
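Because the data path speaks the Amazon S3 API, any S3-compatible target (Cloudian and Scality included) can in principle be addressed the same way. As a rough, generic illustration of that (not Aparavi’s code; the endpoint, credentials, and bucket below are placeholders), this is what pointing a standard S3 client at a compatible object store looks like with boto3:

```python
# Generic example of using the S3 API against an S3-compatible object store.
# Endpoint, credentials, and bucket are placeholders - not real values.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://objectstore.example.internal",   # Cloudian / Scality / etc.
    aws_access_key_id="PLACEHOLDER_KEY",
    aws_secret_access_key="PLACEHOLDER_SECRET",
)

# Write a protected object and list it back - the same calls work against AWS S3 itself.
s3.put_object(Bucket="backup-target", Key="archive/2017-10/file.bin", Body=b"example payload")
for obj in s3.list_objects_v2(Bucket="backup-target").get("Contents", []):
    print(obj["Key"], obj["Size"])
```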


Availability?

Aparavi are having a “soft launch” on October 25th. The product is licensed based on the amount of source data protected. From a pricing perspective, the first TB is always free. Expect to pay US $999/year for 3TB.


Conclusion

Aparavi are looking to focus on the mid-market to begin with, and stressed to me that it isn’t really intended to replace your day-to-day business continuity tooling. That said, they recognise that customers may end up using the tool in ways that they hadn’t anticipated.

Aparavi’s founding team of Adrian Knapp, Rod Christensen, Jonathan Calmes and Jay Hill have a whole lot of experience with data protection engines and a bunch of use cases. Speaking to Jonathan, it feels like they’ve certainly thought about a lot of the issues facing folks leveraging cloud for data protection. I like the open approach to storing the data, and the multi-cloud friendliness takes the story well beyond the hybrid slideware I’m accustomed to seeing from some companies.

Cloud has opened up a lot of possibilities for companies that were traditionally constrained by their own ability to deliver functional, scalable and efficient infrastructure internally. It’s since come to people’s attention that, much like the days of internal-only deployments, a whole lot of people who should know better still don’t understand what they’re doing with data protection, and there’s crap scattered everywhere. Products like Aparavi are a positive step towards taking control of data protection in fluid environments, potentially helping companies to get it together in an effective manner. I’m looking forward to diving further into the solution, and am interested to see how the industry reacts to Aparavi over the coming months.