Zerto Announces 8.5 and Zerto Data Protection

Zerto recently announced version 8.5 of its product, along with a new offering, Zerto Data Protection (ZDP). I had the good fortune to catch up with Caroline Seymour (VP, Product Marketing) about the news and thought I’d share some thoughts here.

 

ZDP, Yeah You Know Me

Global Pandemic for $200 Please, Alex

In “these uncertain times”, organisations are facing new challenges:

  • No downtime, no data loss, 24/7 availability
  • Influx of remote work
  • Data growth and sprawl
  • Security threats
  • Acceleration of cloud

Many of these things were already a problem, and the global pandemic has done a great job highlighting them.

“Legacy Architecture”

Zerto paints a bleak picture of the “legacy architecture” adopted by many of the traditional data protection solutions, positing that many IT shops need to use a variety of tools to get to a point where operations staff can sleep better at night. Disaster recovery, for example, is frequently handled via replication for mission-critical applications, with backup being performed via periodic snapshots for all other applications. ZDP aims to bring all this together under one banner of continuous data protection, delivering:

  • Local continuous backup and long-term retention (LTR) to public cloud; and
  • Pricing optimised for backup.

[image courtesy of Zerto]

Features

[image courtesy of Zerto]

So what do you get with ZDP? Some neat features, including:

  • Continuous backup with journal
  • Instant restore from local journal (see the sketch after this list)
  • Application consistent recovery
  • Short-term SLA policy settings
  • Intelligent index and search
  • LTR to disk, object or cloud (Azure, AWS)
  • LTR policies: daily incrementals with weekly, monthly or yearly fulls
  • Data protection workflows
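
To make the journal idea a little more concrete, here’s a minimal sketch of why journal-based recovery gives you much finer recovery points than periodic snapshots. This is purely illustrative Python on my part, not Zerto’s implementation, and all the names are invented.

```python
from dataclasses import dataclass, field

@dataclass
class Journal:
    """Toy write journal: every write is kept with its timestamp, so any
    point in time within the journal window is recoverable."""
    entries: list = field(default_factory=list)  # (timestamp, block, data), in arrival order

    def record(self, ts: float, block: int, data: bytes) -> None:
        self.entries.append((ts, block, data))

    def restore_to(self, ts: float) -> dict:
        """Replay every write up to ts. Contrast with periodic snapshots,
        where everything since the last snapshot is simply lost."""
        image = {}
        for entry_ts, block, data in self.entries:
            if entry_ts > ts:
                break
            image[block] = data
        return image

journal = Journal()
journal.record(100.0, 7, b"good data")
journal.record(160.5, 7, b"encrypted by ransomware")
print(journal.restore_to(160.0))  # {7: b'good data'} - seconds before the bad write
```

With a periodic snapshot model, the best you could do here is roll back to the last snapshot, which might be hours old; with a journal, the recovery point is effectively arbitrary within the retention window.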

 

New Licensing

It wouldn’t be a new software product without some mention of new licensing. If you want to use ZDP, you get:

  • Backup for short-term retention and LTR;
  • On-premises or backup to cloud;
  • Analytics; and
  • Orchestration and automation for backup functions.

If you’re sticking with (the existing) Zerto Cloud Edition, you get:

  • Everything in ZDP;
  • Disaster Recovery for on-premises and cloud;
  • Multi-cloud support; and
  • Orchestration and automation.

 

Zerto 8.5

A big focus for Zerto recently has been support for VMware on public cloud, including the various flavours of VMware on Azure, AWS, and Oracle Cloud. There are a bunch of reasons why this approach has proven popular with existing VMware customers looking to migrate from on-premises to public cloud, including:

  • Native VMware support – run existing VMware workloads natively on IaaS;
  • Policies and configuration don’t need to change;
  • Minimal changes – no need to refactor applications; and
  • IaaS benefits – reliability, scale, and operational model.

[image courtesy of Zerto]

New in 8.5

With 8.5, you can now back up directly to Microsoft Azure and AWS. You also get instant file and folder restores to production. There’s now disaster recovery and data protection support for VMware on public cloud, covering Microsoft Azure VMware Solution, Google Cloud VMware Engine, and Oracle Cloud VMware Solution. You also get platform automation and lifecycle management features, including:

  • Auto-evacuate for recovery hosts;
  • Auto-populate for recovery hosts; and
  • Encryption capabilities.

And finally, a Zerto PowerShell Cmdlets Module has been released.

 

Thoughts and Further Reading

The writing’s been on the wall for some time that Zerto might need to expand its solution offering to incorporate backup and recovery. Continuous data protection is a great feature and my experience with Zerto has been that it does what it says on the tin. The market, however, is looking for ways to consolidate solution offerings in order to save a few more dollarydoos and keep the finance department happy. I haven’t seen the street pricing for ZDP, but Seymour seemed confident that it stacks up well against the more traditional data protection options on the market, particularly when compared against offerings that handle CDP and periodic data protection with separate tools. There’s a new TCO calculator on the Zerto website, and there’s also the opportunity to talk to a Zerto account representative about your particular needs.

I’ve always treated regular backup and recovery and disaster recovery as very different things, mainly because they are. Companies frequently make the mistake of trying to cobble together some kind of DR solution using traditional backup and recovery tools. I’m interested to see how Zerto goes with this approach. It’s not the first company to converge elements of the data protection space, and it will be interesting to see how much of the initial uptake of ZDP is with existing customers or net new logos. The broadening of support for the VMware on X public cloud workloads is good news for enterprises too (putting aside my thoughts on whether or not that’s a great long term strategy for said enterprises). There’s some interesting stuff happening, and I’m looking forward to seeing how the story unfolds over the next 6 – 12 months.

Quobyte Announces 3.0

Quobyte recently announced Release 3.0 of its software. I had the opportunity to speak to Björn Kolbeck (Co-Founder and CEO) about the release, and thought I’d share some thoughts here.

 

About Quobyte

If you haven’t heard of Quobyte before, it was founded in 2013 by some ex-Googlers and HPC experts. The folks at Quobyte were heavily influenced by Google’s scale-out software model and wanted to bring that to the enterprise. Quobyte has had software in production since 2016 and has customers across a range of industry verticals, including financial services and media streaming. It’s not really object storage, more a parallel file system or, at a stretch, scale-out NAS.

 

The Tech

Kolbeck describes Quobyte as “storage for Generation Scale-Out”, focussed on “getting storage out of the ugly corner of specialised appliances”.

Unlimited Performance

  • Linear scaling delivers unlimited performance
  • No bottlenecks – scale from small to 1000s of servers
  • No more NFS – it’s part of the problem

Deploy Anywhere

  • True software storage runs anywhere – bare metal, containers, cloud
  • Almost any x86 server – no appliances

Unconditional Simplicity

  • Anyone can do storage – it’s just another Linux application
  • All in user space, installs in minutes

 

The Announcement

Free Edition

The first part of the announcement is that there’s a free edition (previously there was a 45-day trial on offer). It’s limited in terms of capacity, support, and file system clients, but could be useful in labs and smaller environments.

[image courtesy of Quobyte]

3.0 Release

The 3.0 release is also a big part of Quobyte’s news, with the new version delivering a bunch of new features, most of which are outlined below.

360 Security

  • Holistic data protection
  • End-to-end AES encryption (in transit / at rest / untrusted storage nodes)
  • Selective TLS support
  • Access keys for the file system
  • X.509 certificates
  • Event stream (metadata, file access)

Policy Engine

Powerful Policy Engine

  • For: Tenant, volume, file, client
  • Control: Layout, tiering, QoS, recoding, caching
  • Dynamic: Runtime re-configurable

Automated

  • Auto file layout: replication + EC and Flash + HDD
  • Auto selection of replication factor, EC schema (see the sketch below)
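
To get a feel for what “auto selection of replication factor, EC schema” might mean in practice, here’s a toy policy function. It’s my own illustration rather than Quobyte’s engine or API, and the thresholds are arbitrary: small or hot files get replicated (cheap to mirror, quick to read), while large, cold files get erasure coding for capacity efficiency.

```python
def place_file(size_bytes: int, is_hot: bool) -> dict:
    """Toy placement policy: replication for small/hot data, erasure
    coding for large/cold data."""
    MiB = 1024 * 1024
    if size_bytes < 4 * MiB or is_hot:
        # Three copies: 3x capacity overhead, lowest read latency
        return {"layout": "replication", "copies": 3, "media": "flash"}
    # 8+3 erasure coding: ~1.4x overhead, survives three device failures
    return {"layout": "erasure_coding", "data": 8, "parity": 3, "media": "hdd"}

print(place_file(512 * 1024, is_hot=True))     # replicated on flash
print(place_file(10 * 1024**3, is_hot=False))  # 8+3 EC on HDD
```

The “dynamic” part is the interesting bit: if a policy like this is runtime re-configurable, data can be recoded from one layout to another as it cools down, without the application noticing.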

Self-Service

Quobyte is looking to deliver a “cloud-like experience” with its self-service capabilities.

Login for users

  • Manage access keys
  • Check resource consumption

Authenticate using access keys

  • S3
  • File system driver
  • K8s / CSI
  • User-space drivers: HDFS, TF, MPI-IO

Multi-Cluster

Data Mover

  • Bi-directional sync (eventual consistency)
  • Policy-based data tiering between clusters
  • Recoding

TLS between clusters

More Native Drivers

HDFS

MPI-IO

Benefits of kernel bypass

  • Lower latency
  • Less memory bandwidth

 

Thoughts and Further Reading

One of the challenges with software-defined storage is invariably the constraint that poor hardware choices can put on performance. Kolbeck acknowledged that Quobyte is “as fast as your hardware”. I asked him whether Quobyte provided guidance on hardware choices that worked well with the platform. There is a bunch of recommended (and tested) hardware listed on this page. He did mention that whichever way you decided to go, it was recommended to stick with either Mellanox or Broadcom NICs due to issues observed with other vendors’ Linux drivers. There are also recommendations on the site for public cloud instance sizing covering AWS, GCP, and Oracle.

Quobyte is being deployed to support scale-out workloads in the enterprise across a number of sectors including financial services, life sciences, media and entertainment, and manufacturing in Europe and Asia. Kolbeck noted that one of the interesting things about the advent of smart everything is that “car manufacturers are suddenly in the machine learning field” and looking for new ways to support their businesses.

There are a lot of reasons to like software-defined storage offerings. You can generally run them on anything, and performance enhancements can frequently be had via code upgrades. That’s not to say that you don’t get that with the big box slingers, but the flexibility of hardware choice has tremendous appeal, particularly in the enterprise market where it can feel like the margin on commodity hardware can be exorbitant. Quobyte hasn’t been around forever, but the folks over there seem to have a pretty solid heritage in software-defined and scale-out storage solutions – a good sign if you’re in the market for a software-defined, scale-out storage solution. Some folks are going to rue the lack of NFS support, but I’m sure Kolbeck and the team would be happy to sit down and discuss with them why that’s no great loss. There’s some pretty cool stuff in this release, and the free edition is definitely worth taking for a spin. I’m looking forward to hearing more from Quobyte over the next little while.

StorONE Q3-2020 Update

StorONE recently announced details of its Q3-2020 software release. I had the opportunity to talk about the announcement with George Crump and thought I’d share some brief thoughts here.

 

Release Highlights

Performance Improvements

One of the key highlights of this release is significant performance improvements for the platform based purely on code optimisations. Crump tells me that customers with Intel Optane and NVMe SSDs will be extremely happy with what they see. What’s also notable is that customers using high-latency media such as hard disk drives will still see a performance improvement of 15 – 20%.

Data Protection

StorONE has worked hard on introducing some improved resilience for the platform as well, with two key features being made available:

  • vRack; and
  • vReplicate.

vRack provides the ability to split S1 storage across more than one rack (or row, for that matter) to mitigate any failures impacting the rack hosting the controllers and disk enclosures. You can now also set tolerance for faults at an enclosure level, not just a drive level.

[image courtesy of StorONE]

vReplicate extends S1:Replicate’s capabilities to provide cascading replication. You can now synchronously replicate between data centres or campus sites and then asynchronously send that data to another site, hundreds of kilometres away if necessary. Primary systems can be an All-Flash Array.next, traditional All-Flash Array, or a Hybrid Array, and the replication target can be an inexpensive hard disk only S1 system.

[image courtesy of StorONE]
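
The mechanics of a cascade are easy to sketch, and this is just my sketch rather than anything to do with StorONE’s implementation: the synchronous leg sits in the write path (zero data loss between the paired sites), while the asynchronous leg is drained in the background, so the distant site lags by whatever is queued.

```python
import queue
import threading

sync_partner = []            # campus/metro site: in the write path
async_queue = queue.Queue()  # feed for the distant DR site
dr_site = []

def write(block: bytes) -> None:
    """Acknowledge the host only after the synchronous replica has the
    data (RPO of zero locally); the remote copy lags (RPO > 0)."""
    sync_partner.append(block)  # would block until the partner confirms
    async_queue.put(block)      # queued for the site hundreds of km away

def drain_to_dr() -> None:
    while True:
        dr_site.append(async_queue.get())
        async_queue.task_done()

threading.Thread(target=drain_to_dr, daemon=True).start()
write(b"block-42")
async_queue.join()  # the depth of this queue is your async RPO
print(len(sync_partner), len(dr_site))  # 1 1
```

The appeal of the cascading model is that the expensive synchronous pair doesn’t have to carry the long-haul replication burden as well; the second hop can land on a cheap HDD-only target.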

There’s now full support for Volume Shadow Copy Service (VSS) for S1:Snap users.

 

Other Enhancements

Some of the other enhancements included with this release are:

  • Improved support for NVMe-oF (including the ability to simultaneously support iSCSI and FC along with NVMe);
  • Improved NAS capability, with support for quotas and NIS / LDAP; and
  • Downloadable stats for increased insights.

 

Thoughts

Some of these features might seem like incremental improvements, but this is an incremental release. I like the idea of supporting legacy connections while supporting the ability to add newer tech to the platform, and providing a way forward in terms of hardware migration. The vRack resiliency concept is also great, and a salient reminder that the ability to run this code on commodity hardware makes some of these types of features a little more accessible. I also like the idea of being able to download analytics data and do things with it to gain greater insights into what the system is doing. Sure, it’s an incremental improvement, but an important one nonetheless.

I’ve been a fan of the StorONE story for some time now (and not just because the team slings a few dollars my way to support the site every now and then). I think the key to much of StorONE’s success has been that it hasn’t gotten caught up trying to be a storage appliance vendor, and has instead focussed on delivering reliable code on commodity systems that results in a performance-oriented storage platform that continues to improve from a software perspective without being tied to a particular hardware platform. The good news, though, is that when new hardware becomes available (such as Optane), it’s not a massive problem to incorporate it into the solution.

StorONE has always talked a big game in terms of raw performance numbers, but I think it’s the addition of features such as vRack and improvements to the replication capability that really makes it a solution worth investigating. It doesn’t hurt that you can check the pricing calculator out for yourself before you decide to go down the path of talking to StorONE’s sales team. I’m looking forward to seeing what StorONE has in store in the next little while, as I get the impression it’s going to be pretty cool. You can read details of the update here.

Pure Storage Acquires Portworx

Pure Storage announced its intention to acquire Portworx in mid-September. Around that time I had the opportunity to talk about the news with Goutham Rao (Portworx CTO) and Matt Kixmoeller (Pure Storage VP, Strategy) and thought I’d share some brief thoughts here.

 

The News

Pure and Portworx have entered an agreement that will see Pure pay approximately $370M US in cash. Portworx will form a new Cloud Native Business Unit inside Pure to be led by Portworx CEO Murli Thirumale. All Portworx founders are joining Pure, with Pure investing significantly to grow the new business unit. According to Pure, “Portworx software to continue as-is, supporting deployments in any cloud and on-premises, and on any bare metal, VM, or array-based storage”. It was also noted that “Portworx solutions to be integrated with Pure yet maintain a commitment to an open ecosystem”.

About Portworx

Described as the “leading Kubernetes data services platform”, Portworx was founded in 2014 in Los Altos, CA. It runs a 100% software, subscription, and cloud business model with development and support sites in California, India, and Eastern Europe. The product has been GA since 2017, and is used by some of the largest enterprise and Cloud / SaaS companies globally.

 

What’s A Portworx?

The idea behind Portworx is that it gives you data services for any application, on any Kubernetes distribution, running on any cloud, any infrastructure, and at any stage of the application lifecycle. To that end, it’s broken up into a bunch of different components, and runs in the K8s control plane adjacent to the applications.

PX-Store

  • Software-defined storage layer that automates container storage for developers and admins
  • Consistent storage APIs: cloud, bare metal, or arrays

PX-Migrate

  • Easily move applications between clusters
  • Enables hybrid cloud and multi-cloud mobility

PX-Backup

  • Application-consistent backup for cloud native apps with all k8s artefacts and state
  • Backup to any cloud or on-premises object storage

PX-Secure

  • Implement consistent encryption and security policies across clouds
  • Enable multi-tenancy with access controls

PX-DR

  • Sync and async replication between Availability Zones and regions
  • Zero RPO active / active for high resiliency

PX-Autopilot

  • GitOps-driven automation makes it easier for non-storage experts to deploy stateful applications; it monitors everything about an application and reacts to prevent problems before they happen (a conceptual sketch follows this list)
  • Auto-scale storage as your app grows to reduce costs
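
For a rough feel of what that monitor-and-react loop does, here’s a conceptual sketch. To be clear, this is not Portworx’s actual API (the real mechanism is declarative, driven by rule objects in Kubernetes); the names and thresholds are invented for illustration.

```python
def autopilot_tick(volumes, usage_threshold=0.7, growth_factor=1.5, max_gib=1024):
    """One pass of a toy autoscaler: grow any volume that has crossed the
    usage threshold, capped at max_gib. A real implementation would be
    declarative and GitOps-driven rather than an imperative loop."""
    actions = []
    for vol in volumes:
        usage = vol["used_gib"] / vol["size_gib"]
        if usage >= usage_threshold and vol["size_gib"] < max_gib:
            new_size = min(int(vol["size_gib"] * growth_factor), max_gib)
            actions.append((vol["name"], vol["size_gib"], new_size))
            vol["size_gib"] = new_size  # expand before the app hits ENOSPC
    return actions

vols = [{"name": "pg-data", "size_gib": 100, "used_gib": 75}]
print(autopilot_tick(vols))  # [('pg-data', 100, 150)]
```

The point of doing this in the platform is that storage grows just ahead of demand, so you can provision small and avoid paying for capacity you might never use.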

 

How It Fits Together

When you bring Portworx into the fold, you start to see how well it fits with the existing Pure Storage portfolio. In the picture below you’ll also see support for the standard Container Storage Interface (CSI) for working with other vendors.

[image courtesy of Pure Storage]

Also worth noting is that PX-Essentials remains free forever for workloads under 5TB and 5 nodes.

 

Thoughts and Further Reading

I think this is a great move by Pure, mainly because it lends them a whole lot more credibility with the DevOps folks. Pure was starting to make inroads with Pure Storage Orchestrator, and I think this move will strengthen that story. Giving Portworx access to Pure’s salesforce globally is also going to broaden its visibility in the market and open up doors to markets that may have been difficult to get into previously.

Persistent storage for containers is heating up. As Rao pointed out in our discussion, “as container adoption grows, storage becomes a problem”. Portworx already had a good story to tell in this space, and Pure is no slouch when it comes to delivering advanced storage capabilities across a variety of platforms. I like that the messaging has been firmly based in maintaining the openness of the platform and I’m interested to see what other integrations happen as the two companies start working more closely together. If you’d like another perspective on the news, check out Chris Evans’s article here.

Rancher Labs Announces 2.5

Rancher Labs recently announced version 2.5 of its platform. I had the opportunity to catch up with co-founder and CEO Sheng Liang about the release and other things that Rancher has been up to and thought I’d share some of my notes here.

 

Introducing Rancher Labs 2.5

Liang described Rancher as a way for organisations to “[f]ocus on enriching their own apps, rather than trying to be a day 1, day 2 K8s outfit”. With that thinking in mind, the new features in 2.5 are as follows:

  1. Rancher now installs everywhere – on EKS, OpenShift, whatever – and they’ve removed a bunch of dependencies. Rancher 2.5 can now be installed on any CNCF-certified Kubernetes cluster, eliminating the need to set up a separate Kubernetes cluster before installing Rancher. The new lightweight installation experience is useful for users who already have access to a cloud-managed Kubernetes service like EKS.
  2. Enhanced management for EKS. Rancher Labs was a launch partner for EKS and used to treat it like a dumb distribution. The management architecture has been revamped with improved lifecycle management for EKS. It now uses the native EKS way of doing various things and only adds value where it’s not already present.
  3. Managing edge clusters. Liang described K3s as “almost the goto distribution for edge computing (5G, IoT, ATMs, etc)”. When you get into some of these scenarios, the scale of operations becomes pretty big, and you need to re-think multi-cluster management. To accommodate that scale, Rancher has created its own GitOps framework, delivering “GitOps at scale”.
  4. K8s has plenty of traction in government and high security environments, hence the development of RKE Government Edition.

 

Other Notes

Liang mentioned that uptake of Longhorn (made generally available in May 2020) has been great, with over 10,000 active deployments (not just downloads) in the wild now. He noted that persistent storage with K8s has been hard to do, and Longhorn has gone some way to improving that experience. K3s is now a CNCF Sandbox project, not just a Rancher project, and this has certainly helped with its popularity as well. He also mentioned that the acquisition by SUSE was continuing to progress, and he expected it to close in Q4 2020.

 

Thoughts and Further Reading

Longtime readers of this blog will know that my background is fairly well entrenched in infrastructure as opposed to cloud-native technologies. Liang understands this, and always does a pretty good job of translating some of the concepts he talks about with me back into infrastructure terms. The world continues to change, though, and the popularity of Kubernetes and solutions like Rancher Labs highlights that it’s no longer a simple conversation about LUNs, CPUs, network throughput and which server I’ll use to host my application. Organisations are looking for effective ways to get the most out of their technology investment, and Kubernetes can provide an extremely effective way of deploying and running containerised applications in an agile and efficient fashion. That said, the bar for entry into the cloud-native world can still be considered pretty high, particularly when you need to do things at large scale. This is where I think platforms like the one from Rancher Labs make so much sense. I may have described some elements of cloud-native architecture as a bin fire previously, but I think the progress that Rancher is making demonstrates just how far we’ve come. I know that VMware and Kubernetes have little in common, but it strikes me that we’re seeing the same development progress that we saw 15 years ago with VMware (and ESX in particular). I remember at the time that VMware seemed like a whole bunch of weird to many infrastructure folks, and it wasn’t until much later that these same people were happily using VMware in every part of the data centre. I suspect that the adoption of Kubernetes (and useful management frameworks for it) will be a bit quicker than that, but it’s going to be heavily reliant on solutions like this to broaden the appeal of what’s a very useful (but nonetheless challenging) container deployment and management ecosystem.

If you’re in the APAC region, Rancher is hosting a webinar in a friendly timezone later this month. You can get more details on that here. And if you’re on US Eastern time, there’s the “Computing on the Edge with Kubernetes” one day event that’s worth checking out.

Random Short Take #44

Welcome to Random Short Take #44. A few players have worn 44 in the NBA, including Danny Ainge and Pistol Pete, but my favourite from this list is Keith Van Horn.  A nice shooting touch and strong long sock game. Let’s get random.

  • ATMs are just computers built to give you money. And it’s scary to think of the platforms that are used to deliver that functionality. El Reg pointed out a recent problem with one spotted in the wild in Ngunnawal.
  • Speaking of computing at the edge, I found this piece from Keith interesting. As much as things change they stay the same. I think he’s spot on when he says “[m]anufacturers and technology companies must come together with modular solutions that enable software upgrades for these assets’ lives”. We need to be demanding more from the industry when it comes to some of this stuff.
  • Heard about Project Monterey at VMworld and wanted to know more? Pensando has you covered.
  • I enjoyed this article from Preston about the difference between bunkers and vaults – worth checking out even if you’re not a Dell EMC customer.
  • Cloud – it can be tough to know which way to go. And a whole bunch of people have an interest in you using their particular solution. This article from Chris Evans was particularly insightful.
  • DH2i has launched DxOdyssey for IoT – you can read more about that here.
  • Speaking of news, Retrospect recently announced Backup 17.5 too. There are some cloud improvements, and support for macOS Big Sur beta.
  • It’s the 30th anniversary of Vanilla Ice’s “Ice Ice Baby”, and like me you were probably looking for a comprehensive retrospective on Vanilla Ice’s career. Look no further than this article over at The Ringer.

Storage Field Day 20 – Wrap-up and Link-o-rama

Disclaimer: I recently attended Storage Field Day 20.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

This is a quick post to say thanks once again to Stephen and Ben, and the presenters at Storage Field Day 20. I had a super fun and educational time. For easy reference, here’s a list of the posts I did covering the events (they may not match the order of the presentations).

Storage Field Day 20 – I’ll Be At Storage Field Day 20

Storage Field Day 20 – (Fairly) Full Disclosure

Cisco MDS, NVMe, and Flexibility

Qumulo – Storage Your Way

Pure Storage Announces Second Generation FlashArray//C with QLC

Nebulon – It’s Server Storage Jim, But Not As We Know It

VAST Data – The Best Is Yet To Come

Intel Optane And The DAOS Storage Engine

 

Also, here are a number of links to posts by my fellow delegates (in no particular order). They’re all very smart people, and you should check out their stuff, particularly if you haven’t before. I’ll attempt to keep this updated as more posts are published. But if it gets stale, the Storage Field Day 20 landing page will have updated links.

 

Jason Benedicic (@JABenedicic)

Nebulon Shadow Storage

 

David Chapa (@DavidChapa)

“High Optane” Fuel For Performance

 

Becky Elliott (@BeckyLElliott)

Guess Who’s Attending Storage Field Day 20?

 

Ray Lucchesi (@RayLucchesi)

Storage that provides 100% performance at 99% full

106: Greybeards talk Intel’s new HPC file system with Kelsey Prantis, Senior Software Eng. Manager, Intel

 

Vuong Pham (@Digital_KungFu)

Storage Field Day 20.. oh yeah!!

 

Keiran Shelden (@Keiran_Shelden)

Let’s Zoom to SFD20

 

Enrico Signoretti (@esignoretti)

An Intriguing Approach to Modern Data Center Infrastructure

Is Scale-Out File Storage the New Black?

 

Keith Townsend (@CTOAdvisor)

Will the DPU kill the Storage Array?

 

[image courtesy of Stephen Foskett]

Random Short Take #43

Welcome to Random Short Take #43. A few players have worn 43 in the NBA, including Frank Brickowski, but my favourite from this list is Red Kerr (more for his commentary chops than his game, I think).  Let’s get random.

  • Mike Wilson has published Part 2 of his VMware VCP 2020 Study Guide and it’s a ripper. Check it out here. I try to duck and weave when it comes to certification exams nowadays, but these kinds of resources are invaluable.
  • It’s been a while since I had stick time with Data Domain OS, but Preston’s article on password hardening was very useful.
  • Mr Foskett bought a cloud, of sorts. Read more about that here. Anyone who knows Stephen knows that he’s all about talking about what’s happening in the industry, but I do enjoy reading about these home projects as well.
  • Speaking of clouds, Rancher was named “A Leader” in multi-cloud container development platforms by an independent research firm. You can read the press release here.
  • Datadobi had a good story to share about what it did with UMass Memorial Health Care. You can read the story here.
  • Steve O has done way too much work understanding how to change the default theme in Veeam Enterprise Manager 10 and documenting the process so you don’t need to work it out. Read about the process here.
  • Speaking of data protection, Zerto has noticed Azure adoption increasing at quite a pace, amongst other things.
  • This was a great article on open source storage from Chin-Fah.

Intel Optane And The DAOS Storage Engine

Disclaimer: I recently attended Storage Field Day 20.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Intel recently presented at Storage Field Day 20. You can see videos of the presentation here, and download my rough notes from here.

 

Intel Optane Persistent Memory

If you’re a diskslinger, you’ve very likely heard of Intel Optane. You may have even heard of Intel Optane Persistent Memory. It’s a little different to Optane SSD, and Intel describes it as “memory technology that delivers a unique combination of affordable large capacity and support for data persistence”. It looks a lot like DRAM, but the capacity is greater, and there’s data persistence across power losses. This all sounds pretty cool, but isn’t it just another form factor for fast storage? Sort of, but the application of the engineering behind the product is where I think it starts to get really interesting.
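
To give a rough feel for the “memory, but persistent” idea, here’s a hedged sketch of app-direct style access from Python. It assumes a PMem namespace mounted with DAX at /mnt/pmem (a path I’ve made up for the example), and production code would use libpmem to flush CPU caches correctly rather than leaning on msync.

```python
import mmap
import os

# Assumes a fsdax namespace mounted DAX-enabled at /mnt/pmem
path = "/mnt/pmem/counter.bin"
fd = os.open(path, os.O_CREAT | os.O_RDWR)
os.ftruncate(fd, mmap.PAGESIZE)

buf = mmap.mmap(fd, mmap.PAGESIZE)  # loads/stores go to the media, no page cache
count = int.from_bytes(buf[0:8], "little")
buf[0:8] = (count + 1).to_bytes(8, "little")
buf.flush()  # msync here; libpmem would use cache-flush instructions instead
print(f"this counter has survived {count} restarts")

buf.close()
os.close(fd)
```

The interesting part is what’s missing: no read() / write() system calls, no serialisation, just byte-addressable storage that happens to still be there after a power cycle.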

 

Enter DAOS

Distributed Asynchronous Object Storage (DAOS) is described by Intel as “an open source software-defined scale-out object store that provides high bandwidth, low latency, and high I/O operations per second (IOPS) storage containers to HPC applications”. It’s essentially a software stack built from the ground up to take advantage of the crazy speeds you can achieve with Optane, and at scale. There’s a handy overview of the architecture available on Intel’s website. Traditional object (and other) storage systems haven’t really been built to take advantage of Optane in quite the same way DAOS has.

[image courtesy of Intel]

There are some cool features built into DAOS, including:

  • Ultra-fine grained, low-latency, and true zero-copy I/O
  • Advanced data placement to account for fault domains
  • Software-managed redundancy supporting both replication and erasure code with online rebuild
  • End-to-end (E2E) data integrity (see the sketch after this list)
  • Scalable distributed transactions with guaranteed data consistency and automated recovery
  • Dataset snapshot capability
  • Security framework to manage access control to storage pools
  • Software-defined storage management to provision, configure, modify, and monitor storage pools

Exciting? Sure is. There’s also integration with Lustre. The best thing about this is that you can grab it from Github under the Apache 2.0 license.

 

Thoughts And Further Reading

Object storage is in its relative infancy when compared to some of the storage architectures out there. It was designed to be highly scalable and generally does a good job of cheap and deep storage at “web scale”. It’s my opinion that object storage becomes even more interesting as a storage solution when you put a whole bunch of really fast storage media behind it. I’ve seen some media companies do this with great success, and there are a few of the bigger vendors out there starting to push the All-Flash object story. Even then, though, many of the more popular object storage systems aren’t necessarily optimised for products like Intel Optane PMEM. This is what makes DAOS so interesting – the ability for the storage to fundamentally do what it needs to do at massive scale, and have it go as fast as the media will let it go. You don’t need to worry as much about the storage architecture being optimised for the storage it will sit on, because the folks developing it have access to the team that developed the hardware.

The other thing I really like about this project is that it’s open source. This tells me that Intel are both focused on Optane being successful, and also focused on the industry making the most of the hardware it’s putting out there. It’s a smart move – come up with some super fast media, and then give the market as much help as possible to squeeze the most out of it.

You can grab the admin guide from here, and check out the roadmap here. Intel has plans to release a new version every 6 months, and I’m really looking forward to seeing this thing gain traction. For another perspective on DAOS and Intel Optane, check out David Chapa’s article here.

 

 

VAST Data – The Best Is Yet To Come

Disclaimer: I recently attended Storage Field Day 20.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

VAST Data recently presented at Storage Field Day 20. You can see videos of their presentation here, and download my rough notes from here.

 

Feature Progress

VAST Data has come up with a pretty cool solution, and it continues to evolve as time passes (funny how that works). You can see that a whole stack of features has been added to the platform since the 1.0 release in November 2018.

[image courtesy of VAST Data]

The Similarity

One feature that caught my eye was the numbers VAST Data presented for the similarity-based data reduction capability (introduced in 1.2). In the picture below you’ll see a lot of 3:1 and 2:1. Those don’t seem like great ratios, but the data being worked on here is pre-compressed. My experience with applying data reduction techniques to pre-compressed and / or pre-deduplicated data is that it’s usually tough to get anything decent out of it, so I think this is pretty neat.

[image courtesy of VAST Data]
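
Similarity-based reduction is a step beyond classic dedupe: rather than only collapsing identical blocks, the system finds blocks that are merely similar to an existing reference block and stores a small delta against it. Here’s a toy illustration of the general idea (my own sketch, certainly not VAST’s algorithm):

```python
import os
import zlib

def delta_size(reference: bytes, block: bytes) -> int:
    """Cheap stand-in for delta encoding: compress the block using the
    reference block as a preset dictionary. Blocks similar to the
    reference compress down to almost nothing."""
    comp = zlib.compressobj(zdict=reference)
    return len(comp.compress(block) + comp.flush())

ref = bytes(range(256)) * 16       # a 4KiB reference block
near = ref[:-8] + b"modified"      # similar, but not identical, to ref
unrelated = os.urandom(4096)       # nothing in common with ref

print(delta_size(ref, near))       # tiny: worth storing as a delta against ref
print(delta_size(ref, unrelated))  # ~4KiB: store it as a new reference block
```

This is also why getting 2:1 or 3:1 on pre-compressed data is impressive: compression has already squeezed the redundancy out of each block individually, and what’s left to harvest is the similarity between blocks.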

Snap to S3

Another cool feature (added in 3.0) is snap to cloud / S3. This is one of those features where you think, ha, I hadn’t been looking for that specifically, but it does look kind of cool.

[image courtesy of VAST Data]

Replicate snaps to object store

  • Independent schedule, retention

Presented as .remote folder

  • Self service restore (<30 days .snapshots, >30 days .remote)

Large objects

  • Data and metadata
  • Compressed

ReaderVM

  • Presents read-only .remote
  • .ovf, AMI
  • Parallel for bandwidth

 

Thoughts and Further Reading

You’ll notice I haven’t written a lot in this article. This isn’t because I don’t think VAST Data is intriguing, or that I don’t like what it can do. Rather, I think you’d be better served checking out the Storage Field Day presentations yourself (I recommend the presentations from both Storage Field Day 18 and Storage Field Day 20). You can also read my summary of the tech from Storage Field Day 18 here, but obviously things have progressed significantly since then.

As Howard Marks pointed out in his presentation, this is not the first rodeo for many of the folks involved in VAST Data’s rapid development. You can see from the number of features being added in a short time that they’re keen on making more progress and meeting the needs of the market. But it still takes time. SMB failover is cool, but some people might be more interested in seeing vSphere support sooner rather than later. I have no insight into the roadmap, but based on what I’ve seen over the last 20ish months, there’s been some great stuff forthcoming, and definitely more cool things to come. Couple that with the fact that this thing relies heavily on QLC and you’ve got a compelling platform, at a potentially super interesting price point, upon which you can do a bunch of interesting things, storage-wise. I’m looking forward to seeing what’s in store over the next 20 months.