Rancher Labs Announces Longhorn General Availability

This happened a little while ago, and the news about Rancher Labs has since shifted to SUSE’s announcement regarding its intent to acquire the company. Nonetheless, I had a chance to speak to Sheng Liang (Co-founder and CEO) about Longhorn’s general availability, and thought I’d share some thoughts here.

 

What Is It?

Described by Rancher Labs as “an enterprise-grade, cloud-native container storage solution”, Longhorn has been in development for around 6 years, in beta for a year, and is now generally available. It comprises around 40k lines of Golang code, and each volume is managed as a set of independent microservices, orchestrated by Kubernetes.

Liang described this to me as “enterprise-grade distributed block storage for K8S”, and the features certainly seem to line up with those expectations. There’s support for:

  • Thin-provisioning, snapshots, backup, and restore
  • Non-disruptive volume expansion
  • Cross-cluster disaster recovery volume with defined RTO and RPO
  • Live upgrade of Longhorn software without impacting running volumes
  • Full-featured Kubernetes CLI integration and standalone UI
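
To put that in concrete terms, consuming Longhorn from a workload’s perspective is just a PersistentVolumeClaim against the Longhorn StorageClass. Here’s a minimal sketch using the official Kubernetes Python client, assuming a cluster with Longhorn installed and its default “longhorn” StorageClass in place (the claim name and size are illustrative):

    # A minimal sketch, assuming Longhorn is installed and registers its
    # default "longhorn" StorageClass (names and sizes are illustrative).
    from kubernetes import client, config

    config.load_kube_config()  # or load_incluster_config() inside a pod

    pvc = client.V1PersistentVolumeClaim(
        metadata=client.V1ObjectMeta(name="demo-data"),
        spec=client.V1PersistentVolumeClaimSpec(
            access_modes=["ReadWriteOnce"],
            storage_class_name="longhorn",  # replicated block volume from Longhorn
            resources=client.V1ResourceRequirements(requests={"storage": "2Gi"}),
        ),
    )
    client.CoreV1Api().create_namespaced_persistent_volume_claim(
        namespace="default", body=pvc
    )

From there, pods mount the claim like any other volume, and the per-volume microservices mentioned above do their replication work behind the scenes.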

From a licensing perspective, Longhorn is free to download and use, and customers looking for support can purchase a premium support model with the same SLAs provided through Rancher Support Services. There are no licensing fees, and node-based subscription pricing keeps costs to a minimum.

Use Cases

Why would you use it?

  • Bare metal workloads
  • Edge persistent storage
  • Geo-replicated storage for Amazon EKS
  • Application backup and disaster recovery

 

Thoughts

One of the barriers to entry when moving from traditional infrastructure to cloud-native is that concepts seem slightly different to the comfortable slippers you may have been used to in enterprise infrastructure land. The neat thing about Longhorn is that it leverages a lot of the same concepts you’ll see in traditional storage deployments to deliver resilient and scalable persistent storage for Kubernetes.

This doesn’t mean that Rancher Labs is trying to compete with traditional storage vendors like Pure Storage and NetApp when it comes to delivering persistent storage for cloud workloads. Liang acknowledges that these shops can offer more storage features than Longhorn can. Nonetheless, there seems to be a requirement for this kind of accessible and robust solution. Plus, it’s 100% open source.

Rancher Labs already has a good story to tell when it comes to making Kubernetes management a whole lot simpler. The addition of Longhorn simply improves that story further. If you’re feeling curious about Longhorn and would like to know more, this website has a lot of useful information.

Komprise Announces Cloud Capability

Komprise recently made some announcements around extending its product to cloud. I had the opportunity to speak to Krishna Subramanian (President and COO) about the news and I thought I’d share some of my thoughts here.

 

The Announcement

Komprise has traditionally focused on unstructured data stored on-premises. It has now extended the capabilities of Komprise Intelligent Data Management to include cloud data. There’s currently support for Amazon S3 and Wasabi, with Google Cloud, Microsoft Azure, and IBM support coming soon.

 

Benefits

So what do you get with this capability?

Analyse data usage across cloud accounts and buckets easily

  • Single view across cloud accounts, buckets, and storage classes
  • Analyse AWS usage by various metrics accurately based on access times
  • Explore different data archival, replication, and deletion strategies with instant cost projections

Optimise AWS costs with analytics-driven archiving

  • Continuously move objects by policy across Cloud Network Attached Storage (NAS), Amazon S3, Amazon S3 Standard-IA, Amazon S3 Glacier, and Amazon S3 Glacier Deep Archive
  • Minimise costs and penalties by moving data at the right time based on access patterns

Bridge to Big Data/Artificial Intelligence (AI) projects

  • Create virtual data lakes for Big Data, AI – search for exactly what you need across cloud accounts and buckets
  • Native access to moved data on each storage class with full data fidelity

Create Cyber Resiliency with AWS

  • Copy S3 data to AWS to protect from ransomware with an air-gapped copy

[image courtesy of Komprise]

 

Why Is This Good?

The move to cloud storage hasn’t been all beer and skittles for enterprise. Storing large amounts of data in public cloud presents enterprises with a number of challenges, including:

  • Poor visibility – “Bucket sprawl”
  • Insufficient data – Cloud does not easily track last access / data use (see the sketch after this list)
  • Cost complexity – Manual data movement can lead to unexpected retrieval cost surprises
  • Labour – Manually moving data is error-prone and time-consuming
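
That second point is worth making concrete. The sketch below isn’t Komprise’s API – it’s plain boto3 against the raw AWS primitives – but it shows roughly what inventorying buckets and demoting cold objects involves, and why the lack of native last-access tracking hurts (LastModified is the best stand-in S3 gives you):

    # Not Komprise's API - a plain boto3 sketch of the underlying mechanics:
    # inventory objects across buckets and demote cold ones to a cheaper class.
    import boto3
    from datetime import datetime, timedelta, timezone

    s3 = boto3.client("s3")
    cutoff = datetime.now(timezone.utc) - timedelta(days=180)

    for bucket in [b["Name"] for b in s3.list_buckets()["Buckets"]]:
        for page in s3.get_paginator("list_objects_v2").paginate(Bucket=bucket):
            for obj in page.get("Contents", []):
                # S3 doesn't track last access, so LastModified is a coarse stand-in
                is_cold = obj["LastModified"] < cutoff
                if is_cold and obj.get("StorageClass", "STANDARD") == "STANDARD":
                    # "Moving" an object is a server-side copy onto itself with a
                    # new storage class; retrieval penalties apply on the way back
                    s3.copy_object(
                        Bucket=bucket,
                        Key=obj["Key"],
                        CopySource={"Bucket": bucket, "Key": obj["Key"]},
                        StorageClass="GLACIER",
                    )

Do that by hand across hundreds of buckets and the labour and cost-complexity points above become obvious – which is exactly the gap analytics-driven automation fills.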

Sample Use Cases

Some other reasons you might want to have Komprise manage your data include:

  • Finding ex-employee data stored in buckets.
  • Data migration – you might want to take a copy of your data from Wasabi to AWS.

There’s support for all unstructured data (file and object), so the benefits of Komprise can be enjoyed regardless of how you’re storing it. It’s also important to note that there’s no change to the existing licensing model; you’re just now able to use the product on public cloud storage.

 

Thoughts

Effective data management remains a big challenge for enterprises. It’s no secret that public cloud storage is really just storage that lives in another company’s data centre. Sure, it might be object storage rather than file-based, but it’s still a bunch of unstructured data sitting in another company’s data centre. The way you consume that data may have changed, and certainly the way you pay for it has changed, but fundamentally it’s still your unstructured data sitting on a share or a filesystem. The problems you had on-premises, though, still manifest in public cloud environments (data sprawl, capacity issues, and so on). That’s why the Komprise solution seems so compelling when it comes to managing your on-premises storage consumption, and extending that capability to cloud storage is a no-brainer.

Storing unstructured data frequently turns into a bin fire of some sort or another, and the reason is that managing it by hand doesn’t scale well. I don’t mean the storage doesn’t scale – you can store petabytes all over the place if you like. But if you’re still hand-crafting your shares and manually moving data around, you’ll notice that it becomes more and more time-consuming as time goes on (and your data storage needs grow).

One way to address this challenge is to introduce a level of automation, which is something that Komprise does quite well. If you’ve got many terabytes of data stored on-premises and in AWS buckets (or you’re looking to move some old data from on-premises to the cloud) and you’re not quite sure what it’s all for or how best to go about it, Komprise can certainly help you out.

Spectro Cloud – Profile-Based Kubernetes Management For The Enterprise

 

Spectro Cloud launched in March. I recently had the opportunity to speak to Tenry Fu (CEO) and Tina Nolte (VP, Products) about the launch, and what Spectro Cloud is, and thought I’d share some notes here.

 

The Problem?

I was going to start this article by saying that Kubernetes in the enterprise is a bin fire, but that’s too harsh (and entirely unfair on the folks who are doing it well). There is, however, a frequent compromise being made between ease of use, control, and visibility.

[image courtesy of Spectro Cloud]

According to Fu, the way that enterprises consume Kubernetes shouldn’t just be on the left or the right side of the diagram. There is a way to do both.

 

The Solution?

According to the team, Spectro Cloud is “a SaaS platform that gives Enterprises control over Kubernetes infrastructure stack integrations, consistently and at scale”. What does that mean though? Well, you get access to the “table stakes” SaaS management, including:

  • Managed Kubernetes experience;
  • Multi-cluster and environment management; and
  • Enterprise features.

Profile-Based Management

You also get some cool stuff that heavily leverages profile-based management, including infrastructure stack modelling and lifecycle management that can be done based on integration policies. In short, you build cluster profiles and then apply them to your infrastructure. The cluster profile usually describes the OS flavour and version, Kubernetes version, storage configuration, networking drivers, and so on. The Pallet orchestrator then ensures these profiles are used to maintain the desired cluster state. There are also security-hardened profiles available out of the box.
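
Spectro Cloud didn’t share its actual profile schema with me, but conceptually a cluster profile is a declarative description of every layer of the stack, which the orchestrator then reconciles clusters against. A purely illustrative sketch:

    # Illustrative only - not Spectro Cloud's actual schema. A cluster profile
    # pins every layer of the stack; the orchestrator reconciles clusters
    # against it so declared state wins over drift.
    from dataclasses import dataclass, field

    @dataclass
    class ClusterProfile:
        name: str
        os_image: str = "ubuntu-20.04"      # OS flavour and version
        kubernetes_version: str = "1.18.6"  # hypothetical pinned version
        storage_integration: str = "csi-vsphere"
        network_driver: str = "calico"
        addons: list = field(default_factory=list)

    def drift(observed: "ClusterProfile", desired: "ClusterProfile") -> list:
        """Fields the orchestrator would need to converge on the profile."""
        return [f for f in vars(desired) if vars(observed)[f] != vars(desired)[f]]

    prod = ClusterProfile(name="prod-hardened", addons=["prometheus", "falco"])

Apply the same profile to a fleet of clusters and you get consistency at scale; version the profile and you get controlled lifecycle management.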

If you’re a VMware-based cloud user, the appliance (deployed from an OVA file) sits in your on-premises VMware cloud environment and communicates with the Spectro Cloud SaaS offering over TLS, and the cloud properties are dynamically propagated.

Licensing

The solution is licensed on the number of worker node cores under management. Licensing is tiered by core count and follows a simple model: more cores and a longer commitment mean a bigger discount.
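
The actual tiers and rates weren’t disclosed, but the shape of the model is easy to sketch with openly hypothetical numbers:

    # Hypothetical numbers only - the real tiers and rates weren't disclosed.
    # The shape: per-core rate falls as core count and term length grow.
    TIERS = [(0, 10.0), (500, 8.0), (2000, 6.0)]    # (min cores, $/core/month)
    TERM_DISCOUNT = {12: 0.00, 24: 0.05, 36: 0.10}  # longer term, bigger discount

    def monthly_cost(cores: int, term_months: int) -> float:
        rate = next(r for floor, r in reversed(TIERS) if cores >= floor)
        return cores * rate * (1 - TERM_DISCOUNT[term_months])

    print(monthly_cost(800, 36))  # 800 cores, 3-year term: 800 * 8.0 * 0.9 = 5760.0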

 

The Differentiator?

Current Kubernetes deployment options vary in their complexity and maturity. You can take the DIY path, but you might find that this option is difficult to maintain at scale. There are packaged options available, such as VMware Tanzu, but you might find that multi-cluster management is not always a focus. Managed Kubernetes options (such as those offered by Google and AWS) have appeal for the enterprise crowd, but those offerings are normally quite restricted in terms of technology choices and available versions.

Why does Spectro Cloud have appeal as a solution then? Because you get control over the integrations you might want to use with your infrastructure, but also get the warm and fuzzy feeling of leveraging a managed service experience.

 

Thoughts

I’m no great fan of complexity for complexity’s sake, particularly when it comes to enterprise IT deployments. That said, there are always reasons why things get complicated in the enterprise. Requirements come from all parts of the business, legacy applications need to be fed and watered, rules and regulations seem to be in place simply to make things difficult. Enterprise application owners crave solutions like Kubernetes because there’s some hope that they, too, can deliver modern applications if only they used some modern application deployment and management constructs. Unfortunately, Kubernetes can be a real pain in the rear to get right, particularly at scale. And if enterprise has taught us anything, it’s that most enterprise shops are struggling to do the basics well, let alone the needlessly complicated stuff.

Solutions like the one from Spectro Cloud aren’t a silver bullet for enterprise organisations looking to modernise the way applications are deployed, scaled, and managed. But something like Spectro Cloud certainly has great appeal given the inherent difficulties you’re likely to experience if you’re coming at this from a standing start. Sure, if you’re a mature Kubernetes shop, chances are slim that you really need something like this. But if you’re still new to it, or are finding that the managed offerings don’t give you the flexibility you might need, then something like Spectro Cloud could be just what you’re looking for.

Backblaze B2 And A Happy Customer

Backblaze recently published a case study with AK Productions. I had the opportunity to speak to Aiden Korotkin and thought I’d share some of my notes here.

 

The Problem

Korotkin’s problem was a fairly common one – he had lots of data from previous projects that had built up over the years. He’d been using a bunch of external drives to store this data, and had had a couple of them fail, including the backup drives. Google’s cloud storage option “seemed like a more redundant and safer investment financially to go into the cloud space”. He was already using G Suite, and so he migrated his old projects off hard drives and into the cloud. He had a credit with Google to use its cloud platform for a year; after that it became pretty expensive – not really feasible. Korotkin also found that calculating the expected costs was difficult, and felt that he needed something more private and secure.

 

The Solution

So how did he come by Backblaze? He did a bunch of research. Backblaze B2 consistently showed up in the top 15 results when online magazines published their guides to cloud storage. He’d heard of it before, and had possibly seen a demo. The technology seemed very streamlined – exactly what he needed for his business. A bonus was that there were no extra steps required to back up his QNAP NAS as well. This seemed like the best option.

Current Workflow

I asked Korotkin to walk me through his current workflow. B2 is being used as a backup target for the moment. Physics being what it is, it’s still “[h]ard to do video editing direct on the cloud”. The QNAP NAS houses current projects, with data mirrored to B2. Archives are uploaded to a different area of B2. Over time, data is completely archived to the cloud.

How About Ingest?

Korotkin needed to move 12TB from Google to Backblaze. He used Flexify.IO to transfer from one cloud to the next. They walked him through how to do it. The good news is that they were able to do it in 12 hours.
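
It’s worth pausing on those numbers: 12TB in 12 hours implies a sustained rate north of 2Gbps, which is exactly why a cloud-to-cloud transfer service beats dragging the data down through a small business’s office link and pushing it back up again.

    # Back-of-envelope: what does 12TB in 12 hours imply?
    bytes_moved = 12 * 10**12
    seconds = 12 * 3600
    print(f"{bytes_moved / seconds / 10**6:.0f} MB/s")      # ~278 MB/s
    print(f"{bytes_moved * 8 / seconds / 10**9:.2f} Gbps")  # ~2.22 Gbps sustained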

It’s About Support

Korotkin noted that between Backblaze and Flexify.IO “the tech support experience was incredible”. He said that he “[f]elt like I was very much taken care of”. He got the strong impression that the support staff enjoyed helping him, and were with him every step of the way. The most frustrating part of the migration, according to Korotkin, was dealing with Google generally. Offloading the data from Google cost more money than he’s paid to date with Backblaze. “As a small business owner I don’t have $1500 just to throw away”.

 

Thoughts

I’ve been a fan of Backblaze for some time. I’m a happy customer when it comes to the consumer backup product, and I’ve always enjoyed the transparency it’s displayed as a company with regards to its pod designs and the process required to get to where it is today. I remain fascinated by the workflows required to do multimedia content creation successfully, and I think this story is a great tribute to the support culture of Backblaze. It’s nice to see that smaller shops, such as Korotkin’s, are afforded the same kind of care and support experience that the bigger customers might receive. This is a noticeable point of distinction when compared to working with the hyperscalers. It’s not that those folks aren’t happy to help; they’re just operating at a different level.

Korotkin’s approach was not unreasonable, or unusual, particularly for content creators. Keeping data safe is a challenge for small business, and solutions that make storing and protecting data easier are going to be popular. Korotkin’s story is a good one, and I’m always happy to hear these kinds of stories. If you find yourself shuffling external drives, or need a lot of capacity but don’t want to invest too heavily in on-premises storage, Backblaze has a good story in terms of both cloud storage and data protection.

Random Short Take #34

Welcome to Random Short Take #34. Some really good players have worn 34 in the NBA, including Ray Allen and Sir Charles. This one, though, goes out to my favourite enforcer, Charles Oakley. If it feels like it’s only been a week since the last post, that’s because it has.

  • I spoke to the folks at Rancher Labs a little while ago, and they’re doing some stuff around what they call “Edge Scalability” and have also announced Series D funding.
  • April Fool’s is always a bit of a trying time, what with a lot of the world being a few timezones removed from where I live. Invariably I stop checking news sites for a few days to be sure. Backblaze recognised that these are strange times, and decided to have some fun with their releases, rather than trying to fool people outright. I found the post on Catblaze Cloud Backup inspiring.
  • Hal Yaman announced the availability of version 2.6 of his Office 365 Backup sizing tool. Speaking of Veeam and handy utilities, the Veeam Extract utility is now available as a standalone tool. Cade talks about that here.
  • VMware vSphere 7 recently went GA. Here’s a handy article covering what it means for VMware cloud providers.
  • Speaking of VMware things, John Nicholson wrote a great article on SMB and vSAN (I can’t bring myself to write CIFS, even when I know why it’s being referred to that way).
  • Scale is infinite, until it isn’t. Azure had some minor issues recently, and Keith Townsend shared some thoughts on the situation.
  • StorMagic recently announced that it has acquired KeyNexus. It also announced the availability of SvKMS, a key management system for edge, DC, and cloud solutions.
  • Joey D’Antoni, in collaboration with DH2i, is delivering a webinar titled “Overcoming the HA/DR and Networking Challenges of SQL Server on Linux”. It’s being held on Wednesday 15th April at 11am Pacific Time. If that timezone works for you, you can find out more and register here.

Random Short Take #27

Welcome to my semi-regular, random news post in a short format. This is #27. You’d think it would be hard to keep naming them after basketball players, and it is. None of my favourite players ever wore 27, but Marvin Barnes did surface as a really interesting story, particularly when it comes to effective communication with colleagues. Happy holidays too, as I’m pretty sure this will be the last one of these posts I do this year. I’ll try and keep it short, as you’ve probably got stuff to do.

  • This story of serious failure on El Reg had me in stitches.
  • I really enjoyed this article by Raj Dutt (over at Cohesity’s blog) on recovery predictability. As an industry we talk an awful lot about speeds and feeds and supportability, but sometimes I think we forget about keeping it simple and making sure we can get our stuff back as we expect.
  • Speaking of data protection, I wrote some articles for Druva about, well, data protection and things of that nature. You can read them here.
  • There have been some pretty important CBT-related patches released by VMware recently. Anthony has provided a handy summary here.
  • Everything’s an opinion until people actually do it, but I thought this research on cloud adoption from Leaseweb USA was interesting. I didn’t expect to see everyone putting their hands up and saying they’re all in on public cloud, but I was also hopeful that we, as an industry, hadn’t made things as unclear as they seem to be. Yay, hybrid!
  • Site sponsor StorONE has partnered with Tech Data Global Computing Components to offer an All-Flash Array as a Service solution.
  • Backblaze has done a nice job of talking about data protection and cloud storage through the lens of Star Wars.
  • This tip on removing particular formatting in Microsoft Word documents really helped me out recently. Yes I know Word is awful.
  • Someone was nice enough to give me an acknowledgement for helping review a non-fiction book once. Now I’ve managed to get a character named after me in one of John Birmingham’s epics. You can read it out of context here. And if you’re into supporting good authors on Patreon – then check out JB’s page here. He’s a good egg, and his literary contributions to the world have been fantastic over the years. I don’t say this just because we live in the same city either.

Random Short Take #26

Welcome to my semi-regular, random news post in a short format. This is #26. I was going to start naming them after my favourite basketball players. This one could be the Korver edition, for example. I don’t think that’ll last though. We’ll see. I’ll stop rambling now.

Datrium Enhances DRaaS – Makes A Cool Thing Cooler

Datrium recently made a few announcements to the market. I had the opportunity to speak with Brian Biles (Chief Product Officer, Co-Founder), Sazzala Reddy (Chief Technology Officer and Co-Founder), and Kristin Brennan (VP of Marketing) about the news and thought I’d cover it here.

 

Datrium DRaaS with VMware Cloud

Before we talk about the new features, let’s quickly revisit the DRaaS for VMware Cloud offering, announced by Datrium in August this year.

[image courtesy of Datrium]

The cool thing about this offering was that, according to Datrium, it “gives customers complete, one-click failover and failback between their on-premises data center and an on-demand SDDC on VMware Cloud on AWS”. There are some real benefits to be had for Datrium customers, including:

  • Highly optimised, and more efficient than some competing solutions;
  • Consistent management for both on-premises and cloud workloads;
  • Eliminates the headaches as enterprises scale;
  • Single-click resilience;
  • Simple recovery from current snapshots or old backup data;
  • Cost-effective failback from the public cloud; and
  • Purely software-defined DRaaS on hyperscale public clouds for reduced deployment risk long term.

But what if you want a little flexibility in terms of where those workloads are recovered? Read on.

Instant RTO

So you’re protecting your workloads in AWS, but what happens when you need to stand up stuff fast in VMC on AWS? This is where Instant RTO can really help. There’s no rehydration or backup “recovery” delay. Datrium tells me you can perform massively parallel VM restarts (hundreds at a time) and you’re ready to go in no time at all. The full RTO varies by run-book plan, but by booting VMs from a live NFS datastore, you know it won’t take long. Failback uses VADP.

[image courtesy of Datrium]

The only cost during normal business operations (when not testing or deploying DR) is the cost of storing ongoing backups. And these are automatically deduplicated, compressed, and encrypted. In the event of a disaster, Datrium DRaaS provisions an on-demand SDDC in VMware Cloud on AWS for recovery. All the snapshots in S3 are instantly made executable on a live, cloud-native NFS datastore mounted by ESX hosts in that SDDC, with caching on NVMe flash. Instant RTO is available from Datrium today.

DRaaS Connect

DRaaS Connect extends the benefits of Instant RTO DR to any vSphere environment. DRaaS Connect is available for two different vSphere deployment models:

  • DRaaS Connect for VMware Cloud offers instant RTO disaster recovery from an SDDC in one AWS Availability Zone (AZ) to another;
  • DRaaS Connect for vSphere On Prem integrates with any vSphere physical infrastructure on-premises.

[image courtesy of Datrium]

DRaaS Connect for vSphere On Prem extends Datrium DRaaS to any vSphere on-premises infrastructure. It is managed by a cloud-based DRaaS control plane, which defines VM protection groups along with their snapshot frequency, replication, and retention policies. On failback, DRaaS will return only changed blocks to vSphere and the local on-premises infrastructure through DRaaS Connect.
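
Datrium didn’t share the exact policy format with me, but conceptually a protection group bundles a set of VMs with the policies that govern them – a hypothetical shape might look like this:

    # Hypothetical shape only - not Datrium's actual API. A protection group
    # ties a set of VMs to snapshot frequency, replication target, and retention.
    protection_group = {
        "name": "tier1-sql",
        "vms": ["sql-prod-01", "sql-prod-02"],
        "snapshot_frequency": "15m",
        "replicate_to": "vmc-sddc-us-west-2",  # illustrative SDDC target
        "retention": {"on_prem": "7d", "cloud": "30d"},
    }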

The other cool things to note about DRaaS Connect are:

  • There’s no Datrium DHCI system required
  • It’s a downloadable VM
  • You can start protecting workloads in minutes

DRaaS Connect will be available in Q1 2020.

 

Thoughts and Further Reading

Datrium announced some research around disaster recovery and ransomware in enterprise data centres in concert with the product announcements. Some of it wasn’t particularly astonishing, with folks keen to leverage pay as you go models for DR, and wanting easier mechanisms for data mobility. What was striking is that one of the main causes of disasters is people, not nature. Years ago I remember we used to plan for disasters that invariably involved some kind of flood, fire, or famine. Nowadays, we need to plan for some script kid pumping some nasty code onto our boxes and trashing critical data.

I’m a fan of companies that focus on disaster recovery, particularly if they make it easy for consumers to access their services. Disasters happen frequently; it’s not a matter of if, just a matter of when. Datrium has acknowledged that not everyone is using its infrastructure, but that doesn’t mean it can’t offer value to customers using VMC on AWS. I’m not 100% sold on Datrium’s vision for “disaggregated HCI” (despite Hugo’s efforts to educate me), but I am a fan of vendors focused on making things easier to consume and operate for customers. Instant RTO and DRaaS Connect are both features that round out the DRaaS for VMware Cloud on AWS offering quite nicely.

I haven’t dived as deep into this as I’d like, but Andre from Datrium has written a comprehensive technical overview that you can read here. Datrium’s product overview is available here, and the product brief is here.

Clumio’s DPaaS Is All SaaS

I recently had the chance to speak to Clumio’s Head of Product Marketing, Steve Siegel, about what Clumio does, and thought I’d share a few notes here.

 

Clumio?

Clumio has raised $51M+ in Series A and B funding. It was founded in 2017, built on public cloud technology, and came out of stealth in August.

 

The Approach

Clumio wants to deliver a data management platform in the cloud. The first real opportunity it identified was Backup as a Service. The feeling was that there were too many backup models across private cloud, public cloud, and Software as a Service (SaaS), and none of them were particularly easy to take advantage of in an effective manner. This can be a real problem when you’re looking to protect critical information assets.

 

Proper SaaS

The answer, as far as Clumio was concerned, was to develop an “authentic SaaS” offering. This offering provides all of the features you’d expect from a SaaS-based DPaaS (yes, we’re now officially in acronym hell), including:

  • On-demand scalability
  • Ease of management
  • Predictable costs
  • Global compliance
  • Always-on security – with data encrypted in-flight and at-rest

The platform is mainly built on AWS at this stage, but there are plans in place to leverage the other hyperscalers in the future. Clumio charges per VM, with the subscription fee including support. It has plans to improve capabilities, with:

  • AWS support in Dec 2019
  • O365 support in Q1 2020

Clumio currently supports the following public cloud workloads:

  • VMware Cloud on AWS; and
  • AWS – extending backup and recovery to support EBS and EC2 workloads (RDS to follow soon after)

 

Thoughts and Further Reading

If you’re a regular reader of this blog, you’ll notice that I’ve done a bit with data protection technologies over the years – from the big enterprise software shops to the “next-generation” data protection providers, as well as the consumer-side stuff and the “as a Service” crowd. There are a bunch of different ways to do data protection, and some work better than others. Clumio feels strongly that the “[s]implicity of SaaS is winning”, and there’s definitely an argument to be made that the simplicity of the approach is a big reason why the likes of Clumio will receive some significant attention from the marketplace.

That said, the success of services is ultimately determined by a few factors. In my opinion, a big part of what’s important when evaluating these types of services is whether they can technically service the requirements you have. If you’re an HP-UX shop running a bunch of on-premises tin, you might find that this type of service isn’t going to be much use. And if you’re using a cloud-based service but don’t really have decent connectivity to said cloud, you’re going to have a tough time getting your data back when something goes wrong. But that’s not all there is to it. You also need to look at how much it’s going to cost you to consume the service, and think about what it’s going to cost when something goes wrong. It’s all well and good if your daily ingress charges are relatively low with AWS, but if you need to get a bunch of data back out in a hurry, you might find it’s not a great experience, financially speaking. There are a bunch of factors that will impact this though, so you really need to do some modelling before you go down that path.
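
A quick worked example of the kind of modelling I mean, with an indicative egress rate (AWS internet egress has hovered around $0.09/GB for US regions – confirm current pricing before relying on this):

    # Indicative rate only - check current AWS egress pricing for your region.
    restore_gb = 10_000      # a 10TB emergency restore
    egress_per_gb = 0.09     # indicative US-region internet egress rate
    print(f"~${restore_gb * egress_per_gb:,.0f} in egress charges alone")  # ~$900

That’s before you count retrieval operations or the time the restore takes, and it scales linearly with how much data you need back.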

I’m a big fan of SaaS offerings when they’re done well, and I hope Clumio continues to innovate in the future and expand its support for workloads and infrastructure topologies. It’s picked up a few customers, and is hiring smart people. You can read more about Clumio over at Blocks & Files, and Ken Nalbone also covered it over at Gestalt IT.

Pure//Accelerate 2019 – Cloud Block Store for AWS

Disclaimer: I recently attended Pure//Accelerate 2019.  My flights, accommodation, and conference pass were paid for by Pure Storage. There is no requirement for me to blog about any of the content presented and I am not compensated by Pure Storage for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Cloud Block Store for AWS from Pure Storage has been around for a little while now. I had the opportunity to hear about it in more depth at the Storage Field Day Exclusive event at Pure//Accelerate 2019 and thought I’d share some thoughts here. You can grab a copy of my rough notes from the session here, and video from the session is available here.

 

Cloud Vision

Pure Storage has been focused on making everything related to their products effortless from day 1. An example of this approach is the FlashArray setup process – it’s really easy to get up and running and serving up storage to workloads. They wanted to do the same thing with anything they deliver via cloud services as well. There is, however, something of a “cloud divide” in operation in the industry. If you’re familiar with the various cloud deployment options, you’ll likely be aware that on-premises and hosted cloud is a bit different to public cloud. They:

  • Deliver different application architectures;
  • Deliver different management and consumption experiences; and
  • Use different storage.

So what if Pure could build application portability and deliver common shared data services?

Pure have architected their cloud service to leverage what they call “Three Pillars”:

  • Build Your Cloud
  • Run anywhere
  • Protect everywhere

 

What Is It?

So what exactly is Cloud Block Store for AWS then? Well, imagine if you will, that you’re watching an episode of Pimp My Ride, and Xzibit is talking to an enterprise punter about how he or she likes cloud, and how he or she likes the way Pure Storage’s FlashArray works. And then X says, “Hey, we heard you liked these two things so we put this thing in the other thing”. Look, I don’t know the exact situation where this would happen. But anyway …

  • 100% software – deploys instantly as a virtual appliance in the cloud, runs only as long as you need it;
  • Efficient – deduplication, compression, and thin provisioning deliver capacity and performance economically;
  • Hybrid – easily migrate data bidirectionally, delivering data portability and protection across your hybrid cloud;
  • Consistent APIs – developers connect to storage the same way on-premises and in the cloud. Automated deployment with Cloud Formation templates;
  • Reliable, secure – delivers industrial-strength performance, reliability, and protection with multi-AZ HA, NDU, instant snapshots, and data-at-rest encryption; and
  • Flexible – pay as you go consumption model to best match your needs for production and development.

[image courtesy of Pure Storage]

Architecture

At the heart of it, the architecture for CBS is not dissimilar to the FlashArray architecture. There are controllers, drives, NVRAM, and a virtual shelf.

  • EC2: CBS Controllers
  • EC2: Virtual Drives
  • Virtual Shelf: 7 Virtual drives in Spread Placement Group
  • EBS IO1: NVRAM, Write Buffer (7 total)
  • S3: Durable persistent storage
  • Instance Store: Non-Persistent Read Mirror

[image courtesy of Pure Storage]

What’s interesting, to me at least, is how they use S3 for persistent storage.
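
That spread placement group is doing real work, by the way. It’s a stock AWS construct that forces each instance onto distinct underlying hardware, so a single host failure can’t claim more than one virtual drive – and spread groups are capped at seven running instances per AZ, which lines up neatly with the seven-drive shelf. A hedged boto3 sketch of the construct itself (the AMI and instance type are placeholders, not Pure’s actual deployment code):

    # The raw AWS construct, not Pure's deployment code. A spread placement
    # group puts each instance on distinct hardware, so one host failure
    # can't take out more than one of the seven virtual drives.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")
    ec2.create_placement_group(GroupName="cbs-virtual-shelf", Strategy="spread")
    ec2.run_instances(
        ImageId="ami-0123456789abcdef0",  # placeholder virtual-drive AMI
        InstanceType="i3.2xlarge",        # instance store backs the read mirror
        MinCount=7,
        MaxCount=7,
        Placement={"GroupName": "cbs-virtual-shelf"},
    )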

Procurement

How do you procure CBS for AWS? I’m glad you asked. There are two procurement options.

A – Pure as-a-Service

  • Offered via SLED / CLED process
  • Minimum 100TiB effective used capacity
  • Unified hybrid contracts (on-premises and CBS, or CBS only)
  • 1-year to 3-year contracts

B – AWS Marketplace

  • Direct to customer
  • Minimum 10TiB effective used capacity
  • CBS only
  • Month-to-month or 1-year contract

 

Use Cases

There are a raft of different use cases for CBS. Some of them made sense to me straight away, some of them took a little time to bounce around in my head.

Disaster Recovery

  • Production instance on-premises
  • Replicate data to public cloud
  • Fail over in DR event
  • Fail back and recover

Lift and shift

  • Production instance on-premises
  • Replicate data to public cloud
  • Run the same architecture as before
  • Run production on CBS

Dev / Test

  • Replicate data to public cloud
  • Instantiate test / dev instances in public cloud
  • Refresh test / dev periodically
  • Bring changes back on-premises
  • Snapshots are more costly and slower to restore in native AWS

ActiveCluster

  • HA within an availability zone and / or across availability zones in an AWS region (ActiveCluster needs <11ms latency)
  • No downtime when a Cloud Block Store Instance goes away or there is a zone outage
  • Pure1 Cloud Mediator Witness (simple to manage and deploy)

Migrating VMware Environments

VMware Challenges

  • AWS does not recognise VMFS
  • Replicating volumes with VMFS will not do any good

Workaround

  • Convert the VMFS datastore into vVols
  • Now each volume has the guest VM’s file system (NTFS, EXT3, etc.)
  • Replicate the VMDK vVols to CBS
  • Now the volumes can be mounted to EC2 instances with a matching OS

Note: This is for the VM’s data volumes. The VM boot volume will not be usable in AWS. The VM’s application will need to be redeployed in native AWS EC2.

VMware Cloud

VMware Challenges

  • VMware Cloud does not support external storage; it only supports vSAN

Workaround

  • Connect Guest VMs directly to CBS via iSCSI

Note: I haven’t verified this myself, and I suspect there may be other ways to do this. But in the context of Pure’s offering, it makes sense.

 

Thoughts and Further Reading

There’s been a feeling in some parts of the industry for the last 5-10 years that the rise of the public cloud providers would spell the death of the traditional storage vendor. That’s clearly not been the case, but it has been interesting to see the major storage slingers evolving their product strategies to both accommodate and leverage the cloud providers in a more effective manner. Some have used the opportunity to get themselves as close as possible to the cloud providers, without actually being in the cloud. Others have deployed virtualised versions of their offerings inside public cloud and offered users the comfort of their traditional stack, but off-premises. There’s value in these approaches, for sure. But I like the way that Pure has taken it a step further and optimised its architecture to leverage some of the features of what AWS can offer from a cloud hardware perspective.

In my opinion, the main reason you’d look to leverage something like CBS on AWS is if you have an existing investment in Pure and want to keep doing things a certain way. You’re also likely using a lot of traditional VMs in AWS and want something that can improve the performance and resilience of those workloads. CBS is certainly a great way to do this. If you’re already running a raft of cloud-native applications, it’s likely that you don’t necessarily need the features on offer from CBS, as you’re already (hopefully) using them natively. I think Pure understands this though, and isn’t pushing CBS for AWS as the silver bullet for every cloud workload.

I’m looking forward to seeing what the market uptake on this product is like. I’m also keen to crunch the numbers on running this type of solution versus the cost associated with doing something on-premises or via other means. In any case, I’m looking forward to seeing how this capability evolves over time, and I think CBS on AWS is definitely worthy of further consideration.