Random Short Take #28

New year, same old format for news bites. This is #28 – the McKinnie Edition. I always thought Alfonzo looked a bit like that cop in The Deuce. Okay – it’s clear that some of these numbers are going to be hard to work with, but I’ll keep it going for a little while longer (the 30s are where you find a lot of the great players).

  • In what seems like pretty big news, Veeam has been acquired by Insight Partners. You can read the press release here, and Anton Gostev shares his views on it here.
  • This one looks like a bit of a science project, but I find myself oddly intrigued by it. You can read the official announcement here. Pre-orders are open now, and I’ll report back some time in March or April when / if the box turns up.
  • I loved this article from Chin-Fah on ransomware and NAS environments. I’m looking forward to catching up with Chin-Fah next week (along with all of the other delegates) at Storage Field Day 19. Tune in here if you want to see us on camera.
  • Speaking of ransomware, this article from Joey D’Antoni provided some great insights into the problem and what we can do about it.
  • A lot of my friends overseas are asking about the bush fires in Australia. There’s a lot in the media about it, and this article about the impact on infrastructure from Preston made for some thought-provoking reading.
  • I still use Plex heavily, and spend a lot of time moving things from optical discs to my NAS. This article covers a lot of the process I use too. I’ve started using tinyMediaManager as well – it’s pretty neat.
  • All the kids (and vendor executives) are talking about Kubernetes. It’s almost like we’re talking about public cloud or big data. Inspired in part by what he saw at Cloud Field Day 6, Keith weighs in on the subject here and I recommend you take the time to read (and understand) what he’s saying.
  • I enjoy reading Justin’s disclosure posts, even when he throws shade on my state (“Queensland is Australia’s Florida”). Not that he’s wrong, mind you.

Random Short Take #27

Welcome to my semi-regular, random news post in a short format. This is #27. You’d think it would be hard to keep naming them after basketball players, and it is. None of my favourite players ever wore 27, but Marvin Barnes did surface as a really interesting story, particularly when it comes to effective communication with colleagues. Happy holidays too, as I’m pretty sure this will be the last one of these posts I do this year. I’ll try and keep it short, as you’ve probably got stuff to do.

  • This story of serious failure on El Reg had me in stitches.
  • I really enjoyed this article by Raj Dutt (over at Cohesity’s blog) on recovery predictability. As an industry we talk an awful lot about speeds and feeds and supportability, but sometimes I think we forget about keeping it simple and making sure we can get our stuff back as we expect.
  • Speaking of data protection, I wrote some articles for Druva about, well, data protection and things of that nature. You can read them here.
  • There have been some pretty important CBT-related patches released by VMware recently. Anthony has provided a handy summary here.
  • Everything’s an opinion until people actually do it, but I thought this research on cloud adoption from Leaseweb USA was interesting. I didn’t expect to see everyone putting their hands up and saying they’re all in on public cloud, but I was also hopeful that we, as an industry, hadn’t made things as unclear as they seem to be. Yay, hybrid!
  • Site sponsor StorONE has partnered with Tech Data Global Computing Components to offer an All-Flash Array as a Service solution.
  • Backblaze has done a nice job of talking about data protection and cloud storage through the lens of Star Wars.
  • This tip on removing particular formatting in Microsoft Word documents really helped me out recently. Yes I know Word is awful.
  • Someone was nice enough to give me an acknowledgement for helping review a non-fiction book once. Now I’ve managed to get a character named after me in one of John Birmingham’s epics. You can read it out of context here. And if you’re into supporting good authors on Patreon – then check out JB’s page here. He’s a good egg, and his literary contributions to the world have been fantastic over the years. I don’t say this just because we live in the same city either.

Storage Field Day – I’ll Be At Storage Field Day 19

Here’s some news that will get you in the holiday spirit. I’ll be heading to the US in late January for another Storage Field Day event. If you haven’t heard of the very excellent Tech Field Day events, you should check them out. I’m looking forward to time travel and spending time with some really smart people for a few days. It’s also worth checking back on the Storage Field Day 19 website during the event (January 22 – 24) as there’ll be video streaming and updated links to additional content. You can also see the list of delegates and event-related articles that have been published.

I think it’s a great line-up of both delegates and presenting companies (including a “secret company”) this time around. I know most of them, but there may also still be a few companies added to the line-up. I’ll update this if and when they’re announced.

I’d like to publicly thank in advance the nice folks from Tech Field Day who’ve seen fit to have me back, as well as my employer for letting me take time off to attend these events. Also big thanks to the companies presenting. It’s going to be a lot of fun. Seriously. If you’re in the Bay Area and want to catch up prior to the event, please get in touch. I’ll have some free time, so perhaps we could check out a Warriors game on the 18th and discuss the state of the industry?

InfiniteIO And Your Data – Making Meta Better

InfiniteIO recently announced its new Application Accelerator. I had the opportunity to speak about the news with Liem Nguyen (VP of Marketing) and Kris Meier (VP of Product Management) from InfiniteIO and thought I’d share some thoughts here.

 

Metadata Is Good, And Bad

When you think about file metadata you might think about photos and the information they store that tells you about where the photo was taken, when it was taken, and the kind of camera used. Or you might think of an audio file and the metadata that it contains, such as the artist name, year of release, track number, and so on. Metadata is a really useful thing that tells us an awful lot about data we’re storing. But things like simple file read operations make use of a lot of metadata just to open the file:

  • During the typical file read, 7 out of 8 operations are metadata requests which significantly increases latency; and
  • Up to 90% of all requests going to NAS systems are for metadata.

[image courtesy of InfiniteIO]

 

Fire Up The Metadata Engine

Imagine how much faster storage would be if it only has to service 10% of the requests it does today? The Application Accelerator helps with this by:

  • Separating metadata request processing from file I/O
  • Responding directly to metadata requests at the speed of DRAM – much faster than a file system

[image courtesy of InfiniteIO]

The cool thing is it’s a simple deployment – installed like a network switch requiring no changes to workflows.

 

Thoughts and Further Reading

Metadata is a key part of information management. It provides data with a lot of extra information that makes that data more useful to applications that consume it and to the end users of those applications. But this metadata has a cost associated with it. You don’t think about the amount of activity that happens with simple file operations, but there is a lot going on. It gets worse when you look at activities like AI training and software build operations. The point of a solution like the Application Accelerator is that, according to InfiniteIO, your primary storage devices could be performing at another level if another device was doing the heavy lifting when it came to metadata operations.

Sure, it’s another box in the data centre, but the key to the Application Accelerator’s success is the software that sits on the platform. When I saw the name my initial reaction was that filesystem activities aren’t applications. But they really are, and more and more applications are leveraging data on those filesystems. If you could reduce the load on those filesystems to the extent that InfiniteIO suggest then the Application Accelerator becomes a critical piece of the puzzle.

You might not care about increasing the performance of your applications when accessing filesystem data. And that’s perfectly fine. But if you’re using a lot of applications that need high performance access to data, or your primary devices are struggling under the weight of your workload, then something like the Application Accelerator might be just what you need. For another view, Chris Mellor provided some typically comprehensive coverage here.

Random Short Take #26

Welcome to my semi-regular, random news post in a short format. This is #26. I was going to start naming them after my favourite basketball players. This one could be the Korver edition, for example. I don’t think that’ll last though. We’ll see. I’ll stop rambling now.

Excelero And The NVEdge

It’s been a little while since I last wrote about Excelero. I recently had the opportunity to catch up with Josh Goldenhar and Tom Leyden and thought I’d share some of my thoughts here.

 

NVMe Performance Good, But Challenging

NVMe has really delivered storage performance improvements in recent times.

All The Kids Are Doing It

Great performance:

  • Up to 1.2M IOPs, 6GB/s per drive
  • Ultra-low latency (20μs)

Game changer for data-intensive workloads:

  • Mission-Critical Databases
  • Analytical Processing
  • AI and Machine Learning

But It’s Not Always What You’d Expect

IOPs and Bandwidth Utilisation

  • Applications struggle to use local NVMe performance beyond 3-4 drives
  • Stranded IOPS and / or bandwidth = poor ROI

Sharing is the Logical Answer, with local latency

  • Physical disaggregation is often operationally desirable
  • 24 Drive servers are common and readily available

Data Protection Desired

  • NVMe performs, but by itself offers no data protection
  • Local data protection does not protect against server failures

Some NVMe-over-fabrics solutions offer controller based data protection, but limit IOPs, bandwidth and sacrifice latency.

 

Scale Up Or Out?

NVMesh – Scale-out design: data centre scale

  • Disaggregated & converged architecture
  • No CPU overhead: no noisy neighbours
  • Lowest latency: 5μs

NVEdge – Scale-up design: rack scale

  • Disaggregated architecture
  • Full bandwidth even at 4K IO
  • Client-less architecture with NVMe-oF initiators
  • Enterprise-ready: RAID 1/0, High Availability with fast failover, Thin Provisioning, CRC

 

Flexible Deployment Models

There are a few different ways you can deploy Excelero.

Converged – Local NVMe drives in Application Servers

  • Single, unified storage pool
  • NVMesh initiator and client on all nodes
  • NVMesh bypasses server CPU
  • Various protection levels
  • No dedicated storage servers needed
  • Linearly scalable
  • Highest aggregate bandwidth

Top-of-Rack Flash

  • Single, unified storage pool
  • NVMesh Target runs on dedicated storage nodes
  • NVMesh Client runs on application servers
  • Applications get performance of local NVMe storage
  • Various Protection Levels
  • Linearly scalable

Data Protection

There are also a number of options when it comes to data resiliency.

[image courtesy of Excelero]

Networking Options

You can choose either TCP/IP or RDMA. TCP/IP offers a latency hit, but it works with any NIC (and your existing infrastructure). RDMA has super low latency, but is only available on a limited subset of NICs.

 

NVEdge Then?

Excelero described NVEdge as “block storage software for building NVMe Flash Arrays for demanding workflows such as AI, ML and databases in the Cloud and at the Edge”.

Scale-up architecture

  • High NVMe AFA performance, leveraging NVMe-oF
  • Full bandwidth performance even at 4K block size

High availability, supporting:

  • Dual-port NVMe drives
  • Dual controllers (with fast failover, less than 100ms)
  • Active / active controller operation and active/passive logical volume access

Data services include:

  • RAID 1/0 data protection
  • Thin Provisioning: thousands of striped volumes of up to 1PB each
  • Enterprise grade block checksums (CRC 16/32/64).

Hardware Compatibility?

Supported Platforms

  • x86-based systems for higher aggregate performance
  • SmartNIC-based architectures for lower power & cost

HW Requirements

  • Each controller has PCIe connectivity to all drives
  • Controllers can communicate over a network
  • Controllers communicate over both the network and drive pairs to identify connectivity (failure) issues

Supported Networking

  • RDMA (InfiniBand or Ethernet) TCP/IP networking

 

Thoughts and Further Reading

NVMe has been a good news story for folks struggling with the limitations of the SAS protocol. I’ve waxed lyrical in the past about how impressed I was with Excelero’s offering. Not every workload is necessarily suited to NVMesh though, and NVEdge is an interesting approach to solving that problem. Where NVMesh provides a tonne of flexibility when it comes to deployment options and the hardware used, NVEdge doubles down on availability and performance for different workloads.

NVMe isn’t a handful of magic beans that will instantly have your storage workloads. You need to be able to feed it to really get value from it, and you need to be able to protect it too. It comes down to understanding what it is you’re trying to achieve with your applications, rather than just splashing cash on the latest storage protocol in the hope that it will make your business more money.

At this point I’d make some comment about data being the new oil, but I don’t really have enough background in the resources sector to be able to carry that analogy much further than that. Instead I’ll say this: data (in all of its various incantations) is likely very important to your business. Whether it’s something relatively straightforward like seismic data, or financial results, or policy documents, or it may be the value that you can extract from that data by having fast access to a lot of it. Whatever you’re doing with it, you’re likely investing in hardware and software that helps you get to that value. Excelero appears to have focused on ensuring that the ability to access data in a timely fashion isn’t the thing that holds you back from achieving your data value goals.

SwiftStack Announces 7

SwiftStack recently announced version 7 of its solution. I had the opportunity to speak to Joe Arnold and Erik Pounds from SwiftStack about the announcement and thought I’d share some thoughts here.

 

Insane Data Requirements

We spoke briefly about just how insane modern data requirements are becoming, in terms of both volume and performance requirements. The example offered up was that of an Advanced Driver-Assistance System (ADAS). These things need a lot of capacity to work, with training data starting at 15PB of data with performance requirements approaching 100GB/s.

  • Autonomy – Level 2+
  • 10 Deep neural networks needed
  • Survey car – 2MP cameras
  • 2PB per year per car
  • 100 NVIDIA DGX-1 servers per car

When your hot data is 15 – 30PB and growing – it’s a problem.

 

What’s New In 7?

SwiftStack has been working to address those kinds of challenges with version 7.

Ultra-scale Performance Architecture

SwiftStack has managed to get some pretty decent numbers under its belt, delivering over 100GB/s at scale with a platform that’s designed to scale linearly to higher levels. The numbers stack up well against some of their competitors, and have been validated through:

  • Independent testing;
  • Comparing similar hardware and workloads; and
  • Results being posted publicly (with solutions based on Cisco Validated Designs).

 

ProxyFS Edge

ProxyFS Edge takes advantage of SwiftStack’s file services to deliver distributed file services between edge, core, and cloud. The idea is that you can use it for “high-throughput, data-intensive use cases”.

[image courtesy of SwiftStack]

Enabling functionality:

  • Containerised deployment of ProxyFS agent for orchestrated elasticity
  • Clustered filesystem enables scale-out capabilities
  • Caching at the edge, minimising latency for improved application performance
  • Load-balanced, high-throughput API-based communication to the core

 

1space File Connector

But what if you have a bunch of unstructured data sitting in file environments that you want to use with your more modern apps? 1space File Connector brings enterprise file data into the cloud namespace, and “[g]ives modern, cloud-native applications access to existing data without migration”. The thinking is that you can modernise your workflows at an incremental rate, rather than having to deal with the app and the storage all in one go.  incrementally

[image courtesy of SwiftStack]

Enabling functionality:

  • Containerised deployment 1space File Connector for orchestrated elasticity
  • File data is accessible using S3 or Swift object APIs
  • Scales out and is load balanced for high-throughput
  • 1space policies can be applied to file data when migration is desired

The SwiftStack AI Architecture

SwiftStack has also developed a comprehensive AI Architecture model, describing it as “the customer-proven stack that enables deep learning at ultra-scale”. You can read more on that here.

Ultra-Scale Performance

  • Shared-nothing distributed architecture
  • Keep GPU compute complexes busy

Elasticity from Edge-to-Core-to-Cloud

  • With 1space, ingest and access data anywhere
  • Eliminate data silos and move beyond one cloud

Data Immutability

  • Data can be retained and referenced indefinitely as it was originally written
  • Enabling traceability, accountability, confidence, and safety throughout the life of a DNN

Optimal TCO

  • Compelling savings compared to public cloud or all-flash arrays Real-World Confidence
  • Notable AI deployments for autonomous vehicle development

SwiftStack PRO

The final piece is the SwiftStack PRO offering, a support service delivering:

  • 24×7 remote management and monitoring of your SwiftStack production cluster(s);
  • Incorporating operational best-practices learned from 100s of large-scale production clusters;
  • Including advanced monitoring software suite for log aggregation, indexing, and analysis; and
  • Operations integration with your internal team to ensure end-to-end management of your environment.

 

Thoughts And Further Reading

The sheer scale of data enterprises are working with every day is pretty amazing. And data is coming from previously unexpected places as well. The traditional enterprise workloads hosted on NAS or in structured applications are insignificant in size when compared to the PB-scale stuff going on in some environments. So how on earth do we start to derive value from these enormous data sets? I think the key is to understand that data is sometimes going to be in places that we don’t expect, and that we sometimes have to work around that constraint. In this case, SwiftStack has recognised that not all data is going to be sitting in the core, or the cloud, and it’s using some interesting technology to get that data where you need it to get the most value from it.

Getting the data from the edge to somewhere useable (or making it useable at the edge) is one thing, but the ability to use unstructured data sitting in file with modern applications is also pretty cool. There’s often reticence associated with making wholesale changes to data sources, and this solution helps to make that transition a little easier. And it gives the punters an opportunity to address data challenges in places that may have been inaccessible in the past.

SwiftStack has good pedigree in delivering modern scale-out storage solutions, and it’s done a lot of work ensure that its platform adds value. Worth checking out.

NetApp Announces New AFF And FAS Models

NetApp recently announced some new storage platforms at INSIGHT 2019. I didn’t attend the conference, but I had the opportunity to be briefed on these announcements recently and thought I’d share some thoughts here.

 

All Flash FAS (AFF) A400

Overview

  • 4U enclosure
  • Replacement for AFF A300
  • Available in two possible configurations:
    • Ethernet: 4x 25Gb Ethernet (SFP28) ports
    • Fiber Channel: 4x 16Gb FC (SFP+) ports
  • Based on latest Intel Cascade Lake processors
  • 25GbE and 16Gb FC host support
  • 100GbE RDMA over Converged Ethernet (RoCE) connectivity to NVMe expansion storage shelves
  • Full 12Gb/sec SAS connectivity expansion storage shelves

It wouldn’t be a storage product announcement without a box shot.

[image courtesy of NetApp]

More Numbers

Each AFF A400 packs some grunt in terms of performance and capacity:

  • 40 CPU cores
  • 256GB RAM
  • Max drives: 480

Aggregates and Volumes

Maximum number of volumes 2500
Maximum aggregate size 800 TiB
Maximum volume size 100 TiB
Minimum root aggregate size 185 GiB
Minimum root volume size 150 GiB

Other Notes

NetApp is looking to position the A400 as a replacement for the A300 and A320. That said, they will continue to offer the A300. Note that it supports NVMe, but also SAS SSDs – and you can mix them in the same HA pair, same aggregate, and even the same RAID group (if you were so inclined). For those of you looking for MetroCluster support, FC MCC support is targeted for February, with MetroCluster over IP being targeted for the ONTAP 9.8 release.

 

FAS8300 And FAS8700

Overview

  • 4U enclosure
  • Two models available
    • FAS8300
    • FAS8700
  • Available in two possible configurations
    • Ethernet: 4x 25Gb Ethernet (SFP28) ports
    • Unified: 4x 16Gb FC (SFP+) ports

[image courtesy of NetApp]

  • Based on latest Intel Cascade Lake processors
  • Uses NVMe M.2 connection for onboard Flash Cache™
  • 25GbE and 16Gb FC host support
  • Full 12Gbps SAS connectivity expansion storage shelves

Aggregates and Volumes

Maximum number of volumes 2500
Maximum aggregate size 400 TiB
Maximum volume size 100 TiB
Minimum root aggregate size 185 GiB
Minimum root volume size 150 GiB

Other Notes

The 8300 can do everything the 8200 can do, and more! And it also supports more drives (720 vs 480). The 8700 supports a maximum of 1440 drives.

 

Thoughts And Further Reading

Speeds and feeds announcement posts aren’t always the most interesting things to read. It demonstrates that NetApp is continuing to evolve both its AFF and FAS lines, and coupled with improvements in ONTAP 9.7, there’s a lot to like about these new iterations. It looks like there’s enough here to entice customers looking to scale up their array performance. Whilst it adds to the existing portfolio, NetApp is mindful of this, and working on streamlining the portfolio. Shipments are expected to start mid-December.

Midrange storage isn’t always the sexiest thing to read about. But the fact that “midrange” storage now offers up this kind of potential performance is pretty cool. Think back to 5 – 10 years ago, and your bang for buck wasn’t anywhere near like it is now. This is to be expected, given the improvements we’ve seen in processor performance over the last little while, but it’s also important to note that improvements in the software platform are also helping to drive performance improvements across the board.

There have also been some cool enhancements announced with StorageGRID, and NetApp has also announced an “All-SAN” AFF model, with none of the traditional NAS features available. The All-SAN idea had a few pundits scratching their heads, but it makes sense in a way. The market for block-only storage arrays is still in the many billions of dollars worldwide, and NetApp doesn’t yet have a big part of that pie. This is a good way to get into opportunities that it may have been excluded from previously. I don’t think there’s been any suggestion that file or hybrid isn’t the way for them to go, but it is interesting to see this being offered up as a direct competitor to some of the block-only players out there.

I’ve written a bit about NetApp’s cloud vision in the past, as that’s seen quite a bit of evolution in recent times. But that doesn’t mean that they don’t have a good hardware story to tell, and I think it’s reflected in these new product announcements. NetApp has been doing some cool stuff lately. I may have mentioned it before, but NetApp’s been named a leader in the Gartner 2019 Magic Quadrant for Primary Storage. You can read a comprehensive roundup of INSIGHT news over here at Blocks & Files.

Random Short Take #24

Want some news? In a shorter format? And a little bit random? This listicle might be for you. Welcome to #24 – The Kobe Edition (not a lot of passing, but still entertaining). 8 articles too. Which one was your favourite Kobe? 8 or 24?

  • I wrote an article about how architecture matters years ago. It’s nothing to do with this one from Preston, but he makes some great points about the importance of architecture when looking to protect your public cloud workloads.
  • Commvault GO 2019 was held recently, and Chin-Fah had some thoughts on where Commvault’s at. You can read all about that here. Speaking of Commvault, Keith had some thoughts as well, and you can check them out here.
  • Still on data protection, Alastair posted this article a little while ago about using the Cohesity API for reporting.
  • Cade just posted a great article on using the right transport mode in Veeam Backup & Replication. Goes to show he’s not just a pretty face.
  • VMware vFORUM is coming up in November. I’ll be making the trip down to Sydney to help out with some VMUG stuff. You can find out more here, and register here.
  • Speaking of VMUG, Angelo put together a great 7-part series on VMUG chapter leadership and tips for running successful meetings. You can read part 7 here.
  • This is a great article on managing Rubrik users from the CLI from Frederic Lhoest.
  • Are you into Splunk? And Pure Storage? Vaughn has you covered with an overview of Splunk SmartStore on Pure Storage here.

Pure//Accelerate 2019 – Cloud Block Store for AWS

Disclaimer: I recently attended Pure//Accelerate 2019.  My flights, accommodation, and conference pass were paid for by Pure Storage. There is no requirement for me to blog about any of the content presented and I am not compensated by Pure Storage for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Cloud Block Store for AWS from Pure Storage has been around for a little while now. I had the opportunity to hear about it in more depth at the Storage Field Day Exclusive event at Pure//Accelerate 2019 and thought I’d share some thoughts here. You can grab a copy of my rough notes from the session here, and video from the session is available here.

 

Cloud Vision

Pure Storage has been focused on making everything related to their products effortless from day 1. An example of this approach is the FlashArray setup process – it’s really easy to get up and running and serving up storage to workloads. They wanted to do the same thing with anything they deliver via cloud services as well. There is, however, something of a “cloud divide” in operation in the industry. If you’re familiar with the various cloud deployment options, you’ll likely be aware that on-premises and hosted cloud is a bit different to public cloud. They:

  • Deliver different application architectures;
  • Deliver different management and consumption experience; and
  • Use different storage.

So what if Pure could build application portability and deliver common shared data services?

Pure have architected their cloud service to leverage what they call “Three Pillars”:

  • Build Your Cloud
  • Run anywhere
  • Protect everywhere

 

What Is It?

So what exactly is Cloud Block Store for AWS then? Well, imagine if you will, that you’re watching an episode of Pimp My Ride, and Xzibit is talking to an enterprise punter about how he or she likes cloud, and how he or she likes the way Pure Storage’s FlashArray works. And then X says, “Hey, we heard you liked these two things so we put this thing in the other thing”. Look, I don’t know the exact situation where this would happen. But anyway …

  • 100% software – deploys instantly as a virtual appliance in the cloud, runs only as long as you need it;
  • Efficient – deduplication, compression, and thin provisioning deliver capacity and performance economically;
  • Hybrid – easily migrate data bidirectionally, delivering data portability and protection across your hybrid cloud;
  • Consistent APIs – developers connect to storage the same way on-premises and in the cloud. Automated deployment with Cloud Formation templates;
  • Reliable, secure – delivers industrial-strength perfromance, reliability & protection with Multi-AZ HA, NDU, instant snaps and data at rest encryption; and
  • Flexible – pay as you go consumption model to best match your needs for production and development.

[image courtesy of Pure Storage]

Architecture

At the heart of it, the architecture for CVS is not dissimilar to the FlashArray architecture. There’re controllers, drives, NVRAM, and a virtual shelf.

  • EC2: CBS Controllers
  • EC2: Virtual Drives
  • Virtual Shelf: 7 Virtual drives in Spread Placement Group
  • EBS IO1: NVRAM, Write Buffer (7 total)
  • S3: Durable persistent storage
  • Instance Store: Non-Persistent Read Mirror

[image courtesy of Pure Storage]

What’s interesting, to me at least, is how they use S3 for persistent storage.

Procurement

How do you procure CBS for AWS? I’m glad you asked. There are two procurement options.

A – Pure as-a-Service

  • Offered via SLED / CLED process
  • Minimums 100TiB effective used capacity
  • Unified hybrid contracts (on-premises and CBS, CBS)
  • 1 year to 3 year contracts

B – AWS Marketplace

  • Direct to customer
  • Minimum, 10 TiB effective used capacity
  • CBS only
  • Month to month contract or 1 year contract

 

Use Cases

There are a raft of different use cases for CBS. Some of them made sense to me straight away, some of them took a little time to bounce around in my head.

Disaster Recovery

  • Production instance on-premises
  • Replicate data to public cloud
  • Fail over in DR event
  • Fail back and recover

Lift and shift

  • Production instance on-premises
  • Replicate data to public cloud
  • Run the same architecture as before
  • Run production on CBS

Use case: Dev / test

  • Replicate data to public cloud
  • Instantiate test / dev instances in public cloud
  • Refresh test / dev periodically
  • Bring changes back on-premises
  • Snapshots are more costly and slower to restore in native AWS

ActiveCluster

  • HA within an availability zone and / or across availability zones in an AWS region (ActiveCluster needs <11ms latency)
  • No downtime when a Cloud Block Store Instance goes away or there is a zone outage
  • Pure1 Cloud Mediator Witness (simple to manage and deploy)

Migrating VMware Environments

VMware Challenges

  • AWS does not recognise VMFS
  • Replicating volumes with VMFS will not do any good

Workaround

  • Convert VMFS datastore into vVOLs
  • Now each volume has the Guest VM’s file system (NTFS, EXT3, etc)
  • Replicate VMDK vVOLs to CBS
  • Now the volumes can be mounted to EC2 with matching OS

Note: This is for the VM’s data volumes. The VM boot volume will not be usable in AWS. The VM’s application will need to be redeployed in native AWS EC2.

VMware Cloud

VMware Challenges

  • VMware Cloud does not support external storage, it only supports vSAN

Workaround

  • Connect Guest VMs directly to CBS via iSCSI

Note: I haven’t verified this myself, and I suspect there may be other ways to do this. But in the context of Pure’s offering, it makes sense.

 

Thoughts and Further Reading

There’s been a feeling in some parts of the industry for the last 5-10 years that the rise of the public cloud providers would spell the death of the traditional storage vendor. That’s clearly not been the case, but it has been interesting to see the major storage slingers evolving their product strategies to both accommodate and leverage the cloud providers in a more effective manner. Some have used the opportunity to get themselves as close as possible to the cloud providers, without actually being in the cloud. Others have deployed virtualised versions of their offerings inside public cloud and offered users the comfort of their traditional stack, but off-premises. There’s value in these approaches, for sure. But I like the way that Pure has taken it a step further and optimised its architecture to leverage some of the features of what AWS can offer from a cloud hardware perspective.

In my opinion, the main reason you’d look to leverage something like CBS on AWS is if you have an existing investment in Pure and want to keep doing things a certain way. You’re also likely using a lot of traditional VMs in AWS and want something that can improve the performance and resilience of those workloads. CBS is certainly a great way to do this. If you’re already running a raft of cloud-native applications, it’s likely that you don’t necessarily need the features on offer from CBS, as you’re already (hopefully) using them natively. I think Pure understands this though, and isn’t pushing CBS for AWS as the silver bullet for every cloud workload.

I’m looking forward to seeing what the market uptake on this product is like. I’m also keen to crunch the numbers on running this type of solution versus the cost associated with doing something on-premises or via other means. In any case, I’m looking forward to see how this capability evolves over time, and I think CBS on AWS is definitely worthy of further consideration.