Brisbane VMUG – November 2019


The November 2019 edition of the Brisbane VMUG meeting will be held on Tuesday 26th November at Fishburners from 4pm – 6pm. It’s sponsored by VMware and promises to be a great afternoon.

Here’s the agenda:

  • VMUG Intro
  • VMware Presentation: VMware the network company! A presentation by Francois Prowse
  • Q&A
  • Refreshments and drinks

Join us for an end of year celebration to thank the VMUG community for all their efforts in 2019, and to hear from VMware from a different perspective. You can find out more information and register for the event here. I hope to see you there. Also, if you’re interested in sponsoring one of these events, please get in touch with me and I can help make it happen.

Random Short Take #25

Want some news? In a shorter format? And a little bit random? Here’s a short take you might be able to get behind. Welcome to #25. This one seems to be dominated by things related to Veeam.

  • Adam recently posted a great article on protecting VMConAWS workloads using Veeam. You can read about it here.
  • Speaking of Veeam, Hal has released v2 of the MS Office 365 Backup Analysis Tool. You can use it to work out how much capacity you’ll need to protect your O365 workloads. And you can figure out what your licensing costs will be, as well as a bunch of other cool stuff.
  • And in more Veeam news, the VeeamON Virtual event is coming up soon. It will be run across multiple timezones and should be really interesting. You can find out more about that here.
  • This article by Russ on copyright and what happens when bots go wild made for some fascinating reading.
  • Tech Field Day turns 10 years old this year, and Stephen has been running a series of posts covering some of the history of the event. Sadly I won’t be able to make it to the celebration at Tech Field Day 20, but if you’re in the right timezone it’s worthwhile checking it out.
  • Need to connect to an SMB share on your iPad or iPhone? Check out this article (assuming you’re running iOS 13 or iPadOS 13.1).
  • It grinds my gears when this kind of thing happens. But if the mighty corporations have launched a line of products without thinking it through, we shouldn’t expect them to maintain that line of products. Right?
  • Storage and Hollywood can be a real challenge. This episode of Curtis’s podcast really got into some of the details with Jeff Rochlin.

 

SwiftStack Announces 7

SwiftStack recently announced version 7 of their solution. I had the opportunity to speak to Joe Arnold and Erik Pounds from SwiftStack about the announcement and thought I’d share some thoughts here.

 

Insane Data Requirements

We spoke briefly about just how insane modern data requirements are becoming, in terms of both volume and performance. The example offered up was that of an Advanced Driver-Assistance System (ADAS). These things need a lot of capacity to work, with training data sets starting at 15PB and performance requirements approaching 100GB/s.

  • Autonomy – Level 2+
  • 10 Deep neural networks needed
  • Survey car – 2MP cameras
  • 2PB per year per car
  • 100 NVIDIA DGX-1 servers per car

When your hot data is 15 – 30PB and growing – it’s a problem.
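To put those numbers in perspective, here's some rough back-of-the-envelope arithmetic (the fleet size is an illustrative assumption; sustained throughput and zero overhead are assumed throughout):

```python
# Rough arithmetic on the ADAS figures above (assumed sustained rates, no overheads).

PB = 1_000_000  # gigabytes in a petabyte (decimal, as vendors usually quote)

training_set_gb = 15 * PB  # 15PB training data set
throughput_gbps = 100      # 100GB/s aggregate read throughput

# Time to stream the full training set once through the GPU farm.
seconds = training_set_gb / throughput_gbps
hours = seconds / 3600
print(f"One full pass over 15PB at 100GB/s takes about {hours:.1f} hours")

# Fleet-level ingest at 2PB per car per year (50 cars is a made-up fleet size).
cars = 50
yearly_ingest_pb = 2 * cars
print(f"A {cars}-car survey fleet generates ~{yearly_ingest_pb}PB per year")
```

Even a single sequential pass over the training set takes the better part of two days at that throughput, which is why the performance requirements scale with the capacity ones.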

 

What’s New In 7?

SwiftStack has been working to address those kinds of challenges with version 7.

Ultra-scale Performance Architecture

They’ve managed to get some pretty decent numbers under their belt, delivering over 100GB/s at scale with a platform that’s designed to scale linearly to higher levels. The numbers stack up well against some of their competitors, and have been validated through:

  • Independent testing;
  • Comparing similar hardware and workloads; and
  • Results being posted publicly (with solutions based on Cisco Validated Designs).

 

ProxyFS Edge

ProxyFS Edge takes advantage of SwiftStack’s file services to deliver distributed file services between edge, core, and cloud. The idea is that you can use it for “high-throughput, data-intensive use cases”.

[image courtesy of SwiftStack]

Enabling functionality:

  • Containerised deployment of ProxyFS agent for orchestrated elasticity
  • Clustered filesystem enables scale-out capabilities
  • Caching at the edge, minimising latency for improved application performance
  • Load-balanced, high-throughput API-based communication to the core

 

1space File Connector

But what if you have a bunch of unstructured data sitting in file environments that you want to use with your more modern apps? 1space File Connector brings enterprise file data into the cloud namespace, and “[g]ives modern, cloud-native applications access to existing data without migration”. The thinking is that you can modernise your workflows incrementally, rather than having to deal with the app and the storage all in one go.

[image courtesy of SwiftStack]

Enabling functionality:

  • Containerised deployment of the 1space File Connector for orchestrated elasticity
  • File data is accessible using S3 or Swift object APIs
  • Scales out and is load balanced for high-throughput
  • 1space policies can be applied to file data when migration is desired

The SwiftStack AI Architecture

SwiftStack have also developed a comprehensive AI Architecture model, describing it as “the customer-proven stack that enables deep learning at ultra-scale”. You can read more on that here.

Ultra-Scale Performance

  • Shared-nothing distributed architecture
  • Keep GPU compute complexes busy

Elasticity from Edge-to-Core-to-Cloud

  • With 1space, ingest and access data anywhere
  • Eliminate data silos and move beyond one cloud

Data Immutability

  • Data can be retained and referenced indefinitely as it was originally written
  • Enabling traceability, accountability, confidence, and safety throughout the life of a DNN

Optimal TCO

  • Compelling savings compared to public cloud or all-flash arrays

Real-World Confidence

  • Notable AI deployments for autonomous vehicle development

SwiftStack PRO

The final piece is the SwiftStack PRO offering, a support service delivering:

  • 24×7 remote management and monitoring of your SwiftStack production cluster(s);
  • Incorporating operational best-practices learned from 100s of large-scale production clusters;
  • Including advanced monitoring software suite for log aggregation, indexing, and analysis; and
  • Operations integration with your internal team to ensure end-to-end management of your environment.

 

Thoughts And Further Reading

The sheer scale of data enterprises are working with every day is pretty amazing. And data is coming from previously unexpected places as well. The traditional enterprise workloads hosted on NAS or in structured applications are insignificant in size when compared to the PB-scale stuff going on in some environments. So how on earth do we start to derive value from these enormous data sets? I think the key is to understand that data is sometimes going to be in places that we don’t expect, and that we sometimes have to work around that constraint. In this case, SwiftStack have recognised that not all data is going to be sitting in the core, or the cloud, and they’re using some interesting technology to get that data where you need it to get the most value from it.

Getting the data from the edge to somewhere useable (or making it useable at the edge) is one thing, but the ability to use unstructured data sitting in file with modern applications is also pretty cool. There’s often reticence associated with making wholesale changes to data sources, and this solution helps to make that transition a little easier. And it gives the punters an opportunity to address data challenges in places that may have been inaccessible in the past.

SwiftStack have good pedigree in delivering modern scale-out storage solutions, and they’ve done a lot of work to ensure that their platform adds value. Worth checking out.

NetApp Announces New AFF And FAS Models

NetApp recently announced some new storage platforms at INSIGHT 2019. I didn’t attend the conference, but I had the opportunity to be briefed on these announcements recently and thought I’d share some thoughts here.

 

All Flash FAS (AFF) A400

Overview

  • 4U enclosure
  • Replacement for AFF A300
  • Available in two possible configurations:
    • Ethernet: 4x 25Gb Ethernet (SFP28) ports
    • Fibre Channel: 4x 16Gb FC (SFP+) ports
  • Based on latest Intel Cascade Lake processors
  • 25GbE and 16Gb FC host support
  • 100GbE RDMA over Converged Ethernet (RoCE) connectivity to NVMe expansion storage shelves
  • Full 12Gb/s SAS connectivity to expansion storage shelves

It wouldn’t be a storage product announcement without a box shot.

[image courtesy of NetApp]

More Numbers

Each AFF A400 packs some grunt in terms of performance and capacity:

  • 40 CPU cores
  • 256GB RAM
  • Max drives: 480

Aggregates and Volumes

  • Maximum number of volumes: 2500
  • Maximum aggregate size: 800 TiB
  • Maximum volume size: 100 TiB
  • Minimum root aggregate size: 185 GiB
  • Minimum root volume size: 150 GiB

Other Notes

NetApp is looking to position the A400 as a replacement for the A300 and A320. That said, they will continue to offer the A300. Note that it supports NVMe, but also SAS SSDs – and you can mix them in the same HA pair, same aggregate, and even the same RAID group (if you were so inclined). For those of you looking for MetroCluster support, FC MCC support is targeted for February, with MetroCluster over IP being targeted for the ONTAP 9.8 release.

 

FAS8300 And FAS8700

Overview

  • 4U enclosure
  • Two models available
    • FAS8300
    • FAS8700
  • Available in two possible configurations
    • Ethernet: 4x 25Gb Ethernet (SFP28) ports
    • Unified: 4x 16Gb FC (SFP+) ports

[image courtesy of NetApp]

  • Based on latest Intel Cascade Lake processors
  • Uses NVMe M.2 connection for onboard Flash Cache™
  • 25GbE and 16Gb FC host support
  • Full 12Gb/s SAS connectivity to expansion storage shelves

Aggregates and Volumes

  • Maximum number of volumes: 2500
  • Maximum aggregate size: 400 TiB
  • Maximum volume size: 100 TiB
  • Minimum root aggregate size: 185 GiB
  • Minimum root volume size: 150 GiB

Other Notes

The 8300 can do everything the 8200 can do, and more! And it also supports more drives (720 vs 480). The 8700 supports a maximum of 1440 drives.

 

Thoughts And Further Reading

Speeds and feeds announcement posts aren’t always the most interesting things to read, but this one demonstrates that NetApp is continuing to evolve both its AFF and FAS lines. Coupled with improvements in ONTAP 9.7, there’s a lot to like about these new iterations, and there looks to be enough here to entice customers looking to scale up their array performance. These models add to an already broad portfolio, but NetApp is mindful of this and is working on streamlining it. Shipments are expected to start mid-December.

Midrange storage isn’t always the sexiest thing to read about. But the fact that “midrange” storage now offers up this kind of potential performance is pretty cool. Think back 5 – 10 years, and your bang for buck wasn’t anywhere near what it is now. This is to be expected, given the improvements we’ve seen in processor performance over the last little while, but it’s also important to note that improvements in the software platform are helping to drive performance improvements across the board.

There have also been some cool enhancements announced with StorageGRID, and NetApp has also announced an “All-SAN” AFF model, with none of the traditional NAS features available. The All-SAN idea had a few pundits scratching their heads, but it makes sense in a way. The market for block-only storage arrays is still in the many billions of dollars worldwide, and NetApp doesn’t yet have a big part of that pie. This is a good way to get into opportunities that it may have been excluded from previously. I don’t think there’s been any suggestion that file or hybrid isn’t the way for them to go, but it is interesting to see this being offered up as a direct competitor to some of the block-only players out there.

I’ve written a bit about NetApp’s cloud vision in the past, as that’s seen quite a bit of evolution in recent times. But that doesn’t mean that they don’t have a good hardware story to tell, and I think it’s reflected in these new product announcements. NetApp has been doing some cool stuff lately. I may have mentioned it before, but NetApp’s been named a leader in the Gartner 2019 Magic Quadrant for Primary Storage. You can read a comprehensive roundup of INSIGHT news over here at Blocks & Files.

Veeam Basics – Cloud Tier And v10

Disclaimer: I recently attended Veeam Vanguard Summit 2019.  My flights, accommodation, and some meals were paid for by Veeam. There is no requirement for me to blog about any of the content presented and I am not compensated by Veeam for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Overview

Depending on how familiar you are with Veeam, you may already have heard of the Cloud Tier feature. This was new in Veeam Availability Suite 9.5 Update 4, and “is the built-in automatic tiering feature of Scale-out Backup Repository that offloads older backup files to more affordable storage, such as cloud or on-premises object storage”. The idea is you can use the cloud (or cloud-like on-premises storage resources) to make more effective (read: economical) use of your primary storage repositories. You can read more about Veeam’s object storage capabilities here.

 

v10 Enhancements

Move, Copy, Move and Copy

In 9.5 U4 the Move mode was introduced:

  • Policy allows chunks of data to be stripped out of backup files
  • Metadata remains locally on the performance tier
  • Data moved and offloaded into capacity tier
  • Capacity Tier backed by an object storage repository

The idea was that your performance tier provided the landing zone for backup data, and the capacity tier was an object storage repository that data was moved to. Rhys does a nice job of covering Cloud Tier here.

Copy + Move

In v10, you’ll be able to do both copy and move activities on older backup data. Here are some things to note about copy mode:

  • Still uses the same mechanics as Move
  • Data is chunked and offloaded to the Capacity Tier
  • Unlike Move, Copy doesn’t dehydrate VBK / VIB / VRB files
  • Like Move, it ensures that all restore functionality is retained
  • Still makes use of the Archive Index and, similar to Move, will not duplicate blocks being offloaded from the Performance Tier
  • Using Copy and Move together is fully supported
  • Copy + Move will share block data between them

[image courtesy of Veeam]

With Copy and Move the Capacity Tier will contain a copy of every backup file that has been created as well as offloaded data from the Performance Tier. Anthony does a great job of covering off the Cloud Tier Copy feature in more depth here.
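The “no duplicate blocks” behaviour is essentially content-addressed storage: offloaded chunks are keyed by their hash, so a block already present in the Capacity Tier is never uploaded twice, regardless of whether Copy or Move put it there first. Here’s a minimal sketch of that idea (all names are hypothetical; this is not Veeam’s actual implementation):

```python
import hashlib

class CapacityTier:
    """Toy object store keyed by chunk hash, so identical chunks are stored once."""
    def __init__(self):
        self.objects = {}
        self.uploads = 0

    def offload(self, chunk: bytes) -> str:
        key = hashlib.sha256(chunk).hexdigest()
        if key not in self.objects:  # Copy and Move share this store, so no duplicates
            self.objects[key] = chunk
            self.uploads += 1
        return key

tier = CapacityTier()
backup_file = [b"block-a", b"block-b", b"block-a"]

# Copy mode offloads every chunk of the backup file...
copy_keys = [tier.offload(c) for c in backup_file]
# ...and a later Move of the same restore point re-references the existing chunks.
move_keys = [tier.offload(c) for c in backup_file]

print(f"{tier.uploads} unique chunks stored for {len(backup_file) * 2} offload requests")
```

Both operations end up pointing at the same stored chunks, which is why running Copy and Move together doesn’t double your object storage consumption.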

Immutability

One of the features I’m really excited about (because I’m into some weird stuff) is the Cloud Tier Immutability feature.

  • Guarantees additional protection for data stored in Object storage
  • Protects against malicious users and accidental deletion (ITP Theory)
  • Applies to data offloaded to capacity tier for Move or Copy
  • Protects the most recent (more important) backup points
  • Beware of increased storage consumption and S3 costs
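Under the hood, this style of immutability is typically delivered via object storage retention locks (S3 Object Lock, in AWS terms): a delete request against a locked object simply fails until the retention window expires. A toy sketch of the behaviour (hypothetical names, not Veeam’s code):

```python
import time

class ImmutableStore:
    """Toy store where objects cannot be deleted before their retention lock expires."""
    def __init__(self):
        self.objects = {}  # key -> (data, retain_until_epoch)

    def put(self, key: str, data: bytes, retain_seconds: float):
        self.objects[key] = (data, time.time() + retain_seconds)

    def delete(self, key: str):
        _, retain_until = self.objects[key]
        if time.time() < retain_until:
            raise PermissionError(f"{key} is locked until epoch {retain_until:.0f}")
        del self.objects[key]

store = ImmutableStore()
store.put("backups/vm01.vbk", b"...", retain_seconds=3600)

try:
    store.delete("backups/vm01.vbk")  # malicious or accidental deletion attempt
except PermissionError:
    print("delete blocked by retention lock")
```

This is also where the storage consumption warning comes from: locked objects can’t be cleaned up early, so you pay to keep them for the full retention window.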

 

Thoughts and Further Reading

The idea of moving protection data to a cheaper storage repository isn’t a new one. Fifteen years ago we were excited to be enjoying backup to disk as a new way of doing data protection. Sure, it wasn’t (still isn’t) as cheap as tape, but it was a lot more flexible and performance oriented. Unfortunately, the problem with disk-based backup systems is that you need a lot of disk to keep up with the protection requirements of primary storage systems. And then you probably want to keep many, many copies of this data for a long time. Deduplication and compression helps with this problem, but it’s not magic. Hence the requirement to move protection data to lower tiers of storage.

Veeam may have been a little late to market with this feature, but their implementation in 9.5 U4 is rock solid. It’s the kind of thing we’ve come to expect from them. With v10 the addition of the Copy mode, and the Immutability feature in Cloud Tier, should give people cause to be excited. Immutability is a really handy feature, and provides the kind of security that people should be focused on when looking to pump data into the cloud.

I still have some issues with people using protection data as an “archive” – that’s not what it is. Rather, this is a copy of protection data that’s being kept for a long time. It keeps auditors happy. And fits nicely with people’s idea of what archives are. Putting my weird ideas about archives versus protection data aside, the main reason you’d want to move or copy data to a cheaper tier of disk is to save money. And that’s not a bad thing, particularly if you’re working with enterprise protection policies that don’t necessarily make sense (e.g. keeping all backup data for seven years). I’m looking forward to v10 coming soon, and taking these features for a spin.

Veeam Vanguard Summit 2019 – (Fairly) Full Disclosure

Disclaimer: I recently attended Veeam Vanguard Summit 2019.  My flights, accommodation, and some meals were paid for by Veeam. There is no requirement for me to blog about any of the content presented and I am not compensated by Veeam for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Here are my notes on gifts, etc, that I received as an attendee at Veeam Vanguard Summit 2019. Apologies if it’s a bit dry but I’m just trying to make it clear what I received during this event to ensure that we’re all on the same page as far as what I’m being influenced by. I’m going to do this in chronological order, as that was the easiest way for me to take notes during the week. Whilst every attendee’s situation is different, I was paid by my employer to be at this event.

 

Saturday

My wife kindly dropped me at the airport on Saturday evening. I flew Emirates economy class from BNE – DXB – PRG courtesy of Veeam. I had a 3 hour layover at DXB. In DXB I managed to locate the Emirates Business lounge and eventually found the smoked salmon. The Emirates lounge at BNE is also super nice compared to the Qantas one (sorry Qantas!).

 

Sunday

I landed in Prague Sunday afternoon and took a taxi to my friend Max‘s house. We went for a wander to Hanga’r Bar where I had 3 beers that Max kindly paid for. We then headed in to the city centre so Al Rasheed could drop his luggage off. We then dropped by Restaurace Mincova and had some sausage, pickled cheese and a couple more beers. Al kindly paid for this. We then returned to Max’s house for dinner with his family. Max’s family also put me up for the night.

 

Monday

On the way to the hotel (the Hilton in Prague Old Town) Monday, Max and I stopped by the Macao and Wok Restaurant for lunch. I had a variety of Chinese-style dumplings and 2 beers. I then caught up with the other Aussie Vanguards (and Drew). We stopped at a place called Sklep Na Porici and I had 2 Pilsner Urquell unfiltered beers. At the hotel before dinner Steven Onofaro bought me a beer in the hotel bar.

For dinner we had a welcome reception at T-Anker. It was a rooftop bar / restaurant with stunning views of the city. The staff were a little surprised that we all wanted to eat our meals at the same time, but I eventually managed to get hold of a chicken schnitzel. I also had 4 beers. We stopped at a bar called Potrefená Husa (?) on the way back to the hotel. I had another beer that David Kawula paid for. At the hotel I had another beer, paid for by Shane Williford, before heading to bed.

 

Tuesday

I had breakfast at the hotel, consisting of eggs, bacon, chicken sausage, and a flat white. The beauty of the hotel was that it didn’t matter what coffee you ordered, it would invariably be a flat white. Matt Crape gave me a 3D-printed Vanguard thing before the sessions started, and I picked up a Vanguard pin as well.

During the break I had coffee and a chicken, ham, and cheese panini snack. Lunch was in the hotel, and I had beef, fish, pasta, roast vegetables and some water. During the afternoon break I helped myself to some coffee and an apple tatin. Adam Fisher kindly gave me some CDs from his rock and roll days. They were really cool.

For dinner a few of us went to the Restaurant White Horse in the Old Town Square. I had a few beers and the grilled spicy sausage. I then had 2 beers at the hotel before retiring for the night.

 

Wednesday

For breakfast on Wednesday I headed to the hotel buffet and had mushrooms, bacon, scrambled eggs, yoghurt, cheese, ham, and 2 flat whites. During the morning break I helped myself to a bagel with smoked salmon and cream cheese and some coffee. Lunch was in the hotel, and I had basmati rice, chicken, perch, smoked salmon, water, and chocolate cake.

During the afternoon break I had some coffee, a small cheese cake tart, and a tiny tandoori chicken wrap. I had two beers at the hotel bar before we caught a shuttle over to the Staropramen brewery. There I had a 5 or 6 beers and a variety of finger food. From there we headed to The Dubliner bar for a few more beers.

 

Thursday

I skipped breakfast on Thursday in favour of some sleep. I had a light lunch at the hotel, consisting of some pasta, rice, and beef. When I got back to my room I found a gift glass from Staropramen Brewery courtesy of Veeam.

For dinner about 10 of us headed to a Mexican restaurant called Agave. I had 3 Coronas, a burrito with prawns, and some guacamole. The food was great, as was the company, but the service was pretty slow.

 

Friday

On Friday I had breakfast at the hotel, consisting of mushrooms, bacon, scrambled eggs, yoghurt, cheese, ham, and 2 flat whites. I then walked around Prague for a few hours, and took a car service to the airport at my expense. Big thanks to Veeam for having me over for the week, and big thanks to everyone who spent time with me at the event (and after hours) – it’s a big part of what makes this stuff fun. And I’m looking forward to sharing some of what I learnt when I’m a little less jet-lagged.

Cohesity – NAS Data Migration Overview

Data Migration

Cohesity NAS Data Migration, part of SmartFiles, was recently announced as a generally available feature within the Cohesity DataPlatform 6.4 release (after being mentioned in the 6.3 release blog post). The idea behind it is that you can use the feature to perform the migration of NAS data from a primary source to the Cohesity DataPlatform. It is supported for NAS storage registered as SMB or NFS (so it doesn’t necessarily need to be a NAS appliance as such, it can also be a file share hosted somewhere).

 

What To Think About

There are a few things to think about when you configure your migration policy, including:

  • The last time the file was accessed;
  • Last time the file was modified; and
  • The size of the file.

You also need to think about how frequently you want to run the job. Finally, it’s worth considering which View you want the archived data to reside on.
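The three criteria boil down to a simple predicate over each file’s metadata. Here’s a sketch of how such a policy might be evaluated (the thresholds are made-up examples; Cohesity’s actual policy engine runs on the cluster, not on your client):

```python
import os
import time
from pathlib import Path

# Hypothetical policy: migrate files untouched for 90 days and at least 1MiB in size.
MAX_ACCESS_AGE = 90 * 86400  # seconds since last access
MAX_MODIFY_AGE = 90 * 86400  # seconds since last modification
MIN_SIZE = 1 * 1024 * 1024   # skip small files; the symlink stub overhead isn't worth it

def should_migrate(path, now=None):
    """Return True if the file meets all three migration criteria."""
    st = path.stat()
    now = now or time.time()
    return (now - st.st_atime > MAX_ACCESS_AGE
            and now - st.st_mtime > MAX_MODIFY_AGE
            and st.st_size >= MIN_SIZE)

def candidates(share_root):
    """Walk the share and yield files the policy selects for migration."""
    for dirpath, _, filenames in os.walk(share_root):
        for name in filenames:
            p = Path(dirpath) / name
            if should_migrate(p):
                yield p
```

Thinking of the policy as a predicate like this also makes the sizing question concrete: run it over a representative share and you’ll quickly see how much data a given threshold would actually move.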

 

What Happens?

When the data is migrated, an SMB2 symbolic link with the same name is left in place of the file, and the original data is moved to the Cohesity View. Note that remote-to-remote symbolic links are disabled by default on Windows boxes, so you need to run these commands:

C:\Windows\system32>fsutil behavior set SymlinkEvaluation R2R:1
C:\Windows\system32>fsutil behavior query SymlinkEvaluation

Once the data is migrated to the Cohesity cluster, subsequent read and write operations are performed on the Cohesity host. You can move data back to the environment by mounting the Cohesity target View on a Windows client, and copying it back to the NAS.
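Conceptually, the copy-back is just: find the stub symlinks on the NAS and replace each one with the real file from the mounted Cohesity View. A rough sketch of that recall loop (hypothetical paths; in practice you’d likely use robocopy or similar on the Windows client):

```python
import os
import shutil
from pathlib import Path

def recall(nas_root):
    """Replace migration symlink stubs with the actual file data they point at."""
    for dirpath, _, filenames in os.walk(nas_root):
        for name in filenames:
            link = Path(dirpath) / name
            if link.is_symlink():
                target = link.resolve()     # resolves into the mounted Cohesity View
                link.unlink()               # remove the stub left by the migration job
                shutil.copy2(target, link)  # copy the original data back to the NAS
```

After the recall, the file on the NAS is a regular file again and reads no longer touch the Cohesity cluster.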

 

Configuration Steps

To get started, select File Services, and click on Data Migration.

Click on Migrate Data to configure a migration job.

You’ll need to give it a name.

 

The next step is to select the Source. If you already have a NAS source configured, you’ll see it here. Otherwise you can register a Source.

Click on the arrow to expand the registered NAS mount points.

Select the mount point you’d like to use.

Once you’ve selected the mount point, click on Add.

You then need to select the Storage Domain (formerly known as a ViewBox) to store the archived data on.

You’ll need to provide a name, and configure schedule options.

You can also configure advanced settings, including QoS and exclusions. Once you’re happy, click on Migrate and the job will be created.

You can then run the job immediately, or wait for the schedule to kick in.

 

Other Things To Consider

You’ll need to think about your anti-virus options as well. You can register external anti-virus software or install the anti-virus app from the Cohesity Marketplace.

 

Thoughts And Further Reading

Cohesity have long positioned their secondary storage solution as something more than just a backup and recovery solution. There’s some debate about the difference between storage management and data management, but Cohesity seem to have done a good job of introducing yet another feature that can help users easily move data from their primary storage to their secondary storage environment. Plenty of backup solutions have positioned themselves as archive solutions, but many have been focused on moving protection data, rather than primary data from the source. You’ll need to do some careful planning around sizing your environment, as there’s always a chance that an end user will turn up and start accessing files that you thought were stale. And I can’t say with 100% certainty that this solution will transparently work with every line of business application in your environment. But considering it’s aimed at SMB and NFS shares, it looks like it does what it says on the tin, and moves data from one spot to another.

You can read more about the new features in Cohesity DataPlatform 6.4 (Pegasus) on the Cohesity site, and Blocks & Files covered the feature here. Alastair also shared some thoughts on the feature here.

Random Short Take #24

Want some news? In a shorter format? And a little bit random? This listicle might be for you. Welcome to #24 – The Kobe Edition (not a lot of passing, but still entertaining). 8 articles too. Which one was your favourite Kobe? 8 or 24?

  • I wrote an article about how architecture matters years ago. It’s nothing to do with this one from Preston, but he makes some great points about the importance of architecture when looking to protect your public cloud workloads.
  • Commvault GO 2019 was held recently, and Chin-Fah had some thoughts on where Commvault’s at. You can read all about that here. Speaking of Commvault, Keith had some thoughts as well, and you can check them out here.
  • Still on data protection, Alastair posted this article a little while ago about using the Cohesity API for reporting.
  • Cade just posted a great article on using the right transport mode in Veeam Backup & Replication. Goes to show he’s not just a pretty face.
  • VMware vFORUM is coming up in November. I’ll be making the trip down to Sydney to help out with some VMUG stuff. You can find out more here, and register here.
  • Speaking of VMUG, Angelo put together a great 7-part series on VMUG chapter leadership and tips for running successful meetings. You can read part 7 here.
  • This is a great article on managing Rubrik users from the CLI from Frederic Lhoest.
  • Are you into Splunk? And Pure Storage? Vaughn has you covered with an overview of Splunk SmartStore on Pure Storage here.

Clumio’s DPaaS Is All SaaS

I recently had the chance to speak to Clumio’s Head of Product Marketing, Steve Siegel, about what they do, and thought I’d share a few notes here.

 

Clumio?

Clumio have raised $51M+ in Series A and B funding. They were founded in 2017, built on public cloud technology, and came out of stealth in August.

 

The Approach

Clumio want to be able to deliver a data management platform in the cloud. The first real opportunity they identified was Backup as a Service. The feeling was that there were too many backup models across private cloud, public cloud, and Software as a Service (SaaS), and none of them were particularly easy to take advantage of in an effective manner. This can be a real problem when you’re looking to protect critical information assets.

 

Proper SaaS

The answer, as far as Clumio were concerned, was to develop an “authentic SaaS” offering. This offering provides all of the features you’d expect from a SaaS-based DPaaS (yes, we’re now officially in acronym hell), including:

  • On-demand scalability
  • Ease of management
  • Predictable costs
  • Global compliance
  • Always-on security – with data encrypted in-flight and at-rest

The platform is mainly built on AWS at this stage, but there are plans in place to leverage the other hyperscalers in the future. Clumio charge per VM, with the subscription fee including support. They have plans to improve capabilities, with:

  • AWS support in Dec 2019
  • O365 support in Q1 2020

They currently support the following public cloud workloads:

  • VMware Cloud on AWS; and
  • AWS – extending backup and recovery to support EBS and EC2 workloads (RDS to follow soon after)

 

Thoughts and Further Reading

If you’re a regular reader of this blog, you’ll notice that I’ve done a bit with data protection technologies over the years. From the big enterprise software shops to the “next-generation” data protection providers, as well as the consumer-side stuff and the “as a Service crowd”. There are a bunch of different ways to do data protection, and some work better than others. Clumio feel strongly that the “[s]implicity of SaaS is winning”, and there’s definitely an argument to be made that the simplicity of the approach is a big reason why the likes of Clumio will receive some significant attention from the marketplace.

That said, the success of services is ultimately determined by a few factors. In my opinion, a big part of what’s important when evaluating these types of services is whether they can technically service the requirements you have. If you’re an HP-UX shop running a bunch of on-premises tin, you might find that this type of service isn’t going to be much use. And if you’re using a cloud-based service but don’t really have decent connectivity to said cloud, you’re going to have a tough time getting your data back when something goes wrong. But that’s not all there is to it. You also need to look at how much it’s going to cost you to consume the service, and think about what it’s going to cost when something goes wrong. It’s all well and good if your daily ingress charges are relatively low with AWS, but if you need to get a bunch of data back out in a hurry, you might find it’s not a great experience, financially speaking. There are a bunch of factors that will impact this though, so you really need to do some modelling before you go down that path.
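The “do some modelling” point is worth taking literally. Even a crude calculation shows how asymmetric the economics can be (the per-GB rates below are illustrative placeholders, not any provider’s actual pricing):

```python
# Illustrative cloud data-transfer cost model. The per-GB rates are made-up
# placeholders; substitute your provider's current pricing and tiering.

INGRESS_PER_GB = 0.00  # ingest into the cloud is commonly free
EGRESS_PER_GB = 0.09   # assumed flat egress rate; real pricing is tiered

def transfer_cost(gb_in, gb_out):
    """Dollar cost of moving data in and out of the cloud."""
    return gb_in * INGRESS_PER_GB + gb_out * EGRESS_PER_GB

# A year of daily 50GB backup ingest costs nothing to send up...
yearly_ingest = transfer_cost(gb_in=50 * 365, gb_out=0)

# ...but one full 20TB restore after an incident is a different story.
disaster_restore = transfer_cost(gb_in=0, gb_out=20_000)

print(f"Yearly ingest: ${yearly_ingest:.2f}, one big restore: ${disaster_restore:.2f}")
```

The restore scenario is the one to model carefully: the day you need all your data back in a hurry is exactly the day the egress bill arrives.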

I’m a big fan of SaaS offerings when they’re done well, and I hope Clumio continue to innovate in the future and expand their support for workloads and infrastructure topologies. They’ve picked up a few customers, and are hiring smart people. You can read more about them over at Blocks & Files, and Ken Nalbone also covered them over at Gestalt IT.

Pure//Accelerate 2019 – Cloud Block Store for AWS

Disclaimer: I recently attended Pure//Accelerate 2019.  My flights, accommodation, and conference pass were paid for by Pure Storage. There is no requirement for me to blog about any of the content presented and I am not compensated by Pure Storage for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Cloud Block Store for AWS from Pure Storage has been around for a little while now. I had the opportunity to hear about it in more depth at the Storage Field Day Exclusive event at Pure//Accelerate 2019 and thought I’d share some thoughts here. You can grab a copy of my rough notes from the session here, and video from the session is available here.

 

Cloud Vision

Pure Storage have been focused on making everything related to their products effortless from day 1. An example of this approach is the FlashArray setup process – it’s really easy to get up and running and serving up storage to workloads. They wanted to do the same thing with anything they deliver via cloud services as well. There is, however, something of a “cloud divide” in operation in the industry. If you’re familiar with the various cloud deployment options, you’ll likely be aware that on-premises and hosted cloud is a bit different to public cloud. The two sides of that divide:

  • Deliver different application architectures;
  • Deliver different management and consumption experiences; and
  • Use different storage.

So what if Pure could build application portability and deliver common shared data services?

Pure have architected their cloud service to leverage what they call “Three Pillars”:

  • Build Your Cloud
  • Run anywhere
  • Protect everywhere

 

What Is It?

So what exactly is Cloud Block Store for AWS then? Well, imagine if you will, that you’re watching an episode of Pimp My Ride, and Xzibit is talking to an enterprise punter about how he or she likes cloud, and how he or she likes the way Pure Storage’s FlashArray works. And then X says, “Hey, we heard you liked these two things so we put this thing in the other thing”. Look, I don’t know the exact situation where this would happen. But anyway …

  • 100% software – deploys instantly as a virtual appliance in the cloud, runs only as long as you need it;
  • Efficient – deduplication, compression, and thin provisioning deliver capacity and performance economically;
  • Hybrid – easily migrate data bidirectionally, delivering data portability and protection across your hybrid cloud;
  • Consistent APIs – developers connect to storage the same way on-premises and in the cloud. Automated deployment with Cloud Formation templates;
  • Reliable, secure – delivers industrial-strength performance, reliability & protection with Multi-AZ HA, NDU, instant snaps and data-at-rest encryption; and
  • Flexible – pay as you go consumption model to best match your needs for production and development.

[image courtesy of Pure Storage]
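On the “Consistent APIs” point, deployment is driven by a CloudFormation template. As a rough sketch of what automating that might look like, here’s how you could assemble the arguments for a `create_stack` call. The stack name, template URL, and parameter names are all hypothetical placeholders; the real template and its parameters come from Pure Storage.

```python
# Sketch: building the arguments for a CloudFormation create_stack
# call to deploy Cloud Block Store. The stack name, template URL,
# and parameter names are hypothetical placeholders -- the real
# template and its parameters come from Pure Storage.

def cbs_stack_args(stack_name, template_url, params):
    """Assemble the keyword arguments you would pass to a
    CloudFormation client's create_stack call (e.g. via boto3)."""
    return {
        "StackName": stack_name,
        "TemplateURL": template_url,
        "Parameters": [
            {"ParameterKey": k, "ParameterValue": v}
            for k, v in sorted(params.items())
        ],
    }

args = cbs_stack_args(
    "cbs-demo",
    "https://example.com/cbs-template.yaml",  # placeholder URL
    {"LicenseKey": "XXXX", "AvailabilityZone": "ap-southeast-2a"},
)
```

Keeping the argument assembly separate from the API call makes the deployment easy to inspect and test before anything is actually provisioned.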

Architecture

At the heart of it, the architecture for CBS is not dissimilar to the FlashArray architecture. There are controllers, drives, NVRAM, and a virtual shelf.

  • EC2: CBS Controllers
  • EC2: Virtual Drives
  • Virtual Shelf: 7 Virtual drives in Spread Placement Group
  • EBS IO1: NVRAM, Write Buffer (7 total)
  • S3: Durable persistent storage
  • Instance Store: Non-Persistent Read Mirror

[image courtesy of Pure Storage]

What’s interesting, to me at least, is how they use S3 for persistent storage.

Procurement

How do you procure CBS for AWS? I’m glad you asked. There are two procurement options.

A – Pure as-a-Service

  • Offered via SLED / CLED process
  • Minimum 100 TiB effective used capacity
  • Unified hybrid contracts (on-premises and CBS, or CBS only)
  • 1 year to 3 year contracts

B – AWS Marketplace

  • Direct to customer
  • Minimum 10 TiB effective used capacity
  • CBS only
  • Month to month contract or 1 year contract

 

Use Cases

There’s a raft of different use cases for CBS. Some of them made sense to me straight away, some of them took a little time to bounce around in my head.

Disaster Recovery

  • Production instance on-premises
  • Replicate data to public cloud
  • Fail over in DR event
  • Fail back and recover

Lift and shift

  • Production instance on-premises
  • Replicate data to public cloud
  • Run the same architecture as before
  • Run production on CBS

Dev / Test

  • Replicate data to public cloud
  • Instantiate test / dev instances in public cloud
  • Refresh test / dev periodically
  • Bring changes back on-premises
  • Snapshots are more costly and slower to restore in native AWS

ActiveCluster

  • HA within an availability zone and / or across availability zones in an AWS region (ActiveCluster needs <11ms latency)
  • No downtime when a Cloud Block Store Instance goes away or there is a zone outage
  • Pure1 Cloud Mediator Witness (simple to manage and deploy)

Migrating VMware Environments

VMware Challenges

  • AWS does not recognise VMFS
  • Replicating volumes with VMFS will not do any good

Workaround

  • Convert the VMFS datastore into vVols
  • Now each volume has the Guest VM’s file system (NTFS, EXT3, etc.)
  • Replicate the VMDK vVols to CBS
  • Now the volumes can be mounted to EC2 instances with a matching OS

Note: This is for the VM’s data volumes. The VM boot volume will not be usable in AWS. The VM’s application will need to be redeployed in native AWS EC2.

VMware Cloud

VMware Challenges

  • VMware Cloud on AWS does not support external storage; it only supports vSAN

Workaround

  • Connect Guest VMs directly to CBS via iSCSI

Note: I haven’t verified this myself, and I suspect there may be other ways to do this. But in the context of Pure’s offering, it makes sense.
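For what it’s worth, the guest-attached iSCSI workaround would boil down to standard open-iscsi steps inside the Linux guest. Here’s a sketch that builds the discovery and login commands; the portal IP and IQN are made-up placeholders, and the real values would come from the Cloud Block Store array.

```python
# Sketch of the iscsiadm steps a Linux guest VM would run to attach
# a CBS volume directly over iSCSI (the workaround above). The
# portal IP and target IQN are made-up placeholders -- real values
# come from the Cloud Block Store array.

def iscsi_login_commands(portal_ip, target_iqn):
    """Return the discovery and login commands as strings, so the
    sequence is easy to inspect (or feed to subprocess later)."""
    return [
        f"iscsiadm -m discovery -t sendtargets -p {portal_ip}:3260",
        f"iscsiadm -m node -T {target_iqn} -p {portal_ip}:3260 --login",
    ]

cmds = iscsi_login_commands(
    "10.0.1.50",                                       # placeholder portal
    "iqn.2010-06.com.purestorage:flasharray.example",  # placeholder IQN
)
```

You’d still want multipathing and sensible timeouts configured in the guest, but the basic attach is no different to connecting to any other iSCSI target.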

 

Thoughts and Further Reading

There’s been a feeling in some parts of the industry for the last 5-10 years that the rise of the public cloud providers would spell the death of the traditional storage vendor. That’s clearly not been the case, but it has been interesting to see the major storage slingers evolving their product strategies to both accommodate and leverage the cloud providers in a more effective manner. Some have used the opportunity to get themselves as close as possible to the cloud providers, without actually being in the cloud. Others have deployed virtualised versions of their offerings inside public cloud and offered users the comfort of their traditional stack, but off-premises. There’s value in these approaches, for sure. But I like the way that Pure have taken it a step further and optimised their architecture to leverage some of the features of what AWS can offer from a cloud hardware perspective.

In my opinion, the main reason you’d look to leverage something like CBS on AWS is if you have an existing investment in Pure and want to keep doing things a certain way. You’re also likely using a lot of traditional VMs in AWS and want something that can improve the performance and resilience of those workloads. CBS is certainly a great way to do this. If you’re already running a raft of cloud-native applications, it’s likely that you don’t necessarily need the features on offer from CBS, as you’re already (hopefully) using them natively. I think Pure understand this though, and aren’t pushing CBS for AWS as the silver bullet for every cloud workload.

I’m looking forward to seeing what the market uptake on this product is like. I’m also keen to crunch the numbers on running this type of solution versus the cost associated with doing something on-premises or via other means. In any case, I’m looking forward to seeing how this capability evolves over time, and I think CBS on AWS is definitely worthy of further consideration.