Cohesity DataProtect Delivered As A Service – SaaS Connector

I recently wrote about my experience with Cohesity DataProtect Delivered as a Service. One thing I didn’t really go into in that article was the networking and resource requirements for the SaaS Connector deployment. It’s nothing earth-shattering, but I thought it was worthwhile noting nonetheless.

In terms of the VM that you deploy for each SaaS Connector, it has the following system requirements:

  • 4 CPUs
  • 10 GB RAM
  • 20 GB disk space (100 MB throughput, 100 IOPs)
  • Outbound Internet connection

In terms of scaleability, the advice from Cohesity at the time of writing is to deploy “one SaaS Connector for each 160 VMs or 16 TB of source data. If you have more data, we recommend that you stagger their first full backups”. Note that this is subject to change. The outbound Internet connectivity is important. You’ll (hopefully) have some kind of firewall in place, so the following ports need to be open.

Port
Protocol
Target
Direction (from Connector)
Purpose

443

TCP

helios.cohesity.com

Outgoing

Connection used for control path

443

TCP

helios-data.cohesity.com

Outgoing

Used to send telemetry data

22, 443

TCP

rt.cohesity.com

Outgoing

Support channel

11117

TCP

*.dmaas.helios.cohesity.com

Outgoing

Connection used for data path

29991

TCP

*.dmaas.helios.cohesity.com

Outgoing

Connection used for data path

443

TCP

*.cloudfront.net

Outgoing

To download upgrade packages

443

TCP

*.amazonaws.com

Outgoing

For S3 data traffic

123, 323

UDP

ntp.google.com or internal NTP

Outgoing

Clock sync

53

TCP & UDP

8.8.8.8 or internal DNS

Bidirectional

Host resolution

Cohesity recommends that you deploy more than one SaaS Connector, and you can scale them out depending on the number of VMs / how much data you’re protecting with the service.

If you’re having concerns with bandwidth, you can configure the bandwidth used by the SaaS Connector via Helios.

Navigate to Settings -> SaaS Connections and click on Bandwidth Usage Options. You can then add a rule.

You then schedule bandwidth usage, potentially for quiet times (particularly useful in small environments where Internet connections may be shared with end users). There’s support for upload and download traffic, and multiple schedules as well.

And that’s pretty much it. Once you have your SaaS Connectors deployed you can monitor everything from Helios.

 

Cohesity DataProtect Delivered As A Service – A Few Notes

As part of a recent vExpert giveaway the folks at Cohesity gave me a 30-day trial of the Cohesity DataProtect Delivered as a Service offering. This is a component of Cohesity’s Data Management as a Service (DMaaS) offering and, despite the slightly unwieldy name, it’s a pretty neat solution. I want to be clear that it’s been a little while since I had any real stick time with Cohesity’s DataProtect offering, and I’m looking at this in a friend’s home lab, so I’m making no comments or assertions regarding the performance of the service. I’d also like to be clear that I’m not making any recommendation one way or another with regards to the suitability of this service for your organisation. Every organisation has its own requirements and it’s up to you to determine whether this is the right thing for you.

 

Overview

I’ve added a longer article here that explains the setup process in more depth, but here’s the upshot of what you need to do to get up and running. In short, you sign up, select the region you want to backup workloads to, configure your SaaS Connectors for the particular workloads you’d like to protect, and then go nuts. It’s really pretty simple.

Workloads

In terms of supported workloads, the following environments are currently supported:

  • Hypervisors (VMware and Hyper-V);
  • NAS (generic SMB and NFS, Isilon, and NetApp);
  • Microsoft SQL Server;
  • Oracle;
  • Microsoft 365;
  • Amazon AWS; and
  • Physical hosts.

This list will obviously grow as some of the support for particular workloads with DataProtect and Helios improves over time.

Regions

The service is currently available in seven AWS Regions:

  • US East (Ohio)
  • US East (N. Virginia)
  • US West (Oregon)
  • US West (N. California)
  • Canada (Central)
  • Asia Pacific (Sydney)
  • Europe (Frankfurt)

You’ve got some flexibility in terms of where you store your data, but it’s my understanding that the telemetry data (i.e. Helios) goes to one of the US East Regions. It’s also important to note that once you’ve put data in a particular Region, you can’t then move that data to another Region.

Encryption

Data is encrypted in-flight and at rest, and you have a choice of KMS solutions (Cohesity-managed or DIY AWS KMS). Note that once you choose a KMS, you cannot change your mind. Well, you can, but you can’t do anything about it.

 

Thoughts

Data protection as a service offerings are proving increasingly popular with customers, data protection vendors, and service providers. The appeal for the punters is that they can apply some of the same thinking to protecting their investment in their cloud as they did to standing it up in the first place. The appeal for the vendors and SPs is that they can deliver service across a range of platforms without shipping tin anywhere, and build up annuity business as well.

With regards to this particular solution, it still has some rough edges, but it’s great to see just how much can already be achieved. As I mentioned, it’s been a while since I had some time with DataProtect, and some of the usability and functionality of both it and Helios has really come along in leaps and bounds. And the beauty of this being a vendor-delivered as a Service offering is that features can be rolled out on a frequent basis, rather than waiting for quarterly improvements to arrive via regularly scheduled software maintenance releases. Once you get your head around the workload, things tend to work as expected, and it was fairly simple to get everything setup and working in a short period of time.

This isn’t for everyone, obviously. If you’re not a fan of doing things in AWS, then you’re really not going to like how this works. And if you don’t operate near one of the currently supported Regions, then the tyranny of bandwidth (i.e. physics) may prevent reasonable recovery times from being achievable for you. It might seem a bit silly, but these are nonetheless things you need to consider when looking at adopting a service like this. It’s also important to think of the security posture of these kinds of services. Sure, things are encrypted, and you can use MFA with Helios, but folks outside the US sometimes don’t really dig the idea of any of their telemetry data living in the US. Sure, it’s a little bit tinfoil hat but it you’d be surprised how much it comes up. And it should be noted that this is the same for on-premises Cohesity solutions using Helios. Then again, Cohesity is by no means alone in sending telemetry data back for support and analysis purposes. It’s fairly common and something your infosec will likely already be across how to deal with it.

If you’re fine with that (and you probably should be), and looking to move away from protecting your data with on-premises solutions, or looking for something that gives you some flexible deployment and management options, this could be of interest. As I mentioned, the beauty of SaaS-based solutions is that they’re more frequently updated by the vendor with fixes and features. Plus you don’t need to do a lot of the heavy lifting in terms of care and feeding of the environment. You’ll also notice that this is the DataProtect component, and I imagine that Cohesity has plans to fill out the Data Management part of the solution more thoroughly in the future. If you’d like to try it for yourself, I believe there’s a trial you can sign up for. Finally, thanks to the Cohesity TAG folks for the vExpert giveaway and making this available to people like me.

Ransomware? More Like Ransom Everywhere …

Stupid title, but ransomware has been in the news quite a bit recently. I’ve had some tabs open in my browser for over twelve months with articles about ransomware that I found interesting. I thought it was time to share them and get this post out there. This isn’t comprehensive by any stretch, but rather it’s a list of a few things to look at when looking into anti-ransomware solutions, particularly for NAS environments.

 

It Kicked Him Right In The NAS

The way I see it (and I’m really not the world’s strongest security person), there are (at least) three approaches to NAS and ransomware concerns.

The Endpoint

This seems to be where most companies operate – addressing ransomware as it enters the organisation via the end users. There are a bunch of solutions out there that are designed to protect humans from themselves. But this approach doesn’t always help with alternative attack vectors and it’s only as good as the update processes you have in place to keep those endpoints updated. I’ve worked in a few shops where endpoint protection solutions were deployed and then inadvertently clobbered by system updates or users with too many privileges. The end result was that the systems didn’t do what they were meant to and there was much angst.

The NAS Itself

There are things you can do with NetApp solutions, for example, that are kind of interesting. Something like Stealthbits looks neat, and Varonis also uses FPolicy to get a similar result. Your mileage will vary with some of these solutions, and, again, it comes down to the ability to effectively ensure that these systems are doing what they say they will, when they will.

Data Protection

A number of the data protection vendors are talking about their ability to recover quickly from ransomware attacks. The capabilities vary, as they always do, but most of them have a solid handle on quick recovery once an infection is discovered. They can even help you discover that infection by analysing patterns in your data protection activities. For example, if a whole bunch of data changes overnight, it’s likely that you have a bit of a problem. But, some of the effectiveness of these solutions is limited by the frequency of data protection activity, and whether anyone is reading the alerts. The challenge here is that it’s a reactive approach, rather than something preventative. That said, companies like Rubrik are working hard to enhance its Radar capability into something a whole lot more interesting.

Other Things

Other things that can help limit your exposure to ransomware include adopting generally robust security practices across the board, monitoring all of your systems, and talking to your users about not clicking on unknown links in emails. Some of these things are easier to do than others.

 

Thoughts

I don’t think any of these solutions provide everything you need in isolation, but the challenge is going to be coming up with something that is supportable and, potentially, affordable. It would also be great if it works too. Ransomware is a problem, and becoming a bigger problem every day. I don’t want to sound like I’m selling you insurance, but it’s almost not a question of if, but when. But paying attention to some of the above points will help you on your way. Of course, sometimes Sod’s Law applies, and things will go badly for you no matter how well you think you’ve designed your systems. At that point, it’s going to be really important that you’ve setup your data protection systems correctly, otherwise you’re in for a tough time. Remember, it’s always worth thinking about what your data is worth to you when you’re evaluating the relative value of security and data protection solutions. This article from Chin-Fah had some interesting insights into the problem. And this article from Cohesity outlined a comprehensive approach to holistic cyber security. This article from Andrew over at Pure Storage did a great job of outlining some of the challenges faced by organisations when rolling out these systems. This list of NIST ransomware resources from Melissa is great. And if you’re looking for a useful resource on ransomware from VMware’s perspective, check out this site.

Random Short Take #49

Happy new year and welcome to Random Short Take #49. Not a great many players have worn 49 in the NBA (2 as it happens). It gets better soon, I assure you. Let’s get random.

  • Frederic has written a bunch of useful articles around useful Rubrik things. This one on setting up authentication to use Active Directory came in handy recently. I’ll be digging in to some of Rubrik’s multi-tenancy capabilities in the near future, so keep an eye out for that.
  • In more things Rubrik-related, this article by Joshua Stenhouse on fully automating Rubrik EDGE / AIR deployments was great.
  • Speaking of data protection, Chris Colotti wrote this useful article on changing the Cloud Director database IP address. You can check it out here.
  • You want more data protection news? How about this press release from BackupAssist talking about its partnership with Wasabi?
  • Fine, one more data protection article. Six backup and cloud storage tips from Backblaze.
  • Speaking of press releases, WekaIO has enjoyed some serious growth in the last year. Read more about that here.
  • I loved this article from Andrew Dauncey about things that go wrong and learning from mistakes. We’ve all likely got a story about something that went so spectacularly wrong that you only made that mistake once. Or twice at most. It also reminds me of those early days of automated ESX 2.5 builds and building magical installation CDs that would happily zap LUN 0 on FC arrays connected to new hosts. Fun times.
  • Finally, I was lucky enough to talk to Intel Senior Fellow Al Fazio about what’s happening with Optane, how it got to this point, and where it’s heading. You can read the article and check out the video here.

Pure Storage and Cohesity Announce Strategic Partnership and Pure FlashRecover

Pure Storage and Cohesity announced a strategic partnership and a new joint solution today. I had the opportunity to speak with Amy Fowler and Biswajit Mishra from Pure Storage, along with Anand Nadathur and Chris Wiborg from Cohesity, and thought I’d share my notes here.

 

Friends In The Market

The announcement comes in two parts, with the first being that Pure Storage and Cohesity are forming a strategic partnership. The idea behind this is that, together, the companies will deliver “industry-leading storage innovations from Pure Storage with modern, flash-optimised backup from Cohesity”.  There are plenty of things in common between the companies, including the fact that they’re both, as Wiborg puts it, “keenly focused on doing the right thing for the customer”.

 

Pure FlashRecover Powered By Cohesity

Partnerships are exciting and all, but what was of more interest was the Pure FlashRecover announcement. What is it exactly? It’s basically Cohesity DataProtect running on Cohesity-certified compute nodes (the whitebox gear you might be familiar with if you’ve bought Cohesity tin previously), using Pure’s FlashBlades as the storage backend.

[image courtesy of Pure Storage]

FlashRecover has a targeted general availability for Q4 CY2020 (October). It will be released in the US initially, with other regions to follow. From a go to market perspective, Pure will handle level 1 and level 2 support, with Cohesity support being engaged for escalations. Cohesity DataProtect will be added to the Pure price list, and Pure becomes a Cohesity Technology Partner.

 

Thoughts

My first thought when I heard about this was why would you? I’ve traditionally associated scalable data protection and secondary storage with slower, high-capacity appliances. But as we talked through the use cases, it started to make sense. FlashBlades by themselves aren’t super high capacity devices, but neither are the individual nodes in Cohesity appliances. String a few together and you have enough capacity to do data protection and fast recovery in a predictable fashion. FlashBlade supports 75 nodes (I think) [Edit: it scales up to 150x 52TB nodes. Thanks for the clarification from Andrew Miller] and up to 1PB of data in a single namespace. Throw in some of the capabilities that Cohesity DataProtect brings to the table and you’ve got an interesting solution. The knock on some of the next-generation data protection solutions has been that recovery can still be quite time-consuming. The use of all-flash takes away a lot of that pain, especially when coupled with a solution like FlashBlade that delivers some pretty decent parallelism in terms of getting data recovered back to production quickly.

An evolving use case for protection data is data reuse. For years, application owners have been stuck with fairly clunky ways of getting test data into environments to use with application development and testing. Solutions like FlashRecover provide a compelling story around protection data being made available for reuse, not just recovery. Another cool thing is that when you invest in FlashBlade, you’re not locking yourself into a particular silo, you can use the FlashBlade solution for other things too.

I don’t work with Pure Storage and Cohesity on a daily basis anymore, but in my previous role I had the opportunity to kick the tyres extensively with both the Cohesity DataProtect solution and the Pure Storage FlashBlade. I’m an advocate of both of these companies because of the great support I received from both companies from pre-sales through to post-sales support. They are relentlessly customer focused, and that really translates in both the technology and the field experience. I can’t speak highly enough of the engagement I’ve experienced with both companies, from both a blogger’s experience, and as an end user.

FlashRecover isn’t going to be appropriate for every organisation. Most places, at the moment, can probably still get away with taking a little time to recover large amounts of data if required. But for industries where time is money, solutions like FlashRecover can absolutely make sense. If you’d like to know more, there’s a comprehensive blog post over at the Pure Storage website, and the solution brief can be found here.

Random Short Take #30

Welcome to Random Short Take #30. You’d think 30 would be an easy choice, given how much I like Wardell Curry II, but for this one I’m giving a shout out to Rasheed Wallace instead. I’m a big fan of ‘Sheed. I hope you all enjoy these little trips down NBA memory lane. Here we go.

  • Veeam 10’s release is imminent. Anthony has been doing a bang up job covering some of the enhancements in the product. This article was particularly interesting because I work in a company selling Veeam and using vCloud Director.
  • Sticking with data protection, Curtis wrote an insightful article on backups and frequency.
  • If you’re in Europe or parts of the US (or can get there easily), like writing about technology, and you’re into cars and stuff, this offer from Cohesity could be right up your alley.
  • I was lucky enough to have a chat with Sheng Liang from Rancher Labs a few weeks ago about how it’s going in the market. I’m relatively Kubernetes illiterate, but it sounds like there’s a bit going on.
  • For something completely different, this article from Christian on Raspberry Pi, volumio and HiFiBerry was great. Thanks for the tip!
  • Spinning disk may be as dead as tape, if these numbers are anything to go by.
  • This was a great article from Matt Crape on home lab planning.
  • Speaking of home labs, Shanks posted an interesting article on what he has running. The custom-built rack is inspired.

Random Short Take #27

Welcome to my semi-regular, random news post in a short format. This is #27. You’d think it would be hard to keep naming them after basketball players, and it is. None of my favourite players ever wore 27, but Marvin Barnes did surface as a really interesting story, particularly when it comes to effective communication with colleagues. Happy holidays too, as I’m pretty sure this will be the last one of these posts I do this year. I’ll try and keep it short, as you’ve probably got stuff to do.

  • This story of serious failure on El Reg had me in stitches.
  • I really enjoyed this article by Raj Dutt (over at Cohesity’s blog) on recovery predictability. As an industry we talk an awful lot about speeds and feeds and supportability, but sometimes I think we forget about keeping it simple and making sure we can get our stuff back as we expect.
  • Speaking of data protection, I wrote some articles for Druva about, well, data protection and things of that nature. You can read them here.
  • There have been some pretty important CBT-related patches released by VMware recently. Anthony has provided a handy summary here.
  • Everything’s an opinion until people actually do it, but I thought this research on cloud adoption from Leaseweb USA was interesting. I didn’t expect to see everyone putting their hands up and saying they’re all in on public cloud, but I was also hopeful that we, as an industry, hadn’t made things as unclear as they seem to be. Yay, hybrid!
  • Site sponsor StorONE has partnered with Tech Data Global Computing Components to offer an All-Flash Array as a Service solution.
  • Backblaze has done a nice job of talking about data protection and cloud storage through the lens of Star Wars.
  • This tip on removing particular formatting in Microsoft Word documents really helped me out recently. Yes I know Word is awful.
  • Someone was nice enough to give me an acknowledgement for helping review a non-fiction book once. Now I’ve managed to get a character named after me in one of John Birmingham’s epics. You can read it out of context here. And if you’re into supporting good authors on Patreon – then check out JB’s page here. He’s a good egg, and his literary contributions to the world have been fantastic over the years. I don’t say this just because we live in the same city either.

Cohesity – NAS Data Migration Overview

Data Migration

Cohesity NAS Data Migration, part of SmartFiles, was recently announced as a generally available feature within the Cohesity DataPlatform 6.4 release (after being mentioned in the 6.3 release blog post). The idea behind it is that you can use the feature to perform the migration of NAS data from a primary source to the Cohesity DataPlatform. It is supported for NAS storage registered as SMB or NFS (so it doesn’t necessarily need to be a NAS appliance as such, it can also be a file share hosted somewhere).

 

What To Think About

There are a few things to think about when you configure your migration policy, including:

  • The last time the file was accessed;
  • Last time the file was modified; and
  • The size of the file.

You also need to think about how frequently you want to run the job. Finally, it’s worth considering which View you want the archived data to reside on.

 

What Happens?

When the data is migrated an SMB2 symbolic link is left in place of the file with the same name as the file and the original data is moved to the Cohesity View. Note that on Windows boxes, remote to remote symbolic links are disabled, so you need to run these commands:

C:\Windows\system32>fsutil behavior set SymlinkEvaluation R2R:1
C:\Windows\system32>fsutil behavior query SymlinkEvaluation

Once the data is migrated to the Cohesity cluster, subsequent read and write operations are performed on the Cohesity host. You can move data back to the environment by mounting the Cohesity target View on a Windows client, and copying it back to the NAS.

 

Configuration Steps

To get started, select File Services, and click on Data Migration.

Click on the Migrate Data to configure a migration job.

You’ll need to give it a name.

 

The next step is to select the Source. If you already have a NAS source configured, you’ll see it here. Otherwise you can register a Source.

Click on the arrow to expand the registered NAS mount points.

Select the mount point you’d like to use.

Once you’ve selected the mount point, click on Add.

You then need to select the Storage Domain (formerly known as a ViewBox) to store the archived data on.

You’ll need to provide a name, and configure schedule options.

You can also configure advanced settings, including QoS and exclusions. Once you’re happy, click on Migrate and the job will be created.

You can then run the job immediately, or wait for the schedule to kick in.

 

Other Things To Consider

You’ll need to think about your anti-virus options as well. You can register external anti-virus software or install the anti-virus app from the Cohesity Marketplace

 

Thoughts And Further Reading

Cohesity have long positioned their secondary storage solution as something more than just a backup and recovery solution. There’s some debate about the difference between storage management and data management, but Cohesity seem to have done a good job of introducing yet another feature that can help users easily move data from their primary storage to their secondary storage environment. Plenty of backup solutions have positioned themselves as archive solutions, but many have been focused on moving protection data, rather than primary data from the source. You’ll need to do some careful planning around sizing your environment, as there’s always a chance that an end user will turn up and start accessing files that you thought were stale. And I can’t say with 100% certainty that this solution will transparently work with every line of business application in your environment. But considering it’s aimed at SMB and NFS shares, it looks like it does what it says on the tin, and moves data from one spot to another.

You can read more about the new features in Cohesity DataPlatform 6.4 (Pegasus) on the Cohesity site, and Blocks & Files covered the feature here. Alastair also shared some thoughts on the feature here.

Random Short Take #24

Want some news? In a shorter format? And a little bit random? This listicle might be for you. Welcome to #24 – The Kobe Edition (not a lot of passing, but still entertaining). 8 articles too. Which one was your favourite Kobe? 8 or 24?

  • I wrote an article about how architecture matters years ago. It’s nothing to do with this one from Preston, but he makes some great points about the importance of architecture when looking to protect your public cloud workloads.
  • Commvault GO 2019 was held recently, and Chin-Fah had some thoughts on where Commvault’s at. You can read all about that here. Speaking of Commvault, Keith had some thoughts as well, and you can check them out here.
  • Still on data protection, Alastair posted this article a little while ago about using the Cohesity API for reporting.
  • Cade just posted a great article on using the right transport mode in Veeam Backup & Replication. Goes to show he’s not just a pretty face.
  • VMware vFORUM is coming up in November. I’ll be making the trip down to Sydney to help out with some VMUG stuff. You can find out more here, and register here.
  • Speaking of VMUG, Angelo put together a great 7-part series on VMUG chapter leadership and tips for running successful meetings. You can read part 7 here.
  • This is a great article on managing Rubrik users from the CLI from Frederic Lhoest.
  • Are you into Splunk? And Pure Storage? Vaughn has you covered with an overview of Splunk SmartStore on Pure Storage here.

Random Short Take #18

Here are some links to some random news items and other content that I recently found interesting. You might find them interesting too. Episode 18 – buckle up kids! It’s all happening.

  • Cohesity added support for Active Directory protection with version 6.3 of the DataPlatform. Matt covered it pretty comprehensively here.
  • Speaking of Cohesity, Alastair wrote this article on getting started with the Cohesity PowerShell Module.
  • In keeping with the data protection theme (hey, it’s what I’m into), here’s a great article from W. Curtis Preston on SaaS data protection, and what you need to consider to not become another cautionary tale on the Internet. Curtis has written a lot about data protection over the years, and you could do a lot worse than reading what he has to say. And that’s not just because he signed a book for me.
  • Did you ever stop and think just how insecure some of the things that you put your money into are? It’s a little scary. Shell are doing some stuff with Cybera to improve things. Read more about that here.
  • I used to work with Vincent, and he’s a super smart guy. I’ve been at him for years to start blogging, and he’s started to put out some articles. He’s very good at taking complex topics and distilling them down to something that’s easy to understand. Here’s his summary of VMware vRealize Automation configuration.
  • Tom’s take on some recent CloudFlare outages makes for good reading.
  • Google Cloud has announced it’s acquiring Elastifile. That part of the business doesn’t seem to be as brutal as the broader Alphabet group when it comes to acquiring and discarding companies, and I’m hoping that the good folks at Elastifile are looked after. You can read more on that here.
  • A lot of people are getting upset with terms like “disaggregated HCI”. Chris Mellor does a bang up job explaining the differences between the various architectures here. It’s my belief that there’s a place for all of this, and assuming that one architecture will suit every situation is a little naive. But what do I know?