The announcement comes in two parts, with the first being that Pure Storage and Cohesity are forming a strategic partnership. The idea behind this is that, together, the companies will deliver “industry-leading storage innovations from Pure Storage with modern, flash-optimised backup from Cohesity”. There are plenty of things in common between the companies, including the fact that they’re both, as Wiborg puts it, “keenly focused on doing the right thing for the customer”.
Pure FlashRecover Powered By Cohesity
Partnerships are exciting and all, but what was of more interest was the Pure FlashRecover announcement. What is it exactly? It’s basically Cohesity DataProtect running on Cohesity-certified compute nodes (the whitebox gear you might be familiar with if you’ve bought Cohesity tin previously), using Pure’s FlashBlades as the storage backend.
[image courtesy of Pure Storage]
FlashRecover has a targeted general availability for Q4 CY2020 (October). It will be released in the US initially, with other regions to follow. From a go to market perspective, Pure will handle level 1 and level 2 support, with Cohesity support being engaged for escalations. Cohesity DataProtect will be added to the Pure price list, and Pure becomes a Cohesity Technology Partner.
My first thought when I heard about this was why would you? I’ve traditionally associated scalable data protection and secondary storage with slower, high-capacity appliances. But as we talked through the use cases, it started to make sense. FlashBlades by themselves aren’t super high capacity devices, but neither are the individual nodes in Cohesity appliances. String a few together and you have enough capacity to do data protection and fast recovery in a predictable fashion. FlashBlade supports 75 nodes (I think) [Edit: it scales up to 150x 52TB nodes. Thanks for the clarification from Andrew Miller] and up to 1PB of data in a single namespace. Throw in some of the capabilities that Cohesity DataProtect brings to the table and you’ve got an interesting solution. The knock on some of the next-generation data protection solutions has been that recovery can still be quite time-consuming. The use of all-flash takes away a lot of that pain, especially when coupled with a solution like FlashBlade that delivers some pretty decent parallelism in terms of getting data recovered back to production quickly.
An evolving use case for protection data is data reuse. For years, application owners have been stuck with fairly clunky ways of getting test data into environments to use with application development and testing. Solutions like FlashRecover provide a compelling story around protection data being made available for reuse, not just recovery. Another cool thing is that when you invest in FlashBlade, you’re not locking yourself into a particular silo, you can use the FlashBlade solution for other things too.
I don’t work with Pure Storage and Cohesity on a daily basis anymore, but in my previous role I had the opportunity to kick the tyres extensively with both the Cohesity DataProtect solution and the Pure Storage FlashBlade. I’m an advocate of both of these companies because of the great support I received from both companies from pre-sales through to post-sales support. They are relentlessly customer focused, and that really translates in both the technology and the field experience. I can’t speak highly enough of the engagement I’ve experienced with both companies, from both a blogger’s experience, and as an end user.
FlashRecover isn’t going to be appropriate for every organisation. Most places, at the moment, can probably still get away with taking a little time to recover large amounts of data if required. But for industries where time is money, solutions like FlashRecover can absolutely make sense. If you’d like to know more, there’s a comprehensive blog post over at the Pure Storage website, and the solution brief can be found here.
I’ve been doing some testing in the lab recently. The focus of this testing has been primarily on Pure Storage’s ObjectEngine and its associated infrastructure. As part of that, I’ve been doing various things with Veeam Backup & Replication 9.5 Update 4, including setting up a FlashBlade NFS repository. I’ve documented the process in a document here. One thing that I thought worthy of noting separately was the firewall requirements. For my Linux Mount Server, I used a CentOS 7 VM, configured with 8 vCPUs and 16GB of RAM. I know, I normally use Debian, but for some reason (that I didn’t have time to investigate) it kept dying every time I kicked off a backup job.
In any case, I set everything up as per Pure’s instructions, but kept getting timeout errors on the job. The error I got was “5/17/2019 10:03:47 AM :: Processing HOST-01 Error: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond NFSMOUNTHOST:2500“. It felt like it was probably a firewall issue of some sort. I tried to make an exception on the Windows VM hosting the Veeam Backup server, but that didn’t help. The problem was with the Linux VM’s firewall. I used the instructions I found here to add in some custom rules. According to the Veeam documentation, Backup Repository access uses TCP ports 2500 – 5000. Your SecOps people will no doubt have a conniption, but here’s how to open those ports on CentOS.
Firstly, is the firewall running?
[danf@nfsmounthost ~]$ sudo firewall-cmd --state[sudo] password for danf:running
Yes it is. So let’s stop it to see if this line of troubleshooting is worth pursuing.
Remember, it’s never the storage. It’s always the firewall. Also, keep in my mind this article is about the how. I’m not offering my opinion about whether it’s really a good idea to configure your host-based firewalls with more holes than Swiss cheese. Or whatever things have lots of holes in them.