I’ve covered the Cohesity appliance deployment in a howto article previously. I’ve also made use of the VMware-compatible Virtual Edition in our lab to test things like cluster to cluster replication and cloud tiering. The benefits of virtual appliances are numerous. They’re generally easy to deploy, don’t need dedicated hardware, can be re-deployed quickly when you break something, and can be a quick and easy way to validate a particular process or idea. They can also be a problem with regards to performance, and are at the mercy of the platform administrator to a point. But aren’t we all? With 6.1, Cohesity have made available a clustered virtual edition (the snappily titled Cohesity Cluster Virtual Edition ESXi). If you have access to the documentation section of the Cohesity support site, there’s a PDF you can download that explains everything. I won’t go into too much detail but there are a few things to consider before you get started.
Just like the non-clustered virtual edition, there’s a small and large configuration you can choose from. The small configuration supports up to 8TB for the Data disk, while the large configuration supports up to 16TB for the Data disk. The small config supports 4 vCPUs and 16GB of memory, while the large configuration supports 8 vCPUs and 32GB of memory.
Once you’ve deployed the appliance, you’ll need to add the Metadata disk and Data disk to each VM. The Metadata disk should be between 512GB and 1TB. For the large configuration, you can also apparently configure 2x 512GB disks, but I haven’t tried this. The Data disk needs to be between 512GB and 8TB for the small configuration and up to 16TB for the large configuration (with support for 2x 8TB disks). Cohesity recommends that these are formatted as Thick Provision Lazy Zeroed and deployed in Independent – Persistent mode. Each disk should be attached to its own SCSI controller as well, so you’ll have the system disk on SCSI 0:0, the Metadata disk on SCSI 1:0, and so on.
I did discover a weird issue when deploying the appliance on a Pure Storage FA-450 array in the lab. In vSphere this particular array’s datastore type is identified by vCenter as “Flash”. For my testing I had a 512GB Metadata disk and 3TB Data disk configured on the same datastore, with the three nodes living on three different datastores on the FlashArray. This caused errors with the cluster configuration, with the configuration wizard complaining that my SSD volumes were too big.
I moved the Data disk (with storage vMotion) to an all flash Nimble array (that for some reason was identified by vSphere as “HDD”) and the problem disappeared. Interestingly I didn’t have this problem with the single node configuration of 6.0.1 deployed with the same configuration. I raised a ticket with Cohesity support and they got back to me stating that this was expected behaviour in 6.1.0a. They tell me, however, that they’ve modified the behaviour of the configuration routine in an upcoming version so fools like me can run virtualised secondary storage on primary storage.
You can configure the appliance for increased resiliency at the Storage Domain level as well. If you go to Platform – Cluster – Storage Domains you can modify the DefaultStorageDomain (and other ones that you may have created). Depending on the size of the cluster you’ve deployed, you can choose the number of failures to tolerate and whether or not you want erasure coding enabled.
You can also decide whether you want EC to be a post-process activity or something that happens inline.
Once you’ve deployed (a minimum) 3 copies of the Clustered VE, you’ll need to manually add Metadata and Data disks to each VM. The specifications for these are listed above. Fire up the VMs and go to the IP of one of the nodes. You’ll need to log in as the admin user with the appropriate password and you can then start the cluster configuration.
This bit is pretty much the same as any Cohesity cluster deployment, and you’ll need to specify things like a hostname for the cluster partition. As always, it’s a good idea to ensure your DNS records are up to date. You can get away with using IP addresses but, frankly, people will talk about you behind your back if you do.
At this point you can also decide to enable encryption at the cluster level. If you decide not to enable it you can do this on a per Domain basis later.
Click on Create Cluster and you should see something like the following screen.
Once the cluster is created, you can hit the virtual IP you’ve configured, or any one of the attached nodes, to log in to the cluster. Once you log in, you’ll need to agree to the EULA and enter a license key.
The availability of virtual appliance versions for storage and data protection solutions isn’t a new idea, but it’s certainly one I’m a big fan of. These things give me an opportunity to test new code releases in a controlled environment before pushing updates into my production environment. It can help with validating different replication topologies quickly, and validating other configuration ideas before putting them into the wild (or in front of customers). Of course, the performance may not be up to scratch for some larger environments, but for smaller deployments and edge or remote office solutions, you’re only limited by the available host resources (which can be substantial in a lot of cases). The addition of a clustered version of the virtual edition for ESXi and Hyper-V is a welcome sight for those of us still deploying on-premises Cohesity solutions (I think the Azure version has been clustered for a few revisions now). It gets around the main issue of resiliency by having multiple copies running, and can also address some of the performance concerns associated with running virtual versions of the appliance. There are a number of reasons why it may not be the right solution for you, and you should work with your Cohesity team to size any solution to fit your environment. But if you’re running Cohesity in your environment already, talk to your account team about how you can leverage the virtual edition. It really is pretty neat. I’ll be looking into the resiliency of the solution in the near future and will hopefully be able to post my findings in the next few weeks.
Here are a few links to some random news items and other content that I found interesting. You might find it interesting too. Maybe.
- This case modification by Ben Hoad really is the business.
- All the cool kids are adding support for Mellanox’s BlueField SmartNIC, including the likes of Excelero and E8 Storage.
- Quobyte Inc’s Data Center File System is now available on Google Cloud Platform. You can read the press release here.
- Say what you will about Gartner’s Magic Quadrant, folks in the industry pay attention to this kind of stuff. Nutanix recently sent me some comms letting me know they’re in a good spot in the latest MQ. If you can get hold of the report, and you’re into HCI, it makes for interesting reading.
- AWS re:Invent was last week in Vegas, and Anthony presented on what Veeam have been up to. Check out his post on what they’ve been doing in the AWS ecosystem, and other cool stuff.
- Speaking of AWS, Ather Beg posted some useful summary articles of the major announcements coming out of re:Invent. You can read them here and here.
- Vembu are taking up to 40% off the price of some of their products until Christmas.
Eric Siebert has opened up voting for the Top vBlog 2018. I’m listed on the vLaunchpad and you can vote for me under storage and independent blog categories as well. There are a bunch of great blogs listed on Eric’s vLaunchpad, so if nothing else you may discover someone you haven’t heard of before, and chances are they’ll have something to say that’s worth checking out. If this stuff seems a bit needy, it is. But it’s also nice to have people actually acknowledging what you’re doing. I’m hoping that people find this blog useful, because it really is a labour of love (random vendor t-shirts notwithstanding).
- Instant recovery for Oracle databases;
- NAS Direct Archive to protect massive unstructured data sets;
- Microsoft Office 365 support via Polaris SaaS Platform;
- SAP-certified protection for SAP HANA;
- Policy-driven protection for Epic EHR; and
- Rubrik works with Rubrik Datos IO to protect NoSQL databases.
New Features and Enhancements
As you can see from the list above, there’s a bunch of new features and enhancements. I’ll try and break down a few of these in the section below.
Rubrik have had some level of capability with Oracle protection for a little while now, but things are starting to hot up with 5.0.
- Simplified configuration (Oracle Auto Protection and Live Mount, Oracle Granular SLA Policy Assignments, and Oracle Automated Instance and Database Discovery)
- Orchestration of operational and PiT recoveries
- Increased control for DBAs
NAS Direct Archive
People have lots of data now. Like, a real lot. I don’t know how many Libraries of Congress exactly, but it can be a lot. Previously, you’d have to buy a bunch of Briks to store this data. Rubrik have recognised that this can be a bit of a problem in terms of footprint. With NAS Direct Archive, you can send the data to an “archive” target of your choice. So now you can protect a big chunk of data that goes through the Rubrik environment to end target such as object storage, public cloud, or NFS. The idea is to reduce the amount of Rubrik devices you need to buy. Which seems a bit weird, but their customers will be pretty happy to spend their money elsewhere.
[image courtesy of Rubrik]
It’s simple to get going, requiring a tick of a box to be configured. The metadata remains protected with the Rubrik cluster, and the good news is that nothing changes from the end user recovery experience.
Elastic App Service (EAS)
Rubrik now provides the ability to ingest DBs across a wider spectrum, allowing you to protect more of the DB-based applications you want, not just SQL and Oracle workloads.
SAP HANA Protection
I’m not really into SAP HANA, but plenty of organisations are. Rubrik now offer a SAP Certified Solution which, if you’ve had the misfortune of trying to protect SAP workloads before, is kind of a neat feature.
[image courtesy of Rubrik]
SQL Server Enhancements
There have been some nice enhancements with SQL Server protection, including:
- A Change Block Tracking (CBT) filter driver to decrease backup windows; and
- Support for group Volume Shadow Copy Service (VSS) snapshots.
So what about Group Backups? The nice thing about these is that you can protect many databases on the same SQL Server. Rather than process each VSS Snapshot individually, Rubrik will group the databases that belong to the same SLA Domain and process the snapshots as a batch group. There are a few benefits to this approach:
- It reduces SQL Server overhead, as well as decreases the amount of time a backup requires to be completed; and
- In turn, allowing customers to take more frequent backups of their databases delivering a lower RPO to the business.
Rubrik have done vSphere things since forever, and this release includes a few nice enhancements, including:
- Live Mount VMDKs from a Snapshot – providing the option to choose to mount specific VMDKs instead of an entire VM; and
- After selecting the VMDKs, the user can select a specific compatible VM to attach the mounted VMDKs.
The Rubrik Andes 5.0 integration with RSA SecurID will include RSA Authentication Manager 8.2 SP1+ and RSA SecurID Cloud Authentication Service. Note that CDM will not be supporting the older RADIUS protocol. Enabling this is a two-step process:
- Add the RSA Authentication Manager or RSA Cloud Authentication Service in the Rubrik Dashboard; and
- Enable RSA and associate a new or existing local Rubrik user or a new or existing LDAP server with the RSA Authentication Manager or RSA Cloud Authentication Service.
You also get the ability to generate API tokens. Note that if you want to interact with the Rubrik CDM CLI (and have MFA enabled) you’ll need these.
Other Bits and Bobs
There are a few other enhancements included, including:
- Windows Bare Metal Recovery;
- SLA Policy Advanced Configuration;
- Additional Reporting and Metrics; and
- Snapshot Retention Enhancements.
Thoughts and Further Reading
Wahl introduced the 5.0 briefing by talking about digital transformation as being, at its core, an automation play. The availability of a bunch of SaaS services can lead to fragmentation in your environment, and legacy technology doesn’t deal with with makes transformation. Rubrik are positioning themselves as a modern company, well-placed to help you with the challenges of protecting what can quickly become a complex and hard to contain infrastructure. It’s easy to sit back and tell people how transformation can change their business for the better, but these kinds of conversations often eschew the high levels of technical debt in the enterprise that the business is doing its best to ignore. I don’t really think that transformation is as simple as some vendors would have us believe, but I do support the idea that Rubrik are working hard to make complex concepts and tasks as simple as possible. They’ve dropped a shedload of features and enhancements in this release, and have managed to do so in a way that you won’t need to install a bunch of new applications to support these features, and you won’t need to do a lot to get up and running either. For me, this is the key advantage that the “next generation” data protection companies have over their more mature competitors. If you haven’t been around for decades, you very likely don’t offer support for every platform and application under the sun. You also likely don’t have customers that have been with you for 20 years that you need to support regardless of the official support status of their applications. This gives the likes of Rubrik the flexibility to deliver features as and when customers require them, while still focussing on keeping the user experience simple.
I particularly like the NAS Direct Archive feature, as it shows that Rubrik aren’t simply in this to push a bunch of tin onto their customers. A big part of transformation is about doing things smarter, not just faster. the folks at Rubrik understand that there are other solutions out there that can deliver large capacity solutions for protecting big chunks of data (i.e. NAS workloads), so they’ve focussed on leveraging other capabilities, rather than trying to force their customers to fill their data centres with Rubrik gear. This is the kind of thinking that potential customers should find comforting. I think it’s also the kind of approach that a few other vendors would do well to adopt.
Here’re some links to other articles on Andes from other folks I read that you may find useful:
Here are a few links to some news items and other content that might be useful. Maybe.
- E8 Storage have added Queen Mary University of London as a customer, deploying an E8-D24 array for use with genomic sequencing work. You can read more about that here.
- Something went pear-shaped with my mail provider, so I decided to use Gmail for my NAS email notifications. QNAP has some built-in smarts and works out of the box. For my OpenMediaVault environment, though, notifications with Gmail take a little extra effort. In short, you’ll need to turn on 2FA and create an app password.
- This project seems finished now (and I love it) – Starting to make a PiDP-11/70.
- Preston recently published a good article on space reclamation in deduplication storage.
- If you’ve been involved in storage administration, this short graphic novel from the good doctor, J Metz, might be just your thing.
- Adam J. Bergh did a great write-up on some recently announced NetApp HCI and Veeam integration announcements.
- I’ve been participating in the tech version of “Blogtober” because I was going to be writing and it’s October. You can find out a bit more about it here.
Vembu are a site sponsor of PenguinPunk.net. They’ve asked me to look at their product and write about it. I’m in the early stages of evaluating the BDR Suite in the lab, but thought I’d pass on some information about their upcoming 4.0 release. As always, if you’re interested in these kind of solutions, I’d encourage you to do your own evaluation and get in touch with the vendor, as everyone’s situation and requirements are different. I can say from experience that the Vembu sales and support staff are very helpful and responsive, and should be able to help you with any queries. I recently did a brief article on getting started with BDR Suite 3.9.1 that you can download from here.
So what’s coming in 4.0?
Hyper-V Cluster Backup
Vembu will support backing up VMs in a Hyper-V cluster and, even if VMs configured for backup are moved from one host to another, the incremental backup will continue to happen without any interruption.
Shared VHDx Backup
Vembu now supports backup of the shared VHDx of Hyper-V.
Vembu uses CBT for incremental backups. And for some CBT failure cases they will be using CheckSum for the incremental to happen without any interruption.
No need to enter credentials every time, Vembu Credential Manager now allows you to manage the credentials of the host and the VMs running in it. This will be particularly handy if you’re doing a lot of application-aware backup job configuration.
I had a chance to speak with Vembu about the product’s functionality. There’s a lot to like in terms of breadth of features. I’m interested in seeing how 4.0 goes when it’s released and hope to do a few more articles on the product then. If you’re looking to evaluate the product, this evaluator’s guide is as good place as any to start. As an aside, Vembu are also offering 10% off their suite this Halloween (until November 2nd) – see here for more details.
I’ve been doing some work with Cohesity in our lab and thought it worth covering some of the basic features that I think are pretty neat. In this edition of Cohesity Basics, I thought I’d quickly cover off how to exclude VMs from protection jobs based on assigned tags. In this example I’m using version 6.0.1b_release-20181014_14074e50 (a “feature release”).
The first step is to find the VM in vCenter that you want to exclude from a protection job. Right-click on the VM and select Tags & Custom Attributes. Click on Assign Tag.
In the Assign Tag window, click on the New Tag icon.
Assign a name to the new tag, and add a description if that’s what you’re into.
In this example, I’ve created a tag called “COH-Test”, and put it in the “Backup” category.
Now go to the protection job you’d like to edit.
Click on the Tag icon on the right-hand side. You can then select the tag you created in vCenter. Note that you may need to refresh your vCenter source for this new tag to be reflected.
When you select the tag, you can choose to Auto Protect or Exclude the VM based on the applied tags.
If you drill in to the objects in the protection job, you can see that the VM I wanted to exclude from this job has been excluded based on the assigned tag.
I’ve written enthusiastically about Cohesity’s Auto Protect feature previously. Sometimes, though, you need to exclude VMs from protection jobs. Using tags is a quick and easy way to do this, and it’s something that your virtualisation admin team will be happy to use too.