Cohesity Continues to Evolve


I’ve been following Cohesity for some time now, and have covered a number of their product announcements and saw them in action at Storage Field Day 8. They announced version 3.0 at the end of June, and Gaetan Castelein kindly offered to give me a brief on where they’re at in the lead up to VMworld US.


What’s a Cohesity?

Cohesity’s goal is to take the complexity out of secondary storage. They argue that SDS has done a good job of this on primary storage platforms, but we’ve all ignored the issues around running secondary storage. The primary vehicle for this is Cohesity DataPlatform, combined with Cohesity DataProtect. Cohesity have a number of use cases for the platform that they cover, and I thought it might be handy to go over these here.


Use Case 1 – DataPlatform as a “better backup target”


Cohesity are taking aim at the likes of Data Domain, and are keen to replace them as backup targets. Cohesity tell me that DataPlatform offers the following features:

  • Scale-out platform (with no single point of failure), simple capacity planning, no forklift upgrades;
  • Global deduplication;
  • Native cloud integration;
  • High performance with parallelized ingest; and
  • QoS and multitenancy.

These all seem like nice things to have.


Use Case 2 – Simpler Data Protection


Cohesity tell me that the DataPlatform also makes a great option for VMware-based backups, providing data protection folks with the ability to leverage the following features:

  • Converged infrastructure with single pane of glass;
  • Policy-based automation;
  • Fast SLAs (15 min RPOs and instantaneous RTOs); and
  • Productive data (instant clones for test/dev, deep visibility into the data for indexing, custom analytics, etc).

While the single pane of glass often becomes the single pain, the last point about making data productive, depending on the environment you’re working in, is particularly important. There’re a tonne of enterprises out there where people are following some mighty cumbersome processes on snapshots of data to do analytics on the data. Any platform that makes this easier and more accessible seems like a great idea.


Use Case 3 – NFS & SMB Interfaces


You can also use the DataPlatform for file consolidation. Cohesity have even started positioning a combination of VMware VSAN as your primary storage platform (great for running VMs), with Cohesity offering secondary storage and the ability to deliver it over SMB or NFS. You can read more about this here.


Use Case 4 – Test/Dev


Cohesity’s first foray into the market revolved around providing enhanced capabilities for developers, and this remains a key selling point of the platform, with a full set of APIs exposed (which can be easily leveraged for use with Chef, Puppet, etc).


Use Case 5 – Analytics
Analytics have also been a major part of Cohesity’s early forays into secondary storage, with native reporting providing:

  • Utilization metrics (storage utilization, capacity forecasting); and
  • Performance metrics (ingrest rates, date reduction, IOPS, latency).

There’s also content indexing and search, providing data indexing (index upon ingest, VM and file metadata, files within VMs), and “Google-like” search. You can also access an analytics workbench with built-in MapReduce.


What Have You Done For Me Lately?

So with the Cohesity 3.0 Announcement a bunch of expanded application and OS integrations were announced, with a particular focus on SQL, Exchange, SharePoint, MS Windows, Linux, Oracle DBs (RMAN and remote adapter). Here’s a table that Cohesity provided that covers off a lot of the new features.


In addition to the DataProtect enhancements, a number of enhancements have been made to both the DataPlatform and File Services components of the product. I’m particularly interested in the ROBO solution, and I think this could end up being a very clever attempt by Cohesity at capturing the secondary storage market at a very broad level.




Cohesity have been moving ahead in leaps and bounds, and I’ve been impressed by what they’ve had to say, and the development of their narrative compared to some of the earlier messaging. It remains to be seen whether they’ll get to where they want to be, but I think they’re giving it a good shake. They’ll be present at VMworld US next week (Booth 827), where you can hear more about what they’re doing with VSAN and vRealize Automation.

EMC – MirrorView Secondary Image States

I sometimes get asked what the definition of these states is, and I frequently have trouble defining it clearly. Fortunately, EMC’s MirrorView Knowledgebook has been updated to incorporate Release 32, and Appendix A has some succinct definitions. If you can’t be bothered looking them up for yourself, here they are.

  • Synchronized – The secondary image is identical to the primary. This state persists only until the next write to the primary image, at which time the image state becomes Consistent.
  • Consistent – The secondary image is identical to either the current primary image or to some previous instance of the primary image. This means that the secondary image is available for recovery when you promote it.
  • Synchronizing – The software is applying changes to the secondary image to mirror the primary image, but the current contents of the secondary are not known and are not usable for recovery.
  • Out-of-Sync – The secondary image requires synchronization with the primary image. The image is unusable for recovery.
  • Rolling Back (MV/A only) – A successful promotion occurred where there was an unfinished update to the secondary image. This state persists until the Rollback operation completes.

I think one of the key things here is to pay attention to the various image states, particularly if you’re seeing a lot of out-of sync states on your secondaries. You don’t want to have to explain to people why they can’t recover secondaries in the event of a serious failure. And, more importantly, get on Powerlink and check out the MirrorView Knowledgebook (H2417).