Random Short Take #13

Here are a few links to some random news items and other content that I found interesting. You might find them interesting too. Let’s dive in to lucky number 13.

Cohesity Marketplace – A Few Notes

 

Cohesity first announced their Marketplace offering in late February. I have access to a Cohesity environment (physical and virtual) in my lab, and I’ve recently had the opportunity to get up and running on some of the Marketplace-ready code, so I thought I’d share my experiences here.

 

Prerequisites

I’m currently running version 6.2 of Cohesity’s DataPlatform. I’m not sure whether this is widely available yet or still only available for early adopter testing. My understanding is that the Marketplace feature will be made generally available to Cohesity customers when 6.3 ships. The Cohesity team did install a minor patch (6.2a) on my cluster as it contained some small but necessary fixes. In this version of the code, a gflag is set to show the Apps menu. The “Enable Apps Management” in the UI under Admin – Cluster Settings was also enabled. You’ll also need to nominate an unused private subnet for the apps to use.

 

Current Application Availability

The Cohesity Marketplace has a number of Cohesity-developed and third-party apps available to install, including:

  • Splunk – Turn machine data into answers
  • SentinelOne – AI-powered threat prevention purpose built for Cohesity
  • Imanis Data – NoSQL backup, recovery, and replication
  • Cohesity Spotlight – Analyse file audit logs and find anomalous file-access patterns
  • Cohesity Insight – Search inside unstructured data
  • Cohesity EasyScript – Create, upload, and execute customised scripts
  • ClamAV – Anti-virus scans for file data

Note that none of the apps need more than Read permissions on the nominated View(s).

 

Process

App Installation

To install the app you want to run on your cluster, click on “Get App”, then enter your Helios credentials.

Review the EULA and click on “Accept & Get” to proceed. You’ll then be prompted to select the cluster(s) you want to deploy the app on. In this example, I have 5 clusters in my Helios environment. I want to install the app on C1, as it’s the physical cluster.

Using An App

Once your app is installed, it’s fairly straightforward to run it. Click on More, then Apps to access your installed apps.

 

Then you just need to click on “Run App” to get started

You’ll be prompted to set the Read Permissions for the App, along with QoS. It’s my understanding that the QoS settings are relative to other apps running on the cluster, not data protection activities, etc. The Read Permissions are applied to one or more Views. This can be changed after the initial configuration. Once the app is running you can click on Open App. In this example I’m using the Cohesity Insight app to look through some unstructured data stored on a View.

 

Thoughts

I’ve barely scratched the surface of what you achieve with the Marketplace on Cohesity’s DataPlatform. The availability of the Marketplace (and the ability to run apps on the platform) is another step closer to Cohesity’s vision of extracting additional value from secondary storage. Coupled with Cohesity’s C4000 series hardware (or perhaps whatever flavour you want to run from Cisco or HPE or the like), I can imagine you’re going to be able to do a heck a lot with this capability, particularly as more apps are validated with the platform.

I hope to do a lot more testing of this capability over the next little while, and I’ll endeavour to report back with my findings. If you’re a current Cohesity customer and haven’t talked to your account team about this capability, it’s worth getting in touch to see what you can do in terms of an evaluation. Of course, it’s also worth noting that, as with most things technology related, just because you can, doesn’t always mean you should. But if you have the use case, this is a cool capability on top of an already interesting platform.

Cohesity Is (Data)Locked In

Disclaimer: I recently attended Storage Field Day 18.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Cohesity recently presented at Storage Field Day 18. You can see their videos from Storage Field Day 18 here, and download a PDF copy of my rough notes from here.

 

The Cohesity Difference?

Cohesity covered a number of different topics in its presentation, and I thought I’d outline some of the Cohesity features before I jump into the meat and potatoes of my article. Some of the key things you get with Cohesity are:

  • Global space efficiency;
  • Data mobility;
  • Data resiliency & compliance;
  • Instant mass restore; and
  • Apps integration.

I’m going to cover 3 of the 5 here, and you can check the videos for details of the Cohesity MarketPlace and the Instant Mass Restore demonstration.

Global Space Efficiency

One of the big selling points for the Cohesity data platform is the ability to deliver data reduction and small file optimisation.

  • Global deduplication
    • Modes: inline, post-process
  • Archive to cloud is also deduplicated
  • Compression
    • Zstandard algorithm (read more about that here)
  • Small file optimisation
    • Better performance for reads and writes
    • Benefits from deduplication and compression

Data Mobility

There’s also an excellent story when it comes to data mobility, with the platform delivering the following data mobility features:

  • Data portability across clouds
  • Multi-cloud replication and archival (1:many)
  • Integrated indexing and search across locations

You also get simultaneous, multi-protocol access and a comprehensive set of file permissions to work with.

 

But What About Archives And Stuff?

Okay, so all of that stuff is really cool, and I could stop there and you’d probably be happy enough that Cohesity delivers the goods when it comes to a secondary storage platform that delivers a variety of features. In my opinion, though, it gets a lot more interesting when you have a look at some of the archival features that are built into the platform.

Flexible Archive Solutions

  • Archive either on-premises or to cloud;
  • Policy driven archival schedule for long term data retention
  • Data an be retrieved to the same or a different Cohesity cluster; and
  • Archived data is subject to further deduplication.

Data Resiliency and Compliance – ensures data integrity

  • Erasure coding;
  • Highly available; and
  • DataLock and legal hold.

Achieving Compliance with File-level DataLock

In my opinion, DataLock is where it gets interesting in terms of archive compliance.

  • DataLock enables WORM functionality at a file level;
  • DataLock adheres to regulatory acts;
  • Can automatically lock a file after a period of inactivity;
  • Files can be locked manually by setting file attributes;
  • Minimum and maximum retention times can be set; and
  • Cohesity provides a unique RBAC role for Data Security administration.

DataLock on Backups

  • DataLock enables WORM functionality;
  • Prevent changes by locking Snapshots;
  • Applied via backup policy; and
  • Operations performed by Data Security administrators.

 

Ransomware Detection

Cohesity also recently announced the ability to look within Helios for Ransomware. The approach taken is as follows: Prevent. Detect. Respond.

Prevent

There’s some good stuff built into the platform to help prevent ransomware in the first place, including:

  • Immutable file system
  • DataLock (WORM)
  • Multi-factor authentication

Detect

  • Machine-driven anomaly detection (backup data, unstructured data)
  • Automated alert

Respond

  • Scalable file system to store years worth of backup copies
  • Google-like global actionable search
  • Instant mass restore

 

Thoughts and Further Reading

The conversation with Cohesity got a little spirited in places at Storage Field Day 18. This isn’t unusual, as Cohesity has had some problems in the past with various folks not getting what they’re on about. Is it data protection? Is it scale-out NAS? Is it an analytics platform? There’s a lot going on here, and plenty of people (both inside and outside Cohesity) have had a chop at articulating the real value of the solution. I’m not here to tell you what it is or isn’t. I do know that a lot of the cool stuff with Cohesity wasn’t readily apparent to me until I actually had some stick time with the platform and had a chance to see some of its key features in action.

The DataLock / Security and Compliance piece is interesting to me though. I’m continually asking vendors what they’re doing in terms of archive platforms. A lot of them look at me like I’m high. Why wouldn’t you just use software to dump your old files up to the cloud or onto some cheap and deep storage in your data centre? After all, aren’t we all using software-defined data centres now? That’s certainly an option, but what happens when that data gets zapped? What if the storage platform you’re using, or the software you’re using to store the archive data, goes bad and deletes the data you’re managing with it? Features such as DataLock can help with protecting you from some really bad things happening.

I don’t believe that data protection data should be treated as an “archive” as such, although I think that data protection platform vendors such as Cohesity are well placed to deliver “archive-like” solutions for enterprises that need to retain protection data for long periods of time. I still think that pushing archive data to another, dedicated, tier is a better option than simply calling old protection data “archival”. Given Cohesity’s NAS capabilities, it makes sense that they’d be an attractive storage target for dedicated archive software solutions.

I like what Cohesity have delivered to date in terms of a platform that can be used to deliver data insights to derive value for the business. I think sometimes the message is a little muddled, but in my opinion some of that is because everyone’s looking for something different from these kinds of platforms. And these kinds of platforms can do an awful lot of things nowadays, thanks in part to some pretty smart software and some grunty hardware. You can read some more about Cohesity’s Security and Compliance story here,  and there’s a fascinating (if a little dated) report from Cohasset Associates on Cohesity’s compliance capabilities that you can access here. My good friend Keith Townsend also provided some thoughts on Cohesity that you can read here.

Random Short Take #11

Here are a few links to some random news items and other content that I found interesting. You might find it interesting too. Maybe. Happy New Year too. I hope everyone’s feeling fresh and ready to tackle 2019.

  • I’m catching up with the good folks from Scale Computing in the next little while, but in the meantime, here’s what they got up to last year.
  • I’m a fan of the fruit company nowadays, but if I had to build a PC, this would be it (hat tip to Stephen Foskett for the link).
  • QNAP announced the TR-004 over the weekend and I had one delivered on Tuesday. It’s unusual that I have cutting edge consumer hardware in my house, so I’ll be interested to see how it goes.
  • It’s not too late to register for Cohesity’s upcoming Helios webinar. I’m looking forward to running through some demos with Jon Hildebrand and talking about how Helios helps me manage my Cohesity environment on a daily basis.
  • Chris Evans has published NVMe in the Data Centre 2.0 and I recommend checking it out.
  • I went through a basketball card phase in my teens. This article sums up my somewhat confused feelings about the card market (or lack thereof).
  • Elastifile Cloud File System is now available on the AWS Marketplace – you can read more about that here.
  • WekaIO have posted some impressive numbers over at spec.org if you’re into that kind of thing.
  • Applications are still open for vExpert 2019. If you haven’t already applied, I recommend it. The program is invaluable in terms of vendor and community engagement.

 

 

Cohesity – Helios Article and Upcoming Webinar

I’ve written about Cohesity’s Helios offering previously, and also wrote a short article on upgrading multiple clusters using Helios. I think it’s a pretty neat offering, so to that end I’ve written an article on Cohesity’s blog about some of the cool stuff you can do with Helios. I’m also privileged to be participating in a webinar in late January with Cohesity’s Jon Hildebrand. We’ll be running through some of these features from a more real-world perspective, including doing silly things like live demos. You can get further details on the webinar here.

Random Short Take #10

Here are a few links to some random news items and other content that I found interesting. You might find it interesting too. Maybe. This will be the last one for this year. I hope you and yours have a safe and merry Christmas / holiday break.

  • Scale Computing have finally entered the Aussie market in partnership with Amnesium. You can read more about that here
  • Alastair is back in the classroom, teaching folks about AWS. He published a bunch of very useful notes from a recent class here.
  • The folks at Backblaze are running a “Refer-A-Friend” promotion. If you’re looking to become a new Backblaze customer and sign up with my referral code, you’ll get some free time on your account. And I will too! Hooray! I’ve waxed lyrical about Backblaze before, and I recommend it. The offer runs out on January 6th 2019, so get a move on.
  • Howard did a nice article on VVols that I recommend checking out.
  • GDPR has been a challenge (within and outside the EU), but I enjoyed Mark Browne‘s take on Cohesity’s GDPR compliance.
  • I’m quite a fan of the Netflix Tech Blog, and this article on the Netflix Media Database was a ripper.
  • From time to time I like to poke fun at my friends in the US for what seems like an excessive amount of shenanigans happening in that country, but there’s plenty of boneheaded stuff happening in Australia too. Read Preston’s article on the recently passed anti-encryption laws to get a feel for the heady heights of stupidity that we’ve been able to reach recently.

 

Updated Articles Page

I recently had the opportunity to upgrade my Cohesity lab environment using Helios and thought I’d run through the basics. There’s a new document outlining the process on the articles page.

Cohesity – Cohesity Cluster Virtual Edition ESXi – A Few Notes

I’ve covered the Cohesity appliance deployment in a howto article previously. I’ve also made use of the VMware-compatible Virtual Edition in our lab to test things like cluster to cluster replication and cloud tiering. The benefits of virtual appliances are numerous. They’re generally easy to deploy, don’t need dedicated hardware, can be re-deployed quickly when you break something, and can be a quick and easy way to validate a particular process or idea. They can also be a problem with regards to performance, and are at the mercy of the platform administrator to a point. But aren’t we all? With 6.1, Cohesity have made available a clustered virtual edition (the snappily titled Cohesity Cluster Virtual Edition ESXi). If you have access to the documentation section of the Cohesity support site, there’s a PDF you can download that explains everything. I won’t go into too much detail but there are a few things to consider before you get started.

 

Specifications

Base Appliance 

Just like the non-clustered virtual edition, there’s a small and large configuration you can choose from. The small configuration supports up to 8TB for the Data disk, while the large configuration supports up to 16TB for the Data disk. The small config supports 4 vCPUs and 16GB of memory, while the large configuration supports 8 vCPUs and 32GB of memory.

Disk Configuration

Once you’ve deployed the appliance, you’ll need to add the Metadata disk and Data disk to each VM. The Metadata disk should be between 512GB and 1TB. For the large configuration, you can also apparently configure 2x 512GB disks, but I haven’t tried this. The Data disk needs to be between 512GB and 8TB for the small configuration and up to 16TB for the large configuration (with support for 2x 8TB disks). Cohesity recommends that these are formatted as Thick Provision Lazy Zeroed and deployed in Independent – Persistent mode. Each disk should be attached to its own SCSI controller as well, so you’ll have the system disk on SCSI 0:0, the Metadata disk on SCSI 1:0, and so on.

I did discover a weird issue when deploying the appliance on a Pure Storage FA-450 array in the lab. In vSphere this particular array’s datastore type is identified by vCenter as “Flash”. For my testing I had a 512GB Metadata disk and 3TB Data disk configured on the same datastore, with the three nodes living on three different datastores on the FlashArray. This caused errors with the cluster configuration, with the configuration wizard complaining that my SSD volumes were too big.

I moved the Data disk (with storage vMotion) to an all flash Nimble array (that for some reason was identified by vSphere as “HDD”) and the problem disappeared. Interestingly I didn’t have this problem with the single node configuration of 6.0.1 deployed with the same configuration. I raised a ticket with Cohesity support and they got back to me stating that this was expected behaviour in 6.1.0a. They tell me, however, that they’ve modified the behaviour of the configuration routine in an upcoming version so fools like me can run virtualised secondary storage on primary storage.

Erasure Coding

You can configure the appliance for increased resiliency at the Storage Domain level as well. If you go to Platform – Cluster – Storage Domains you can modify the DefaultStorageDomain (and other ones that you may have created). Depending on the size of the cluster you’ve deployed, you can choose the number of failures to tolerate and whether or not you want erasure coding enabled.

You can also decide whether you want EC to be a post-process activity or something that happens inline.

 

Process

Once you’ve deployed (a minimum) 3 copies of the Clustered VE, you’ll need to manually add Metadata and Data disks to each VM. The specifications for these are listed above. Fire up the VMs and go to the IP of one of the nodes. You’ll need to log in as the admin user with the appropriate password and you can then start the cluster configuration.

This bit is pretty much the same as any Cohesity cluster deployment, and you’ll need to specify things like a hostname for the cluster partition. As always, it’s a good idea to ensure your DNS records are up to date. You can get away with using IP addresses but, frankly, people will talk about you behind your back if you do.

At this point you can also decide to enable encryption at the cluster level. If you decide not to enable it you can do this on a per Domain basis later.

Click on Create Cluster and you should see something like the following screen.

Once the cluster is created, you can hit the virtual IP you’ve configured, or any one of the attached nodes, to log in to the cluster. Once you log in, you’ll need to agree to the EULA and enter a license key.

 

Thoughts

The availability of virtual appliance versions for storage and data protection solutions isn’t a new idea, but it’s certainly one I’m a big fan of. These things give me an opportunity to test new code releases in a controlled environment before pushing updates into my production environment. It can help with validating different replication topologies quickly, and validating other configuration ideas before putting them into the wild (or in front of customers). Of course, the performance may not be up to scratch for some larger environments, but for smaller deployments and edge or remote office solutions, you’re only limited by the available host resources (which can be substantial in a lot of cases). The addition of a clustered version of the virtual edition for ESXi and Hyper-V is a welcome sight for those of us still deploying on-premises Cohesity solutions (I think the Azure version has been clustered for a few revisions now). It gets around the main issue of resiliency by having multiple copies running, and can also address some of the performance concerns associated with running virtual versions of the appliance. There are a number of reasons why it may not be the right solution for you, and you should work with your Cohesity team to size any solution to fit your environment. But if you’re running Cohesity in your environment already, talk to your account team about how you can leverage the virtual edition. It really is pretty neat. I’ll be looking into the resiliency of the solution in the near future and will hopefully be able to post my findings in the next few weeks.

Updated Articles Page

I recently had the opportunity to configure multi-tenancy in my Cohesity lab environment and thought I’d run through the basics. There’s a new document outlining the process on the articles page.

Cohesity Basics – Excluding VMs Using Tags

I’ve been doing some work with Cohesity in our lab and thought it worth covering some of the basic features that I think are pretty neat. In this edition of Cohesity Basics, I thought I’d quickly cover off how to exclude VMs from protection jobs based on assigned tags. In this example I’m using version 6.0.1b_release-20181014_14074e50 (a “feature release”).

 

Process

The first step is to find the VM in vCenter that you want to exclude from a protection job. Right-click on the VM and select Tags & Custom Attributes. Click on Assign Tag.

In the Assign Tag window, click on the New Tag icon.

Assign a name to the new tag, and add a description if that’s what you’re into.

In this example, I’ve created a tag called “COH-Test”, and put it in the “Backup” category.

Now go to the protection job you’d like to edit.

Click on the Tag icon on the right-hand side. You can then select the tag you created in vCenter. Note that you may need to refresh your vCenter source for this new tag to be reflected.

When you select the tag, you can choose to Auto Protect or Exclude the VM based on the applied tags.

If you drill in to the objects in the protection job, you can see that the VM I wanted to exclude from this job has been excluded based on the assigned tag.

 

Thoughts

I’ve written enthusiastically about Cohesity’s Auto Protect feature previously. Sometimes, though, you need to exclude VMs from protection jobs. Using tags is a quick and easy way to do this, and it’s something that your virtualisation admin team will be happy to use too.