Random Short Take #18

Here are some links to some random news items and other content that I recently found interesting. You might find them interesting too. Episode 18 – buckle up kids! It’s all happening.

  • Cohesity added support for Active Directory protection with version 6.3 of the DataPlatform. Matt covered it pretty comprehensively here.
  • Speaking of Cohesity, Alastair wrote this article on getting started with the Cohesity PowerShell Module (there’s a quick sketch of what that looks like just after this list).
  • In keeping with the data protection theme (hey, it’s what I’m into), here’s a great article from W. Curtis Preston on SaaS data protection, and what you need to consider to not become another cautionary tale on the Internet. Curtis has written a lot about data protection over the years, and you could do a lot worse than reading what he has to say. And that’s not just because he signed a book for me.
  • Did you ever stop and think just how insecure some of the things that you put your money into are? It’s a little scary. Shell are doing some stuff with Cybera to improve things. Read more about that here.
  • I used to work with Vincent, and he’s a super smart guy. I’ve been at him for years to start blogging, and he’s started to put out some articles. He’s very good at taking complex topics and distilling them down to something that’s easy to understand. Here’s his summary of VMware vRealize Automation configuration.
  • Tom’s take on some recent CloudFlare outages makes for good reading.
  • Google Cloud has announced it’s acquiring Elastifile. That part of the business doesn’t seem to be as brutal as the broader Alphabet group when it comes to acquiring and discarding companies, and I’m hoping that the good folks at Elastifile are looked after. You can read more on that here.
  • A lot of people are getting upset with terms like “disaggregated HCI”. Chris Mellor does a bang up job explaining the differences between the various architectures here. It’s my belief that there’s a place for all of this, and assuming that one architecture will suit every situation is a little naive. But what do I know?
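As a taster for Alastair’s article on the Cohesity PowerShell Module (linked above), here’s roughly what getting connected looks like. This is a minimal sketch based on my reading of the module’s documentation; the module and cmdlet names, and the cluster FQDN, are assumptions you should check against the article and your own environment.

# Module and cmdlet names as per my reading of the Cohesity PowerShell documentation -
# verify against the version you install. The cluster FQDN is a placeholder.
# (On Windows PowerShell the module is published as Cohesity.PowerShell instead.)
Install-Module -Name Cohesity.PowerShell.Core -Scope CurrentUser
Connect-CohesityCluster -Server cohesity-01.lab.local -Credential (Get-Credential)
Get-CohesityProtectionJob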

Random Short Take #17

Here are some links to some random news items and other content that I recently found interesting. You might find them interesting too. Episode 17 – am I over-sharing? There’s so much I want you to know about.

  • I seem to always be including a link from the Backblaze blog. That’s mainly because they write about things I’m interested in. In this case, they’ve posted an article discussing the differences between availability and durability that I think is worth your time.
  • Speaking of interesting topics, Preston posted an article on NetWorker Pools with Data Domain that’s worth looking at if you’re into that kind of thing.
  • Maintaining the data protection theme, Alastair wrote an interesting article titled “The Best Automation Is One You Don’t Write” (you know, like the best IO is one you don’t need to do?) as part of his work with Cohesity. It’s a good article, and not just because he mentions my name in it.
  • I recently wanted to change the edition of Microsoft Office I was using on my MacBook Pro and couldn’t really work out how to do it. In the end, the answer is simple. Download a Microsoft utility to remove your Office licenses, and then fire up an Office product and it will prompt you to re-enter your information at that point.
  • This is an old article, but it answered my question about validating MD5 checksums on macOS (there’s a quick example after this list).
  • Excelero have been doing some cool stuff with Imperial College London – you can read more about that here.
  • Oh hey, Flixster Video is closing down. I received this in my inbox recently: “[f]ollowing the announcement by UltraViolet that it will be discontinuing its service on July 31, 2019, we are writing to provide you notice that Flixster Video is planning to shut down its website, applications and operations on October 31, 2019”. It makes sense, obviously, given UltraViolet’s demise, but it still drives me nuts. The ephemeral nature of digital media is why I still have a house full of various sized discs with various kinds of media stored on them. I think the answer is to give yourself over to the streaming lifestyle, and understand that you’ll never “own” media like you used to think you did. But I can’t help but feel like people outside of the US are getting shafted in that scenario.
  • In keeping up with the “random” theme of these posts, it was only last week that I learned that “Television, the Drug of the Nation” from the very excellent album “Hypocrisy Is the Greatest Luxury” by The Disposable Heroes of Hiphoprisy was originally released by Michael Franti and Rono Tse when they were members of The Beatnigs. If you’re unfamiliar with any of this I recommend you check them out.
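Going back to the MD5 item above: if you happen to have PowerShell installed on your Mac, Get-FileHash will do the job too (macOS also ships a built-in md5 command). A minimal example, with a made-up file name:

# Compute the MD5 hash of a downloaded file so you can compare it with the published checksum.
# The file name here is just a placeholder.
Get-FileHash -Path ./downloaded-image.iso -Algorithm MD5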

Cohesity Basics – Configuring An External Target For Cloud Archive

I’ve been working in the lab with Pure Storage’s ObjectEngine and thought it might be nice to document the process to set it up as an external target for use with Cohesity’s Cloud Archive capability. I’ve written in the past about Cloud Tier and Cloud Archive, but in that article I focused more on the Cloud Tier capability. I don’t want to sound too pretentious, but I’ll quote myself from the other article: “With Cloud Archive you can send copies of snapshots up to the cloud to keep as a copy separate to the backup data you might have replicated to a secondary appliance. This is useful if you have some requirement to keep a monthly or six-monthly copy somewhere for compliance reasons.”

I would like to be clear that this process hasn’t been blessed or vetted by Pure Storage or Cohesity. I imagine they are working on delivering a validated solution at some stage, as they have with Veeam and Commvault. So don’t go out and jam this in production and complain to me when Pure or Cohesity tell you it’s wrong.

There are a couple of ways you can configure an external target via the Cohesity UI. In this example, I’ll do it from the dashboard, rather than during the protection job configuration. Click on Protection and select External Target.

You’ll then be presented with the New Target configuration dialogue.

In this example, I’m calling my external target PureOE, and setting its purpose as Archival (as opposed to Tiering).

The Type of target is “S3 Compatible”.

Once you select that, you’ll be asked for a bunch of S3-type information, including Bucket Name and Access Key ID. This assumes you’ve already created the bucket and configured appropriate security on the ObjectEngine side of things.
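If you want to sanity-check the bucket and credentials before you register the target, any S3 client should work, given the endpoint is S3 compatible. Here’s a rough sketch using the AWS Tools for PowerShell; the module needs to be installed, and the endpoint, bucket name, and keys below are placeholders for your own ObjectEngine details.

# List the bucket contents via the ObjectEngine's S3 endpoint to confirm the bucket
# exists and the access key pair works. All of the values here are placeholders.
Import-Module AWS.Tools.S3
Get-S3Object -BucketName "cohesity-archive" -EndpointUrl "https://objectengine.lab.local" -AccessKey "AKIAEXAMPLE" -SecretKey "EXAMPLESECRETKEY"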

Enter the required information. I’ve de-selected compression and source-side deduplication, as I want the data reduction to be done by the ObjectEngine. I’ve also disabled encryption, as I’m guessing this will have an impact on the ObjectEngine as well. I need to confirm that with my friends at Pure. I’m using the fully qualified domain name of the ObjectEngine as the endpoint here as well.

Once you click on Register, you’ll be presented with a summary of the configuration.

You’re then right to use this as an external target for Archival parts of protection jobs within your Cohesity environment. Once you’ve run a few protection jobs, you should start to see files within the test bucket on the ObjectEngine. Don’t forget that, as far as I’m aware, it’s still very difficult (impossible?) to remove external targets from the Cohesity DataPlatform, so don’t get too carried away with configuring a bunch of different test targets thinking that you can remove them later.

Random Short Take #16

Here are a few links to some random news items and other content that I recently found interesting. You might find them interesting too. Episode 16 – please enjoy these semi-irregular updates.

  • Scale Computing has been doing a bit in the healthcare sector lately – you can read news about that here.
  • This was a nice roundup of the news from Apple’s recent WWDC from Six Colors. Hat tip to Stephen Foskett for the link. Speaking of WWDC news, you may have been wondering what happened to all of your purchased content with the imminent demise of iTunes on macOS. It’s still a little fuzzy, but this article attempts to shed some light on things. Spoiler: you should be okay (for the moment).
  • There’s a great post on the Dropbox Tech Blog from James Cowling discussing the mission versus the system.
  • The more things change, the more they remain the same. For years I had a Windows PC running Media Center and recording TV. I used IceTV as the XMLTV-based program guide provider. I then started to mess about with some HDHomeRun devices and the PC died and I went back to a traditional DVR arrangement. Plex now has DVR capabilities and it has been doing a reasonable job with guide data (and recording in general), but they’ve decided it’s all a bit too hard to curate guides and want users (at least in Australia) to use XMLTV-based guides instead. So I’m back to using IceTV with Plex. They’re offering a free trial at the moment for Plex users, and setup instructions are here. No, I don’t get paid if you click on the links.
  • Speaking of axe-throwing, the Cohesity team in Queensland is organising a social event for Friday 21st June from 2 – 4 pm at Maniax Axe Throwing in Newstead. You can get in contact with Casey if you’d like to register.
  • VeeamON Forum Australia is coming up soon. It will be held at the Hyatt Regency Hotel in Sydney on July 24th and should be a great event. You can find out more information and register for it here. The Vanguards are also planning something cool, so hopefully we’ll see you there.
  • Speaking of Veeam, Anthony Spiteri recently published his longest title in the Virtualization is Life! catalogue – Orchestration Of NSX By Terraform For Cloud Connect Replication With vCloud Director. It’s a great article, and worth checking out.
  • There’s a lot of talk and slideware devoted to digital transformation, and a lot of it is rubbish. But I found this article from Chin-Fah to be particularly insightful.

Cohesity Basics – Excluding VMs Using Tags – Real World Example

I’ve written before about using VM tags with Cohesity to exclude VMs from a backup. I wanted to write up a quick article using a real world example in the test lab. In this instance, we had someone deploying 200 VMs over a weekend to test a vendor’s storage array with a particular workload. The problem was that I had Cohesity set to automatically protect any new VMs that are deployed in the lab. This wasn’t a problem from a scalability perspective. Rather, the problem was that we were backing up a bunch of test data that didn’t dedupe well and didn’t need to be protected by what are ultimately finite resources.

As I pointed out in the other article, creating tags for VMs and using them as a way to exclude workloads from Cohesity is not a new concept, and is fairly easy to do. You can also apply the tags in bulk using the vSphere Web Client if you need to. But a quicker way to do it (and something that can be done post-deployment) is to use PowerCLI to search for VMs with a particular naming convention and apply the tags to those.

Firstly, you’ll need to log in to your vCenter.

PowerCLI C:\> Connect-VIServer vCenter

In this example, the test VMs are deployed with the prefix “PSV”, so this makes it easy enough to search for them.

PowerCLI C:\> get-vm | where {$_.name -like "PSV*"} | New-TagAssignment -Tag "COH-NoBackup"

This assumes that the tag already exists on the vCenter side of things, and you have sufficient permissions to apply tags to VMs. You can check your work with the following command.

PowerCLI C:\> get-vm | where {$_.name -like "PSV*"} | Get-TagAssignment
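If the tag (or the category it lives in) doesn’t exist yet, you can create both with PowerCLI before running the assignment. The category name below is just an example; use whatever your Cohesity auto-protect exclusion is configured to look for.

PowerCLI C:\> New-TagCategory -Name "Backup" -Cardinality Single -EntityType VirtualMachine
PowerCLI C:\> New-Tag -Name "COH-NoBackup" -Category "Backup"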

One thing to note. If you’ve updated the tags of a bunch of VMs in your vCenter environment, you may notice that the objects aren’t immediately excluded from the Protection Job on the Cohesity side of things. The reason for this is that, by default, Cohesity only refreshes vCenter source data every 4 hours. One way to force the update is to manually refresh the source vCenter in Cohesity. To do this, go to Protection -> Sources. Click on the ellipsis on the right-hand side of the vCenter source you’d like to refresh, and select Refresh.
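If you’d rather not click through the UI, my understanding is that the Cohesity PowerShell module can also trigger the refresh. I haven’t validated these cmdlet names and the returned object structure against every version of the module, so treat this as a sketch and check the module documentation first.

# Assumption: cmdlet names and output structure as per my reading of the Cohesity
# PowerShell module docs - verify against your module version before relying on this.
Connect-CohesityCluster -Server cohesity-01.lab.local -Credential (Get-Credential)
$vcenter = Get-CohesityProtectionSource -Environments KVMware | Select-Object -First 1
Update-CohesityProtectionSource -Id $vcenter.protectionSource.id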

You’ll then see that the tagged VMs are excluded in the Protection Job. Hat tip to my colleague Mike for his help with PowerCLI. And hat tip to my other colleague Mike for causing the problem in the first place.

Random Short Take #14

Here are a few links to some random news items and other content that I found interesting. You might find them interesting too. Episode 14 – giddy-up!

Brisbane VMUG – May 2019


The May 2019 edition of the Brisbane VMUG meeting will be held on Tuesday 28th May at Fishburners from 4pm – 6pm. It’s sponsored by Cohesity and promises to be a great afternoon.

Here’s the agenda:

  • VMUG Intro
  • Cohesity Presentation: Changing Data Protection from Nightmares to Sweet Dreams
  • vCommunity Presentation – Introduction to Hyper-converged Infrastructure
  • Q&A
  • Light refreshments.

Cohesity have gone to great lengths to make sure this will be a fun and informative session and I’m really looking forward to hearing about how they can make recovery simple. You can find out more information and register for the event here. I hope to see you there. Also, if you’re interested in sponsoring one of these events, please get in touch with me and I can help make it happen.

Random Short Take #13

Here are a few links to some random news items and other content that I found interesting. You might find them interesting too. Let’s dive into lucky number 13.

Cohesity Marketplace – A Few Notes


Cohesity first announced their Marketplace offering in late February. I have access to a Cohesity environment (physical and virtual) in my lab, and I’ve recently had the opportunity to get up and running on some of the Marketplace-ready code, so I thought I’d share my experiences here.


Prerequisites

I’m currently running version 6.2 of Cohesity’s DataPlatform. I’m not sure whether this is widely available yet or still only available for early adopter testing. My understanding is that the Marketplace feature will be made generally available to Cohesity customers when 6.3 ships. The Cohesity team did install a minor patch (6.2a) on my cluster as it contained some small but necessary fixes. In this version of the code, a gflag is set to show the Apps menu. The “Enable Apps Management” option in the UI under Admin – Cluster Settings was also enabled. You’ll also need to nominate an unused private subnet for the apps to use.


Current Application Availability

The Cohesity Marketplace has a number of Cohesity-developed and third-party apps available to install, including:

  • Splunk – Turn machine data into answers
  • SentinelOne – AI-powered threat prevention purpose built for Cohesity
  • Imanis Data – NoSQL backup, recovery, and replication
  • Cohesity Spotlight – Analyse file audit logs and find anomalous file-access patterns
  • Cohesity Insight – Search inside unstructured data
  • Cohesity EasyScript – Create, upload, and execute customised scripts
  • ClamAV – Anti-virus scans for file data

Note that none of the apps need more than Read permissions on the nominated View(s).


Process

App Installation

To install the app you want to run on your cluster, click on “Get App”, then enter your Helios credentials.

Review the EULA and click on “Accept & Get” to proceed. You’ll then be prompted to select the cluster(s) you want to deploy the app on. In this example, I have 5 clusters in my Helios environment. I want to install the app on C1, as it’s the physical cluster.

Using An App

Once your app is installed, it’s fairly straightforward to run it. Click on More, then Apps to access your installed apps.


Then you just need to click on “Run App” to get started.

You’ll be prompted to set the Read Permissions for the App, along with QoS. It’s my understanding that the QoS settings are relative to other apps running on the cluster, not data protection activities, etc. The Read Permissions are applied to one or more Views. This can be changed after the initial configuration. Once the app is running you can click on Open App. In this example I’m using the Cohesity Insight app to look through some unstructured data stored on a View.


Thoughts

I’ve barely scratched the surface of what you can achieve with the Marketplace on Cohesity’s DataPlatform. The availability of the Marketplace (and the ability to run apps on the platform) is another step closer to Cohesity’s vision of extracting additional value from secondary storage. Coupled with Cohesity’s C4000 series hardware (or perhaps whatever flavour you want to run from Cisco or HPE or the like), I can imagine you’re going to be able to do a heck of a lot with this capability, particularly as more apps are validated with the platform.

I hope to do a lot more testing of this capability over the next little while, and I’ll endeavour to report back with my findings. If you’re a current Cohesity customer and haven’t talked to your account team about this capability, it’s worth getting in touch to see what you can do in terms of an evaluation. Of course, it’s also worth noting that, as with most things technology related, just because you can, doesn’t always mean you should. But if you have the use case, this is a cool capability on top of an already interesting platform.

Cohesity Is (Data)Locked In

Disclaimer: I recently attended Storage Field Day 18.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Cohesity recently presented at Storage Field Day 18. You can see their videos from Storage Field Day 18 here, and download a PDF copy of my rough notes from here.


The Cohesity Difference?

Cohesity covered a number of different topics in its presentation, and I thought I’d outline some of the Cohesity features before I jump into the meat and potatoes of my article. Some of the key things you get with Cohesity are:

  • Global space efficiency;
  • Data mobility;
  • Data resiliency & compliance;
  • Instant mass restore; and
  • Apps integration.

I’m going to cover 3 of the 5 here, and you can check the videos for details of the Cohesity Marketplace and the Instant Mass Restore demonstration.

Global Space Efficiency

One of the big selling points for the Cohesity data platform is the ability to deliver data reduction and small file optimisation.

  • Global deduplication
    • Modes: inline, post-process
  • Archive to cloud is also deduplicated
  • Compression
    • Zstandard algorithm (read more about that here)
  • Small file optimisation
    • Better performance for reads and writes
    • Benefits from deduplication and compression

Data Mobility

There’s also an excellent story when it comes to data mobility, with the platform delivering the following data mobility features:

  • Data portability across clouds
  • Multi-cloud replication and archival (1:many)
  • Integrated indexing and search across locations

You also get simultaneous, multi-protocol access and a comprehensive set of file permissions to work with.


But What About Archives And Stuff?

Okay, so all of that stuff is really cool, and I could stop there and you’d probably be happy enough that Cohesity delivers the goods when it comes to a secondary storage platform that delivers a variety of features. In my opinion, though, it gets a lot more interesting when you have a look at some of the archival features that are built into the platform.

Flexible Archive Solutions

  • Archive either on-premises or to cloud;
  • Policy-driven archival schedule for long-term data retention;
  • Data can be retrieved to the same or a different Cohesity cluster; and
  • Archived data is subject to further deduplication.

Data Resiliency and Compliance – ensures data integrity

  • Erasure coding;
  • Highly available; and
  • DataLock and legal hold.

Achieving Compliance with File-level DataLock

In my opinion, DataLock is where it gets interesting in terms of archive compliance.

  • DataLock enables WORM functionality at a file level;
  • DataLock adheres to regulatory acts;
  • Can automatically lock a file after a period of inactivity;
  • Files can be locked manually by setting file attributes (there’s a rough sketch of what that looks like after this list);
  • Minimum and maximum retention times can be set; and
  • Cohesity provides a unique RBAC role for Data Security administration.
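On the manual, attribute-driven locking mentioned above: on a Cohesity View exported over SMB, the attribute change involved is along the lines of setting a file read-only from a client. Whether that actually places a DataLock on the file depends entirely on how DataLock is configured on the View, so treat the path and the behaviour here as assumptions and check Cohesity’s documentation.

# Hypothetical UNC path to a file on a Cohesity View mounted via SMB. Setting the
# read-only attribute is the client-side attribute change; any DataLock behaviour is
# governed by the View's configuration, not by this command itself.
Set-ItemProperty -Path "\\cohesity-01\archive-view\report-2019.pdf" -Name IsReadOnly -Value $true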

DataLock on Backups

  • DataLock enables WORM functionality;
  • Prevent changes by locking Snapshots;
  • Applied via backup policy; and
  • Operations performed by Data Security administrators.


Ransomware Detection

Cohesity also recently announced the ability to look within Helios for Ransomware. The approach taken is as follows: Prevent. Detect. Respond.

Prevent

There’s some good stuff built into the platform to help prevent ransomware in the first place, including:

  • Immutable file system
  • DataLock (WORM)
  • Multi-factor authentication

Detect

  • Machine-driven anomaly detection (backup data, unstructured data)
  • Automated alert

Respond

  • Scalable file system to store years’ worth of backup copies
  • Google-like global actionable search
  • Instant mass restore


Thoughts and Further Reading

The conversation with Cohesity got a little spirited in places at Storage Field Day 18. This isn’t unusual, as Cohesity has had some problems in the past with various folks not getting what they’re on about. Is it data protection? Is it scale-out NAS? Is it an analytics platform? There’s a lot going on here, and plenty of people (both inside and outside Cohesity) have had a chop at articulating the real value of the solution. I’m not here to tell you what it is or isn’t. I do know that a lot of the cool stuff with Cohesity wasn’t readily apparent to me until I actually had some stick time with the platform and had a chance to see some of its key features in action.

The DataLock / Security and Compliance piece is interesting to me though. I’m continually asking vendors what they’re doing in terms of archive platforms. A lot of them look at me like I’m high. Why wouldn’t you just use software to dump your old files up to the cloud or onto some cheap and deep storage in your data centre? After all, aren’t we all using software-defined data centres now? That’s certainly an option, but what happens when that data gets zapped? What if the storage platform you’re using, or the software you’re using to store the archive data, goes bad and deletes the data you’re managing with it? Features such as DataLock can help with protecting you from some really bad things happening.

I don’t believe that data protection data should be treated as an “archive” as such, although I think that data protection platform vendors such as Cohesity are well placed to deliver “archive-like” solutions for enterprises that need to retain protection data for long periods of time. I still think that pushing archive data to another, dedicated, tier is a better option than simply calling old protection data “archival”. Given Cohesity’s NAS capabilities, it makes sense that they’d be an attractive storage target for dedicated archive software solutions.

I like what Cohesity have delivered to date in terms of a platform that can be used to deliver data insights to derive value for the business. I think sometimes the message is a little muddled, but in my opinion some of that is because everyone’s looking for something different from these kinds of platforms. And these kinds of platforms can do an awful lot of things nowadays, thanks in part to some pretty smart software and some grunty hardware. You can read some more about Cohesity’s Security and Compliance story here,  and there’s a fascinating (if a little dated) report from Cohasset Associates on Cohesity’s compliance capabilities that you can access here. My good friend Keith Townsend also provided some thoughts on Cohesity that you can read here.