Cohesity – NAS Data Migration Overview

Data Migration

Cohesity NAS Data Migration, part of SmartFiles, was recently announced as a generally available feature in the Cohesity DataPlatform 6.4 release (after being mentioned in the 6.3 release blog post). The idea is that you can use the feature to migrate NAS data from a primary source to the Cohesity DataPlatform. It’s supported for NAS storage registered as SMB or NFS (so it doesn’t necessarily need to be a NAS appliance as such; it can also be a file share hosted somewhere).


What To Think About

There are a few things to think about when you configure your migration policy, including:

  • The last time the file was accessed;
  • The last time the file was modified; and
  • The size of the file.

You also need to think about how frequently you want to run the job. Finally, it’s worth considering which View you want the archived data to reside on.
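If you want a rough feel for how much data a given set of criteria would pick up before you create the job, you can poke at the share with PowerShell first. Here’s a minimal sketch, assuming a hypothetical \\nas01\share path and that your filer actually maintains last-access timestamps (plenty don’t):

# Rough sizing: files not accessed in 90 days and larger than 1MB.
$candidates = Get-ChildItem -Path \\nas01\share -Recurse -File |
    Where-Object { $_.LastAccessTime -lt (Get-Date).AddDays(-90) -and $_.Length -gt 1MB }

# Report how much data the policy would likely pick up.
"{0:N2} GB across {1} files" -f (($candidates | Measure-Object -Property Length -Sum).Sum / 1GB), $candidates.Count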


What Happens?

When the data is migrated, an SMB2 symbolic link with the same name as the file is left in its place, and the original data is moved to the Cohesity View. Note that remote-to-remote symbolic link evaluation is disabled by default on Windows boxes, so you’ll need to run these commands:

C:\Windows\system32>fsutil behavior set SymlinkEvaluation R2R:1
C:\Windows\system32>fsutil behavior query SymlinkEvaluation

Once the data is migrated to the Cohesity cluster, subsequent read and write operations are performed on the Cohesity host. You can move data back to the environment by mounting the Cohesity target View on a Windows client, and copying it back to the NAS.
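If you want to see what’s been stubbed out after a run, the symbolic links are reparse points, so they’re easy to pick out with PowerShell (again, the \\nas01\share path is hypothetical):

# List the symbolic links left behind by the migration job.
Get-ChildItem -Path \\nas01\share -Recurse -File -Attributes ReparsePoint |
    Select-Object FullName, Length, LastWriteTime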


Configuration Steps

To get started, select File Services, and click on Data Migration.

Click on Migrate Data to configure a migration job.

You’ll need to give it a name.


The next step is to select the Source. If you already have a NAS source configured, you’ll see it here. Otherwise you can register a Source.

Click on the arrow to expand the registered NAS mount points.

Select the mount point you’d like to use.

Once you’ve selected the mount point, click on Add.

You then need to select the Storage Domain (formerly known as a ViewBox) to store the archived data on.

You’ll need to provide a name, and configure schedule options.

You can also configure advanced settings, including QoS and exclusions. Once you’re happy, click on Migrate and the job will be created.

You can then run the job immediately, or wait for the schedule to kick in.


Other Things To Consider

You’ll need to think about your anti-virus options as well. You can register external anti-virus software or install the anti-virus app from the Cohesity Marketplace.


Thoughts And Further Reading

Cohesity have long positioned their secondary storage offering as something more than just a backup and recovery solution. There’s some debate about the difference between storage management and data management, but Cohesity seem to have done a good job of introducing yet another feature that can help users easily move data from their primary storage to their secondary storage environment. Plenty of backup solutions have positioned themselves as archive solutions, but many have been focused on moving protection data, rather than primary data from the source. You’ll need to do some careful planning around sizing your environment, as there’s always a chance that an end user will turn up and start accessing files that you thought were stale. And I can’t say with 100% certainty that this solution will transparently work with every line of business application in your environment. But considering it’s aimed at SMB and NFS shares, it looks like it does what it says on the tin, and moves data from one spot to another.

You can read more about the new features in Cohesity DataPlatform 6.4 (Pegasus) on the Cohesity site, and Blocks & Files covered the feature here. Alastair also shared some thoughts on the feature here.

Random Short Take #24

Want some news? In a shorter format? And a little bit random? This listicle might be for you. Welcome to #24 – The Kobe Edition (not a lot of passing, but still entertaining). 8 articles too. Which one was your favourite Kobe? 8 or 24?

  • I wrote an article about how architecture matters years ago. It’s nothing to do with this one from Preston, but he makes some great points about the importance of architecture when looking to protect your public cloud workloads.
  • Commvault GO 2019 was held recently, and Chin-Fah had some thoughts on where Commvault’s at. You can read all about that here. Speaking of Commvault, Keith had some thoughts as well, and you can check them out here.
  • Still on data protection, Alastair posted this article a little while ago about using the Cohesity API for reporting.
  • Cade just posted a great article on using the right transport mode in Veeam Backup & Replication. Goes to show he’s not just a pretty face.
  • VMware vFORUM is coming up in November. I’ll be making the trip down to Sydney to help out with some VMUG stuff. You can find out more here, and register here.
  • Speaking of VMUG, Angelo put together a great 7-part series on VMUG chapter leadership and tips for running successful meetings. You can read part 7 here.
  • This is a great article on managing Rubrik users from the CLI from Frederic Lhoest.
  • Are you into Splunk? And Pure Storage? Vaughn has you covered with an overview of Splunk SmartStore on Pure Storage here.

VMware – VMworld 2019 – HBI2537PU – Cloud Provider CXO Panel with Cohesity, Cloudian and PhoenixNAP

Disclaimer: I recently attended VMworld 2019 – US.  My flights and accommodation were paid for by Digital Sense, and VMware provided me with a free pass to the conference and various bits of swag. There is no requirement for me to blog about any of the content presented and I am not compensated by VMware for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Here are my rough notes from “HBI2537PU – Cloud Provider CXO Panel with Cohesity, Cloudian and PhoenixNAP”, a panel-type presentation with the following people:

You can grab a PDF copy of my notes from here.

Introductions are done.

YR: William, given your breadth of experience, what are some of the emerging trends you’ve been seeing?

WB: Companies are struggling to keep up with the pace of information generation. Understanding the data, storing and retaining it, and protecting it. Multi-cloud adds a lot of complexity. We’ve heard studies that say 22% of data generated is actually usable. It’s just sitting there. Public cloud is still hot, but it’s settling down a little.

YR: William comes from a massive cloud provider. What are you guys using?

WB: We’ve standardised on vCloud Director (vCD) and vSphere. We came from building our own, but it wasn’t providing the value that we hoped it would. Customers want a seamless way to manage multiple cloud resources.

YR: Are you guys familiar with VCPP?

AP: VCPP is the crown jewel of our partner program at VMware. 4000+ providers, 120+ countries, 10+ million VMs, 10000+ DCs. We help you save money, make money (things are services ready). We’re continuing to invest in vCD. Kubernetes, GPUs, etc. Lots of R&D.

YR: William, you mentioned you standardised on the VMware platform. Talk to us about your experience. Why vCD?

WB: It’s been a checkered past for vCD. We were one of the first five on the vCloud Express program in 2010 / 11. We didn’t like vCD in its 1.0 version. We thought we could do this better. And we did. We launched the first on-demand, pay by the hour public cloud for enterprise in 2011. But it didn’t really work out. In 2012 / 13 we started to see investments being made in vCD. 5.0 / 5.5 improved. Many people thought vCD was going to die. We now see a modern, flexible portal that can be customised. And we can take our devs and have them customise vCD, rather than build a customised portal. That’s where we can put our time and effort. We’ve always done things differently. Always been doing other things. How do we bring our work in visual cloud into that cloud provider portal with vCD?

YR: You have an extensive career at VMware.

RR: I was one of the first people to take vCD out to the world. But the enterprise wasn’t mature enough. When we focused on SPs, it was the right thing to do. DIY portals need a lot of investment. VMware allows a lot of extensibility now. For us, as Cohesity, we want to be able to plug in to that as well.

WB: At one point we had 45 devs working on a proprietary portal.

YR: We’ve been doing a lot on the extensibility side. What role are services playing in cloud providers?

AP: It takes away the complexities of deploying the stack.

JT: We’re specifically in object. A third of our customers are service providers. You guys know that object is built for scale, easy to manage, cost-effective. 20% of the data gets used. We hear that customers want to improve on that. People are moving away from tape. There’s a tremendous opportunity for services built on storage. Amazon has shown that. Data protection like Cohesity. Big data with Splunk. You can offer an industry standard, but differentiate based on other services.

YR: As we move towards a services-oriented world, William how do you see cloud management services evolving?

WB: It’s not good enough to provide some compute infrastructure any more. You have to do something more. We’re stubbornly focussed on different types of IaaS. We’re not doing generic x86 on top of vSphere. Backup, DR – those are in our wheelhouse. From a platform perspective, more and more customers want some kind of single pane of glass across their data. For some that’s on-premises, for some it’s public, for some it’s SaaS. You have to be able to provide value to the customer, or they will disappear. Object storage, backup with Cohesity. You need to keep pace with data movement. Any cloud, any data, anywhere.

AP: I’ve been at VMware long enough not to drink the Kool-Aid. Our whole cloud provider business is rooted in some humility. vCD can help other people doing better things to integrate. vCD has always been about reducing OPEX. Now we’re hitting the top line. Any cloud management platform today needs to be open and extensible, and not try to do everything itself.

YR: Is the crowd seeing pressure on pure IaaS?

Commentator: Coming from an SP to enterprise is different. Economics. Are you able to do a showback with vCD 9 and vROps?

WB: We’re putting that in the hands of customers. Looking at CloudHealth. There’s a benefit to being in the business management space. You have the opportunity to give customers a better service. That, and more flexible business models. Moving into flexible billing models gives more freedom to the enterprise customer. Unless you’re the largest of the large, enterprises have difficulty acting as a service provider. Citibank are an exception to this. Honeywell do it too. If you’re Discount Tire – it’s hard. You’re the guy providing the service, and you’re costing them money. There’s animosity – and there’s no choice.

Commentator: Other people have pushed to public because chargeback is more effective than internal showback with private cloud.

WB: IT departments are poorly equipped to offer a breadth of services to their customers.

JT: People are moving workloads around. They want choice and flexibility. VMware with S3 compatible storage. A common underlying layer.

YR: Economics, chargeback. Is VMware (and VCPP) doing enough?

WB: The two guys to my right (RR and JT) have committed to building products that let me do that. I’ve been working on object storage use cases. I was talking to a customer. They’re using our IaaS and connected to Amazon S3. You’ve gone to Amazon. They didn’t know about it though. Experience and cost that can be the same or better. Egress in Amazon S3 is ridiculous. You don’t know what you don’t know. You can take that service and deliver it cost-effectively.

YR: RR talk to us about the evolution of data protection.

RR: Information has grown. Data is fragmented. Information placement is almost unmanageable. Services have now become available in a way that can be audited, secured, managed. At Cohesity, first thing we did was data protection, and I knew the rest was coming. Complexity’s a problem.

YR: JT. We know Cloudian’s a leader in object storage. Where do you see object going?

JT: It’s the underlying storage layer of the cloud. Brings down cost of your storage layer. It’s all about TCO. What’s going to help you build more revenue streams? Cloudian has been around since 2011. New solutions in backup, DR, etc, to help you build new revenue streams. S3 users on Amazon are looking for alternatives. Many of Cloudian’s customers are ex-Amazon customers. What are we doing? vCD integration. Search Cloudian and vCD on YouTube. Continuously working to drive down the cost of managing storage. 1.5PB in a 4RU box in collaboration with Seagate.

WB: Expanding service delivery, specifically around object storage, is important. You can do some really cool stuff – not just backup, it’s M&E, it’s analytics. Very few of our customers are using object just to store files and folders.

YR: We have a lot of providers in the room. JT can you talk more about these key use cases?

JT: It runs the gamut. You can break it down by verticals. M&E companies are offering editing suites via service providers. People are doing that for the legal profession. Accounting – storing financial records. Dental records and health care. The back end is the same thing – compute with S3 storage behind it. Cloudian provides multi-tenanted, scalable performance. Cost is driven down as you get larger.

YR: RR your key use cases?

RR: DRaaS is hot right now. When I was at VMware we did stuff with SRM. DR is hard. It’s so simple now. Now every SP can do it themselves. Use S3 to move data around from the same interface. And it’s very needed too. Everyone should have ubiquitous access to their data. We have that capability. We can now do vulnerability scans on the data we store on the platform. We can tell you if a VM is compromised. You can orchestrate the restoration of an environment – as a service.

YR: WB what are the other services you want us to deliver?

WB: We’re an odd duck. One of our major practices is information security. The idea that we have intelligent access to data residing in our infrastructure. Being able to detect vulnerabilities, taking action, sending an email to the customer, that’s the type of thing that cloud providers have. You might not be doing it yet – but you could.

YR: Security, threat protection. RR – do you see Cohesity as the driver to solve that problem?

RR: Cohesity will provide the platform. Data is insecure because it’s fragmented. Cohesity lets you run applications on the platform. Virus scanners, run books, all kinds of stuff you can offer as a service provider.

YR: William, where does the onus lie, how do you see it fitting together?

WB: The key for us is being open. E.g., Cohesity integration into vCD. If I don’t want to – I don’t have to. Freedom of choice to pick and choose where we want to deliver our own IP to the customer. I don’t have to use Cohesity for everything.

JT: That’s exactly what we’re into. Choice of hardware, management. That’s the point. Standards-based top end.

YR: Security

*They had 2 minutes to go but I ran out of time and had to get to another meeting. Informative session. 4 stars.

Random Short Take #18

Here are some links to some random news items and other content that I recently found interesting. You might find them interesting too. Episode 18 – buckle up kids! It’s all happening.

  • Cohesity added support for Active Directory protection with version 6.3 of the DataPlatform. Matt covered it pretty comprehensively here.
  • Speaking of Cohesity, Alastair wrote this article on getting started with the Cohesity PowerShell Module.
  • In keeping with the data protection theme (hey, it’s what I’m into), here’s a great article from W. Curtis Preston on SaaS data protection, and what you need to consider to not become another cautionary tale on the Internet. Curtis has written a lot about data protection over the years, and you could do a lot worse than reading what he has to say. And that’s not just because he signed a book for me.
  • Did you ever stop and think just how insecure some of the things that you put your money into are? It’s a little scary. Shell are doing some stuff with Cybera to improve things. Read more about that here.
  • I used to work with Vincent, and he’s a super smart guy. I’ve been at him for years to start blogging, and he’s started to put out some articles. He’s very good at taking complex topics and distilling them down to something that’s easy to understand. Here’s his summary of VMware vRealize Automation configuration.
  • Tom’s take on some recent CloudFlare outages makes for good reading.
  • Google Cloud has announced it’s acquiring Elastifile. That part of the business doesn’t seem to be as brutal as the broader Alphabet group when it comes to acquiring and discarding companies, and I’m hoping that the good folks at Elastifile are looked after. You can read more on that here.
  • A lot of people are getting upset with terms like “disaggregated HCI”. Chris Mellor does a bang up job explaining the differences between the various architectures here. It’s my belief that there’s a place for all of this, and assuming that one architecture will suit every situation is a little naive. But what do I know?

Random Short Take #17

Here are some links to some random news items and other content that I recently found interesting. You might find them interesting too. Episode 17 – am I over-sharing? There’s so much I want you to know about.

  • I seem to always be including a link from the Backblaze blog. That’s mainly because they write about things I’m interested in. In this case, they’ve posted an article discussing the differences between availability and durability that I think is worth your time.
  • Speaking of interesting topics, Preston posted an article on NetWorker Pools with Data Domain that’s worth looking at if you’re into that kind of thing.
  • Maintaining the data protection theme, Alastair wrote an interesting article titled “The Best Automation Is One You Don’t Write” (you know, like the best IO is one you don’t need to do?) as part of his work with Cohesity. It’s a good article, and not just because he mentions my name in it.
  • I recently wanted to change the edition of Microsoft Office I was using on my MacBook Pro and couldn’t really work out how to do it. In the end, the answer is simple. Download a Microsoft utility to remove your Office licenses, and then fire up an Office product and it will prompt you to re-enter your information at that point.
  • This is an old article, but it answered my question about validating MD5 checksums on macOS.
  • Excelero have been doing some cool stuff with Imperial College London – you can read more about that here.
  • Oh hey, Flixster Video is closing down. I received this in my inbox recently: “[f]ollowing the announcement by UltraViolet that it will be discontinuing its service on July 31, 2019, we are writing to provide you notice that Flixster Video is planning to shut down its website, applications and operations on October 31, 2019”. It makes sense, obviously, given UltraViolet’s demise, but it still drives me nuts. The ephemeral nature of digital media is why I still have a house full of various sized discs with various kinds of media stored on them. I think the answer is to give yourself over to the streaming lifestyle, and understand that you’ll never “own” media like you used to think you did. But I can’t help but feel like people outside of the US are getting shafted in that scenario.
  • In keeping up with the “random” theme of these posts, it was only last week that I learned that “Television, the Drug of the Nation” from the very excellent album “Hypocrisy Is the Greatest Luxury” by The Disposable Heroes of Hiphoprisy was originally released by Michael Franti and Rono Tse when they were members of The Beatnigs. If you’re unfamiliar with any of this I recommend you check them out.

Cohesity Basics – Configuring An External Target For Cloud Archive

I’ve been working in the lab with Pure Storage’s ObjectEngine and thought it might be nice to document the process to set it up as an external target for use with Cohesity’s Cloud Archive capability. I’ve written in the past about Cloud Tier and Cloud Archive, but in that article I focused more on the Cloud Tier capability. I don’t want to sound too pretentious, but I’ll quote myself from the other article: “With Cloud Archive you can send copies of snapshots up to the cloud to keep as a copy separate to the backup data you might have replicated to a secondary appliance. This is useful if you have some requirement to keep a monthly or six-monthly copy somewhere for compliance reasons.”

I would like to be clear that this process hasn’t been blessed or vetted by Pure Storage or Cohesity. I imagine they are working on delivering a validated solution at some stage, as they have with Veeam and Commvault. So don’t go out and jam this in production and complain to me when Pure or Cohesity tell you it’s wrong.

There are a couple of ways you can configure an external target via the Cohesity UI. In this example, I’ll do it from the dashboard, rather than during the protection job configuration. Click on Protection and select External Target.

You’ll then be presented with the New Target configuration dialogue.

In this example, I’m calling my external target PureOE, and setting its purpose as Archival (as opposed to Tiering).

The Type of target is “S3 Compatible”.

Once you select that, you’ll be asked for a bunch of S3-type information, including Bucket Name and Access Key ID. This assumes you’ve already created the bucket and configured appropriate security on the ObjectEngine side of things.

Enter the required information. I’ve de-selected compression and source-side deduplication, as I want the data reduction to be done by the ObjectEngine. I’ve also disabled encryption, as I’m guessing this will have an impact on the ObjectEngine as well. I need to confirm that with my friends at Pure. I’m using the fully qualified domain name of the ObjectEngine as the endpoint here as well.

Once you click on Register, you’ll be presented with a summary of the configuration.

You’re then right to use this as an external target for Archival parts of protection jobs within your Cohesity environment. Once you’ve run a few protection jobs, you should start to see files within the test bucket on the ObjectEngine. Don’t forget that, as far as I’m aware, it’s still very difficult (impossible?) to remove external targets from the Cohesity DataPlatform, so don’t get too carried away with configuring a bunch of different test targets thinking that you can remove them later.
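If you want to check that the target is reachable, or just have a look at what the protection jobs are writing, you can point any S3 client at the ObjectEngine. Here’s a minimal sketch using the AWS Tools for PowerShell, with made-up endpoint, bucket, and key values:

# Point the standard S3 cmdlets at the ObjectEngine's S3-compatible endpoint.
Import-Module AWS.Tools.S3
Set-AWSCredential -AccessKey 'YOUR-ACCESS-KEY-ID' -SecretKey 'YOUR-SECRET-KEY'

# List the first 20 objects Cohesity has written to the bucket.
Get-S3Object -BucketName 'cohesity-archive' -EndpointUrl 'https://objectengine.example.com' |
    Select-Object Key, Size, LastModified -First 20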

Random Short Take #16

Here are a few links to some random news items and other content that I recently found interesting. You might find them interesting too. Episode 16 – please enjoy these semi-irregular updates.

  • Scale Computing has been doing a bit in the healthcare sector lately – you can read news about that here.
  • This was a nice roundup of the news from Apple’s recent WWDC from Six Colors. Hat tip to Stephen Foskett for the link. Speaking of WWDC news, you may have been wondering what happened to all of your purchased content with the imminent demise of iTunes on macOS. It’s still a little fuzzy, but this article attempts to shed some light on things. Spoiler: you should be okay (for the moment).
  • There’s a great post on the Dropbox Tech Blog from James Cowling discussing the mission versus the system.
  • The more things change, the more they remain the same. For years I had a Windows PC running Media Center and recording TV. I used IceTV as the XMLTV-based program guide provider. I then started to mess about with some HDHomeRun devices and the PC died and I went back to a traditional DVR arrangement. Plex now has DVR capabilities and it has been doing a reasonable job with guide data (and recording in general), but they’ve decided it’s all a bit too hard to curate guides and want users (at least in Australia) to use XMLTV-based guides instead. So I’m back to using IceTV with Plex. They’re offering a free trial at the moment for Plex users, and setup instructions are here. No, I don’t get paid if you click on the links.
  • Speaking of axe-throwing, the Cohesity team in Queensland is organising a social event for Friday 21st June from 2 – 4 pm at Maniax Axe Throwing in Newstead. You can get in contact with Casey if you’d like to register.
  • VeeamON Forum Australia is coming up soon. It will be held at the Hyatt Regency Hotel in Sydney on July 24th and should be a great event. You can find out more information and register for it here. The Vanguards are also planning something cool, so hopefully we’ll see you there.
  • Speaking of Veeam, Anthony Spiteri recently published his longest title in the Virtualization is Life! catalogue – Orchestration Of NSX By Terraform For Cloud Connect Replication With vCloud Director. It’s a great article, and worth checking out.
  • There’s a lot of talk and slideware devoted to digital transformation, and a lot of it is rubbish. But I found this article from Chin-Fah to be particularly insightful.

Cohesity Basics – Excluding VMs Using Tags – Real World Example

I’ve written before about using VM tags with Cohesity to exclude VMs from a backup. I wanted to write up a quick article using a real world example in the test lab. In this instance, we had someone deploying 200 VMs over a weekend to test a vendor’s storage array with a particular workload. The problem was that I had Cohesity set to automatically protect any new VMs that are deployed in the lab. This wasn’t a problem from a scalability perspective. Rather, the problem was that we were backing up a bunch of test data that didn’t dedupe well and didn’t need to be protected by what are ultimately finite resources.

As I pointed out in the other article, creating tags for VMs and using them as a way to exclude workloads from Cohesity is not a new concept, and is fairly easy to do. You can also apply the tags in bulk using the vSphere Web Client if you need to. But a quicker way to do it (and something that can be done post-deployment) is to use PowerCLI to search for VMs with a particular naming convention and apply the tags to those.

Firstly, you’ll need to log in to your vCenter.

PowerCLI C:\> Connect-VIServer vCenter

In this example, the test VMs are deployed with the prefix “PSV”, so this makes it easy enough to search for them.

PowerCLI C:\> get-vm | where {$_.name -like "PSV*"} | New-TagAssignment -Tag "COH-NoBackup"

This assumes that the tag already exists on the vCenter side of things, and that you have sufficient permissions to apply tags to VMs.
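If the tag doesn’t exist yet, you can create it (and a category for it) first. The category name here is just an example:

PowerCLI C:\> New-TagCategory -Name "Backup" -Cardinality Multiple -EntityType VirtualMachine
PowerCLI C:\> New-Tag -Name "COH-NoBackup" -Category "Backup"

You can then check your work with the following command.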

PowerCLI C:\> get-vm | where {$_.name -like "PSV*"} | Get-TagAssignment

One thing to note. If you’ve updated the tags of a bunch of VMs in your vCenter environment, you may notice that the objects aren’t immediately excluded from the Protection Job on the Cohesity side of things. The reason for this is that, by default, Cohesity only refreshes vCenter source data every 4 hours. One way to force the update is to manually refresh the source vCenter in Cohesity. To do this, go to Protection -> Sources. Click on the ellipsis on the right-hand side of the vCenter source you’d like to refresh, and select Refresh.
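If you’d rather script that refresh than click through the UI, you can hit the Cohesity REST API instead. This is just a sketch based on my reading of the v1 public API, so treat the paths as assumptions and check them against your cluster’s API documentation; the cluster address, credentials, and source ID below are all placeholders.

# Get an access token from the cluster (v1 public API).
$cluster = "https://cohesity.example.com"
$creds = @{ username = "admin"; password = "password"; domain = "LOCAL" } | ConvertTo-Json
$token = (Invoke-RestMethod -Method Post -Uri "$cluster/irisservices/api/v1/public/accessTokens" -Body $creds -ContentType "application/json").accessToken

# Trigger a refresh of the registered vCenter source (source ID 42 is made up).
# On PowerShell 7 you may need -SkipCertificateCheck for self-signed certificates.
$headers = @{ Authorization = "Bearer $token" }
Invoke-RestMethod -Method Post -Uri "$cluster/irisservices/api/v1/public/protectionSources/refresh/42" -Headers $headers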

You’ll then see that the tagged VMs are excluded in the Protection Job. Hat tip to my colleague Mike for his help with PowerCLI. And hat tip to my other colleague Mike for causing the problem in the first place.

Random Short Take #15

Here are a few links to some random news items and other content that I recently found interesting. You might find them interesting too. Episode 15 – it could become a regular thing. Maybe every other week? Fortnightly even.

Random Short Take #14

Here are a few links to some random news items and other content that I found interesting. You might find them interesting too. Episode 14 – giddy-up!