VMware Cloud Disaster Recovery – Firewall Ports

I published an article a while ago on getting started with VMware Cloud Disaster Recovery (VCDR). One thing I didn’t cover in any real depth was the connectivity requirements between on-premises and the VCDR service. VMware has worked pretty hard to ensure this is streamlined for users, but it’s still something you need to pay attention to. I was helping a client work through this process for a proof of concept recently and thought I’d cover it off more clearly here. The diagram below highlights the main components you need to look at, being:

  • The Cloud File System (frequently referred to as the SCFS)
  • The VMware Cloud DR SaaS Orchestrator (the Orchestrator); and
  • VMware Cloud DR Auto-support.

It’s important to note that the first two services are assigned IP addresses when you enable the service in the Cloud Service Console, and the Auto-support service has three public IP addresses that you need to be able to communicate with. All of this happens outbound over TCP 443. The Auto-support service is not required, but it is strongly recommended, as it makes troubleshooting issues with the service much easier, and provides VMware with an opportunity to proactively resolve cases. Network connectivity requirements are documented here.

[image courtesy of VMware]

So how do I know my firewall rules are working? The first sign that there might be a problem is that the DRaaS Connector deployment will fail to communicate with the Orchestrator at some point (usually towards the end), and you’ll see a message similar to the following. “ERROR! VMware Cloud DR authentication is not configured. Contact support.”

How can you troubleshoot the issue? Fortunately, we have a tool called the DRaaS Connector Connectivity Check CLI that you can run to check what’s not working. In this instance, we suspected an issue with outbound communication, and ran the following command on the console of the DRaaS Connector to check:

drc network test --scope cloud

This returned a status of “reachable” for the Orchestrator and Auto-support services, but the SCFS was unreachable. Some negotiations with the firewall team, and we were up and running.

Note, also, that VMware supports the use of proxy servers for communicating with Auto-support services, but I don’t believe we support the use of a proxy for Orchestrator and SCFS communications. If you’re worried about VCDR using up all your bandwidth, you can throttle it. Details on how to do that can be found here. We recommend a minimum of 100Mbps, but you can go as low as 20Mbps if required.

Using A Pure Storage FlashBlade As A Veeam Repository

I’ve been doing some testing in the lab recently. The focus of this testing has been primarily on Pure Storage’s ObjectEngine and its associated infrastructure. As part of that, I’ve been doing various things with Veeam Backup & Replication 9.5 Update 4, including setting up a FlashBlade NFS repository. I’ve documented the process in a document here. One thing that I thought worthy of noting separately was the firewall requirements. For my Linux Mount Server, I used a CentOS 7 VM, configured with 8 vCPUs and 16GB of RAM. I know, I normally use Debian, but for some reason (that I didn’t have time to investigate) it kept dying every time I kicked off a backup job.

In any case, I set everything up as per Pure’s instructions, but kept getting timeout errors on the job. The error I got was “5/17/2019 10:03:47 AM :: Processing HOST-01 Error: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond NFSMOUNTHOST:2500“. It felt like it was probably a firewall issue of some sort. I tried to make an exception on the Windows VM hosting the Veeam Backup server, but that didn’t help. The problem was with the Linux VM’s firewall. I used the instructions I found here to add in some custom rules. According to the Veeam documentation, Backup Repository access uses TCP ports 2500 – 5000. Your SecOps people will no doubt have a conniption, but here’s how to open those ports on CentOS.

Firstly, is the firewall running?

[danf@nfsmounthost ~]$ sudo firewall-cmd --state
[sudo] password for danf:

Yes it is. So let’s stop it to see if this line of troubleshooting is worth pursuing.

[danf@nfsmounthost ~]$ sudo systemctl stop firewalld

The backup job worked after that. Okay, so let’s start it up again and open up some ports to test.

[danf@nfsmounthost ~]$ sudo systemctl start firewalld
[danf@nfsmounthost ~]$ sudo firewall-cmd --add-port=2500-5000/tcp

That worked, so I wanted to make it a more permanent arrangement.

[danf@nfsmounthost ~]$ sudo firewall-cmd --permanent --add-port=2500-5000/tcp
[danf@nfsmounthost ~]$ sudo firewall-cmd --permanent --list-ports

Remember, it’s never the storage. It’s always the firewall. Also, keep in my mind this article is about the how. I’m not offering my opinion about whether it’s really a good idea to configure your host-based firewalls with more holes than Swiss cheese. Or whatever things have lots of holes in them.