VMware Cloud on AWS – TMCHAM – Part 13 – Delete the SDDC

Following on from my article on host removal, in this edition of Things My Customers Have Asked Me (TMCHAM), I’m going to cover SDDC removal on the VMware-managed VMware Cloud on AWS platform. Don’t worry, I haven’t lost my mind in a post-acquisition world. Rather, this is some of the info you’ll find useful if you’ve been running a trial or a proof of concept (as opposed to a pilot) deployment of VMware Cloud Disaster Recovery (VCDR) and / or VMware Cloud on AWS and want to clean some stuff up when you’re all done.

 

Process

Firstly, if you’re using VCDR and want to deactivate the deployment, the steps to perform are outlined here, and I’ve copied the main bits from that page below.

  1. Remove all DRaaS Connectors from all protected sites. See Remove a DRaaS Connector from a Protected Site.
  2. Delete all recovery SDDCs. See Delete a Recovery SDDC.
  3. Deactivate the recovery region from the Global DR Console. (Do this step last.) See Deactivate a Recovery Region. Usage charges for VMware Cloud DR are not stopped until this step is completed.

Funnily enough, as I was writing this, someone zapped our lab for reasons. So this is what a Region deactivation looks like in the VCDR UI.

Note that it’s important you perform these steps in that order, or you’ll have more cleanup work to do to get everything looking nice and tidy. I have witnessed firsthand someone doing it the other way and it’s not pretty. Note also that if your Recovery SDDC had services such as HCX connected, you should hold off deleting the Recovery SDDC until you’ve cleaned that bit up.
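
If you like to keep yourself honest with a checklist, the ordering above can be sketched in a few lines of Python. This is a minimal sketch and not a VMware tool: the step labels and the `next_step` helper are names I've made up for illustration, and nothing here talks to the VCDR service.

```python
# Hypothetical checklist helper; the step names are illustrative labels only.
VCDR_TEARDOWN_ORDER = [
    "remove_draas_connectors",     # from all protected sites
    "delete_recovery_sddcs",       # after HCX and similar services are cleaned up
    "deactivate_recovery_region",  # always last; usage charges stop after this
]


def next_step(completed):
    """Return the next teardown step, or None when everything is done.

    Raises ValueError if the completed steps are not a prefix of the
    required order, i.e. something was done out of sequence.
    """
    completed = list(completed)
    if completed != VCDR_TEARDOWN_ORDER[: len(completed)]:
        raise ValueError(f"steps done out of order: {completed}")
    if len(completed) == len(VCDR_TEARDOWN_ORDER):
        return None
    return VCDR_TEARDOWN_ORDER[len(completed)]
```

Calling `next_step(["remove_draas_connectors"])` returns `"delete_recovery_sddcs"`, while passing a list that starts with the region deactivation raises an error.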

Secondly, if you have other workloads deployed in a VMware Cloud on AWS SDDC and want to remove a PoC SDDC, there are a few steps that you will need to follow.

If you’ve been using HCX to test migrations or network extension, you’ll need to follow these steps to remove it. Note that this should be initiated from the source side, and your HCX deployment should be in good order before you start (site pairings functioning, etc). You might also wish to remove a vCenter Cloud Gateway, and you can find information on that process here.

Finally, there are some AWS activities that you might want to undertake to clean everything up. These include:

  • Removing VIFs attached to your AWS VPC.
  • Deleting the VPC (this will likely be required if your organisation has a policy about how PoC deployments are managed).
  • Tidying up any on-premises routing and firewall rules that may have been put in place for the PoC activity.
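
For the AWS side, here is a rough sketch of how you might assemble the relevant AWS CLI commands for review before anyone runs them. It assumes the VIFs are Direct Connect virtual interfaces owned by your AWS account, and the IDs shown are placeholders rather than real resources. The helper name `aws_cleanup_commands` is my own; it only builds the command lines and does not execute anything.

```python
import shlex


def aws_cleanup_commands(vif_ids, vpc_id):
    """Build (but do not run) AWS CLI commands for the PoC cleanup."""
    commands = []
    # Remove the virtual interfaces first, then the VPC itself.
    for vif_id in vif_ids:
        commands.append([
            "aws", "directconnect", "delete-virtual-interface",
            "--virtual-interface-id", vif_id,
        ])
    commands.append(["aws", "ec2", "delete-vpc", "--vpc-id", vpc_id])
    return commands


# Placeholder IDs for illustration only.
for cmd in aws_cleanup_commands(["dxvif-EXAMPLE1"], "vpc-EXAMPLE"):
    print(shlex.join(cmd))
```

Keep in mind that `delete-vpc` will fail while the VPC still has dependent resources (subnets, gateways and so on), so treat this as a starting point for a runbook rather than a complete one.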

And that’s it. There’s not a lot to it, but tidying everything up after a PoC will ensure that you avoid any unexpected costs popping up in the future.

VMware Cloud on AWS – What’s New – February 2024

It’s been a little while since I posted an update on what’s new with VMware Cloud on AWS, so I thought I’d share some of the latest news.

 

M7i.metal-24xl Announced

It’s been a few months since it was announced at AWS re:Invent 2023, but the M7i.metal-24xl (one of the catchier host types I’ve seen) is going to change the way we approach storage-heavy VMC on AWS deployments.

What is it?

It’s a host without local storage. There are 48 physical cores (96 logical cores with Hyper-Threading enabled). It has 384 GiB memory. The key point is that there are flexible NFS storage options to choose from – VMware Cloud Flex Storage or Amazon FSx for NetApp ONTAP. There’s support for up to 37.5 Gbps networking speed, and it supports always-on memory encryption using Intel Total Memory Encryption (TME).
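
To put those per-host numbers in context, here's a quick back-of-the-envelope sketch of raw cluster capacity. The figures come straight from the specs above; the `cluster_capacity` function is just something I've written for illustration, and it ignores management appliance overhead and any capacity you'd hold back for host failures.

```python
# Per-host figures for the M7i.metal-24xl, as quoted above.
CORES_PER_HOST = 48
LOGICAL_CORES_PER_HOST = 96
MEMORY_GIB_PER_HOST = 384


def cluster_capacity(hosts):
    """Return raw (not usable) capacity for a cluster of `hosts` nodes."""
    return {
        "physical_cores": hosts * CORES_PER_HOST,
        "logical_cores": hosts * LOGICAL_CORES_PER_HOST,
        "memory_gib": hosts * MEMORY_GIB_PER_HOST,
        # Memory available per physical core, useful for VM sizing ratios.
        "memory_gib_per_core": MEMORY_GIB_PER_HOST / CORES_PER_HOST,
    }


print(cluster_capacity(4))
```

A four-host cluster works out to 192 physical cores and 1,536 GiB of memory, or 8 GiB per physical core.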

Why?

Some of the potential use cases for this kind of host type are as follows:

  • CPU Intensive workloads
    • Image processing
    • Video encoding
    • Gaming servers
  • AI/ML Workloads
    • Code Generation
    • Natural Language Processing
    • Classical Machine Learning
    • Workloads with limited resource requirements
  • Web and application servers
    • Microservices/Management services
    • Secondary data stores/database applications
  • Ransomware & Disaster Recovery
    • Modern Ransomware Recovery
    • Next-gen DR
    • Compliance and Risk Management

Other Notes

New (greenfield) customers can deploy the M7i.metal-24xl in the first cluster using 2-16 nodes. Existing (brownfield) customers can deploy the M7i.metal-24xl in secondary clusters in the same SDDC. In terms of connectivity, we recommend you take advantage of VPC peering for your external storage connectivity. Note that there is no support for multi-AZ deployments, nor is there support for single node deployments. If you’d like to know more about the M7i.metal-24xl, there’s an excellent technical overview here.
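
If you're sanity-checking a design, the deployment constraints above are simple enough to express as a pre-flight check. This is only a sketch: the function name is mine, and I'm assuming the 2-16 node range quoted for the first cluster is the range you want to validate against.

```python
def validate_m7i_cluster(hosts, multi_az=False):
    """Return a list of problems with a proposed M7i.metal-24xl cluster."""
    problems = []
    if hosts < 2:
        problems.append("single node deployments are not supported")
    if hosts > 16:
        problems.append("more than 16 nodes is outside the quoted 2-16 range")
    if multi_az:
        problems.append("multi-AZ deployments are not supported")
    return problems


print(validate_m7i_cluster(1, multi_az=True))
```

An empty list means the proposal passes these particular checks, and nothing more than that.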

 

vSAN Express Storage Architecture on VMware Cloud on AWS

SDDC Version 1.24 was announced in November 2023, and with that came support for vSAN Express Storage Architecture (ESA) on VMC on AWS. There’s some great info on what’s included in the 1.24 release here, but I thought I’d focus on some of the key constraints you need to look at when considering ESA in your VMC on AWS environment.

Currently, the following restrictions apply to vSAN ESA in VMware Cloud on AWS:

  • vSAN ESA is available for clusters using i4i hosts only.
  • vSAN ESA is not supported with stretched clusters.
  • vSAN ESA is not supported with 2-host clusters.
  • After you have deployed a cluster, you cannot convert from vSAN ESA to vSAN OSA or vice versa.

So why do it? There are plenty of reasons, including better performance, enhanced resource efficiency, and several improvements in terms of speed and resiliency. You can read more about it here.
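
Since you can't convert between ESA and OSA after deployment, it's worth checking the restrictions listed above before you build the cluster. Here's a minimal sketch of that check. The `esa_blockers` function and its parameters are hypothetical names of mine, and it reflects only the restrictions mentioned in this post at the time of writing.

```python
def esa_blockers(host_type, host_count, stretched):
    """Return the reasons a planned cluster cannot use vSAN ESA."""
    blockers = []
    # Accept "i4i", "i4i.metal" or "I4i.metal" as the same host family.
    if not host_type.lower().startswith("i4i"):
        blockers.append("vSAN ESA requires i4i hosts")
    if stretched:
        blockers.append("vSAN ESA is not supported with stretched clusters")
    if host_count == 2:
        blockers.append("vSAN ESA is not supported with 2-host clusters")
    return blockers


print(esa_blockers("i3en.metal", 2, stretched=True))
```

An empty result means none of the listed restrictions apply to the planned cluster.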

VMware Cloud Disaster Recovery Updates

There have also been some significant changes to VCDR, with the recent announcement that we now support a 15-minute Recovery Point Objective (down from 30 minutes). There have also been a number of enhancements to the ransomware recovery capability, including automatic Linux security sensor installation in the recovery workflow (trust me, once you’ve done it manually a few times you’ll appreciate this). With all the talk of supplemental storage above, it should be noted that “VMware Cloud DR does not support recovering VMs to VMware Cloud on AWS SDDC with NFS-mounted external datastores including Amazon FSx for NetApp datastores, Cloud Control Volumes or VMware Cloud Flex Storage”. Just in case you had an idea that this might be something you want to do.

 

Thoughts

Much of the news about VMware has been around the acquisition by Broadcom. It certainly was news. In the meantime, however, the VMware Cloud on AWS product and engineering teams have continued to work on releasing innovative features and incremental improvements. The encouraging thing about this is that they are listening to customers and continuing to adapt the solution architecture to satisfy those requirements. This is a good thing for both existing and potential customers. If you looked at VMware Cloud on AWS three years ago and ruled it out, I think it’s worth looking at again.

VMware Cloud Disaster Recovery – Using A Script VM

This is a quick post covering the steps required to configure a script VM for use in a recovery plan with VMware Cloud Disaster Recovery (VCDR). Why would you want to do this? You might be running a recovery for a Linux VM and you need to run a script to update the DNS settings of the VM once it’s powered on at another site. Or you might have a site-specific application that needs to be installed. Whatever. The point is that VCDR gives you the ability to do that via the Script VM. You can read the documentation on the feature here.

Firstly, you configure the Script VM as part of the Recovery Plan creation process. Specify the name of the VM and the vCenter it’s hosted on.

Under Recovery steps, click on Add Step to add a step to the recovery process.

When you add the step, you’ll want to add an action for the post-recovery phase.

You can then select “Run script on the Script VM”.

At this point you can specify the full path to the script file, keeping in mind that Windows looks different to Linux. You can also set a timeout for the script.
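
To illustrate the Windows versus Linux difference, here's a small sketch using Python's `pathlib` to check that a script location is a full (absolute) path on each platform. The paths themselves are made up for the example, and you should confirm the exact format VCDR expects against the documentation.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical script locations on a Linux and a Windows Script VM.
linux_script = PurePosixPath("/opt/scripts/update_dns.sh")
windows_script = PureWindowsPath(r"C:\Scripts\update_dns.ps1")


def is_full_path(path):
    """True when the path is absolute for its own platform's rules."""
    return path.is_absolute()


print(linux_script, is_full_path(linux_script))
print(windows_script, is_full_path(windows_script))
# A relative path fails the check on either platform.
print(is_full_path(PureWindowsPath(r"Scripts\update_dns.ps1")))
```

The pure path classes don't touch the filesystem, so this runs anywhere regardless of which operating system the Script VM uses.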

And that’s pretty much it. Remember that you’ll need working DNS, or, failing that, valid IP addresses for things to work.

VMware Cloud Disaster Recovery – Ransomware Recovery Activation

One of the cool features of VMware Cloud Disaster Recovery (VCDR) is the Enhanced Ransomware Recovery capability. This is a quick post to talk through how to turn it on in your VCDR environment, and things you need to consider.

 

Organization Settings

The first step is to enable the ransomware services integration in your VCDR dashboard. You’ll need to be an Organisation owner to do this. Go to Settings, and click on Ransomware Recovery Services.

You’ll then have the option to select where the data analysis is performed.

You’ll also need to tick some boxes to ensure that you understand that an appliance will be deployed in each of your Recovery SDDCs, Windows VMs will get a sensor installed, and some preinstalled sensors may clash with Carbon Black.

Click on Activate and it will take a few moments. If it takes much longer than that, you’ll need to talk to someone in support.

Once the analysis integration is activated, you can then activate NSX Advanced Firewall. Page 245 of the PDF documentation covers this better than I can, but note that NSX Advanced Firewall is a chargeable service (if you don’t already have a subscription attached to your Recovery SDDC). There’s some great documentation here on what you do and don’t have access to if you allow the activation of NSX Advanced Firewall.

Like your favourite TV chef would say, here’s one I’ve prepared earlier.

Recovery Plan Configuration

Once the services integration is done, you can configure Ransomware Recovery on a per Recovery Plan basis.

Start by selecting Activate ransomware recovery. You’ll then need to acknowledge that this is a chargeable feature.

You can also choose whether you want to use integrated analysis (i.e. Carbon Black Cloud), and whether you want to manually remove other security sensors when you recover. You can also choose to use your own tools if you need to.

And that’s it from a configuration perspective. The actual recovery bit? A story for another time.

VMware Cloud Disaster Recovery – Firewall Ports

I published an article a while ago on getting started with VMware Cloud Disaster Recovery (VCDR). One thing I didn’t cover in any real depth was the connectivity requirements between on-premises and the VCDR service. VMware has worked pretty hard to ensure this is streamlined for users, but it’s still something you need to pay attention to. I was helping a client work through this process for a proof of concept recently and thought I’d cover it off more clearly here. The diagram below highlights the main components you need to look at, being:

  • The Cloud File System (frequently referred to as the SCFS)
  • The VMware Cloud DR SaaS Orchestrator (the Orchestrator); and
  • VMware Cloud DR Auto-support.

It’s important to note that the first two services are assigned IP addresses when you enable the service in the Cloud Service Console, and the Auto-support service has three public IP addresses that you need to be able to communicate with. All of this happens outbound over TCP 443. The Auto-support service is not required, but it is strongly recommended, as it makes troubleshooting issues with the service much easier, and provides VMware with an opportunity to proactively resolve cases. Network connectivity requirements are documented here.

[image courtesy of VMware]

So how do I know my firewall rules are working? The first sign that there might be a problem is that the DRaaS Connector deployment will fail to communicate with the Orchestrator at some point (usually towards the end), and you’ll see a message similar to the following: “ERROR! VMware Cloud DR authentication is not configured. Contact support.”

How can you troubleshoot the issue? Fortunately, we have a tool called the DRaaS Connector Connectivity Check CLI that you can run to check what’s not working. In this instance, we suspected an issue with outbound communication, and ran the following command on the console of the DRaaS Connector to check:

drc network test --scope cloud

This returned a status of “reachable” for the Orchestrator and Auto-support services, but the SCFS was unreachable. Some negotiations with the firewall team, and we were up and running.
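
If you want a second opinion from another machine on the same network segment, a plain outbound TCP check is easy enough to script. This is a minimal sketch, not a replacement for the connectivity check CLI: the `tcp_reachable` function is my own, it only proves a TCP handshake completes from wherever you run it, and it says nothing about TLS or whether the service behind the port is healthy.

```python
import socket


def tcp_reachable(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and completes the handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

You would call it with the SCFS and Orchestrator addresses assigned to your deployment, for example `tcp_reachable("192.0.2.10")`, where that address is a documentation placeholder and not a real VCDR endpoint.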

Note, also, that VMware supports the use of proxy servers for communicating with Auto-support services, but I don’t believe we support the use of a proxy for Orchestrator and SCFS communications. If you’re worried about VCDR using up all your bandwidth, you can throttle it. Details on how to do that can be found here. We recommend a minimum of 100 Mbps, but you can go as low as 20 Mbps if required.
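
To get a feel for what throttling does to replication windows, here's a rough calculation. It's a sketch under generous assumptions: the link is fully utilised, there's no protocol overhead, and I'm ignoring any data reduction the service performs, so real numbers will differ. The 500 GiB figure is just an example.

```python
def transfer_hours(data_gib, link_mbps):
    """Hours needed to move data_gib over a link_mbps link, ideal case."""
    bits = data_gib * 1024**3 * 8           # GiB to bits
    seconds = bits / (link_mbps * 1_000_000)  # Mbps to bits per second
    return seconds / 3600


for mbps in (100, 20):
    print(f"{mbps} Mbps: {transfer_hours(500, mbps):.1f} hours")
```

For 500 GiB of changed data that comes to roughly 12 hours at 100 Mbps and about 60 hours at 20 Mbps, which is worth keeping in mind when you're thinking about snapshot frequency.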

VMware Cloud on AWS – Melbourne Region Added

VMware recently announced that VMware Cloud on AWS is now available in the AWS Asia-Pacific (Melbourne) Region. I thought I’d share some brief thoughts here along with a video I did with my colleague Satya.

 

What?

VMware Cloud on AWS is now available to consume in three Availability Zones (apse4-az1, apse4-az2, apse4-az3) in the Melbourne Region. From a host type perspective, you have the option to deploy either I3en.metal or I4i.metal hosts. There is also support for stretched clusters and PCI-DSS compliance if required. The full list of VMware Cloud on AWS Regions and Availability Zones is here.

 

Why Is This Interesting?

Since the launch of VMware Cloud on AWS, customers in Australia and New Zealand have only had one choice when it comes to a local Region – Sydney. This announcement gives organisations the ability to deploy architectures that can benefit from both increased availability and resiliency by leveraging multi-regional capabilities.

Availability

VMware Cloud on AWS already offers platform availability at a number of levels, including a choice of Availability Zones, Partition Placement groups, and support for stretched clusters across two Availability Zones. There’s also support for VMware High Availability, as well as support for automatically remediating failed hosts.

Resilience

In addition to the availability options customers can take advantage of, VMware Cloud on AWS also provides support for a number of resilience solutions, including VMware Cloud Disaster Recovery (VCDR) and VMware Site Recovery. Previously, customers in Australia and New Zealand were able to leverage these VMware (or third-party) solutions and deploy them across multiple Availability Zones. Invariably, it would look like the below diagram, with workloads hosted in one Availability Zone, and a second Availability Zone being used as the recovery location for those production workloads.

With the introduction of a second Region in A/NZ, customers can now look to deploy resilience solutions that are more like this diagram:

In this example, they can choose to run production workloads in the Melbourne Region and recover workloads into the Sydney Region if something goes pear-shaped. Note that VCDR is not currently available to deploy in the Melbourne Region, although it’s expected to be made available before the end of 2023.

 

Why Else Should I Care?

Data Sovereignty 

There are a variety of legal, regulatory, and administrative obligations governing the access, use, security and preservation of information within various government and commercial organisations in Victoria. These regulations are both national and state-based, and in the case of the Melbourne Region, provide organisations in Victoria the opportunity to store data in VMware Cloud on AWS that may not otherwise have been possible.

Data Locality

Not all applications and data reside in the same location. Many organisations have a mix of workloads residing on-premises and in the cloud. Some of these applications are latency-sensitive, and the launch of the Melbourne Region provides organisations with the ability to host applications closer to that data, as well as accessing native AWS services with improved responsiveness over applications hosted in the Sydney Region.

 

How?

If you’re an existing VMware Cloud on AWS customer, head over to https://cloud.vmware.com. Log in to the Cloud Services Console. Click on the VMware Cloud on AWS tile. Click on Inventory. Then click on Create SDDC.

 

Thoughts

Some of the folks in the US and Europe are probably wondering why on earth this is such a big deal for the Australian and New Zealand market. And plenty of folks in this part of the world are probably not that interested either. Not every organisation is going to benefit from or look to take advantage of the Melbourne Region. Many of them will continue to deploy workloads into one or two of the Sydney-based Availability Zones, with DR in another Availability Zone, and not need to do any more. But for those organisations looking for resiliency across geographical regions, this is a great opportunity to really do some interesting stuff from a disaster recovery perspective. And while it seems delightfully antiquated to think that, in this global world we live in, some information can’t cross state lines, there are plenty of organisations in Victoria facing just that issue, and looking at ways to store that data in a sensible fashion close to home. Finally, we talk a lot about data having gravity, and this provides many organisations in Victoria with the ability to run workloads closer to that centre of data gravity.

If you’d like to hear me talking about this with my learned colleague Satya, you can check out the video here. Thanks to Satya for prompting me to do the recording, and for putting it all together. We’re aiming to do this more regularly on a variety of VMware-related topics, so keep an eye out.

Updated Articles Page

I recently had the opportunity to run through a VMware Cloud Disaster Recovery deployment with a customer and thought I’d run through the basics. It’s important to note that there are a variety of topologies supported with VCDR, and many things that need to be considered before you click deploy, and this is just one way of doing it. In any case, there’s a new document outlining the process on the articles page.