In this edition of Things My Customers Have Asked Me (TMCHAM), I’m going to cover Managed Storage Policy Profiles (MSPPs) on the VMware-managed VMware Cloud on AWS platform.
Background
VMware Cloud on AWS has MSPPs deployed on clusters to ensure that customers have sufficient resilience built into the cluster to withstand disk or node failures. By default, clusters are configured with RAID 1, Failures to Tolerate (FTT):1 for 2 – 5 nodes, and RAID 6, FTT:2 for clusters with 6 or more nodes. Note that single-node clusters have no Service Level Agreement (SLA) attached to them, as you generally only run those on a trial basis, and if the node fails, there’s nowhere for the data to go. You can read more about vSAN Storage Polices and MSPPs here, and there’s a great Tech Zone article here. The point of these policies is that they are designed to ensure your cluster(s) remain in compliance with the SLAs for the platform. You can view the policies in your environment by going to Policies and Profiles in vCenter and selecting VM Storage Policies.
Can I Change Them?
The MSPPs are maintained by VMware, and so it’s not a great idea to change the default policies on your cluster, as the system will change them back at some stage. And why would you want to change the policies on your cluster? Well, you might decide that 4 or 5 nodes could actually run better (from a capacity perspective) using RAID 5, rather than RAID 1. This is a reasonable thing to want to do, and as the SLA talks about FTT numbers, not RAID types, you can change the RAID type and remain in compliance. And the capacity difference can be material in some cases, particularly if you’re struggling to fit your workloads onto a smaller node count.
So How Do I Do It Then?
Clone The Policy
There are a few ways to approach this, but the simplest is by cloning an existing policy. In this example, I’ll clone the vSAN Default Storage Policy. In the VMware Cloud on AWS, there is an MSPP assigned to each cluster with the name “VMC Workload Storage Policy – ClusterName“. Select the policy you want to clone and then click on Clone.
The first step is to give the VM Storage Policy a name. Something cool with your initials should do the trick.
You can edit the policy structure at this point, or just click Next.
Here you can configure your Availability options. You can also do other things, like configure Tags and Advanced Policy Rules.
Once this is configured, the system will check that your vSAN datastore are compatible with your policy.
And then you’re ready to go. Click Finish, make yourself a beverage, bask in the glory of it all.
Apply The Policy
So you have a fresh new policy, now what? You can choose to apply it to your workload datastore, or apply it to specific Virtual Machines. To apply it to your datastore, select the datastore you want to modify, click on General, then click on Edit next to the Default Storage Policy option. The process to apply the policy to VMs is outlined here. Note that if you create a non-compliant policy and apply it to your datastore, you’ll get hassled about it and you should likely consider changing your approach.
Thoughts
The thing about managed platforms is that the service provider is on the hook for architecture decisions that reduce the resilience of the platform. And the provider is trying to keep the platform running within the parameters of the SLA. This is why you’ll come across configuration items in VMware Cloud on AWS that you either can’t change, or have some default options that seem conservative. Many of these decisions have been made with the SLAs and the various use cases in mind for the platform. That said, it doesn’t mean there’s no flexibility here. If you need a little more capacity, particularly in smaller environments, there are still options available that won’t reduce the platform’s resilience, while still providing additional capacity options.