VMware – VMworld 2016 – STO7914 – Revamped vSphere Storage DRS and SIOC for automating the Data Centers

Disclaimer: I recently attended VMworld 2016 – US.  My flights were paid for by myself, VMware provided me with a free pass to the conference and various bits of swag, and Tech Field Day picked up my hotel costs. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Here are my rough notes from “STO7914 – Revamped vSphere Storage DRS and SIOC for automating the Data Centers”, presented by Naveen Nagaraj and Ben Meadowcroft.

 

Agenda

  • Typical storage management related questions
  • SDRS/SIOC by the numbers
  • technical deep dive
  • What are we working on?
  • best practice guidelines
  • advanced options
  • Q&A

 

Typical Questions

  • Backup jobs – what’s the impact on production?
  • which is more efficient – thin over thin? thick over thin?
  • SDRS and FAST?
  • Datastore maintenance mode – how to avoid overwhelming the array controller
  • SIOC – does it throttle svMotion? lower latency with AFA? vMSC?

 

SDRS/SIOC by the numbers

  • 96% svMotion
  • 58% SDRS
  • 38% SIOC – tedious, painstaking, error-prone – working to make this easier to manage through policies

SDRS – 68% fully automated, 32% manual

This is similar to where DRS was at the same stage of its product lifecycle in terms of maturity and automation.

Popular features

  • 100% space load balancing
  • 92% datastore maintenance mode
  • 77% IO Load balancing

Cluster sizes

  • Maximum supported? 64
  • 50% 32-148 datastores per POD
  • interoperability tested up to 64, not higher
  • 30% 16-31
  • 20% 4-15

Default threshold space utilisation

95% stick with 80% (default)

 

Technical Deep Dive

Affinity and anti-affinity rules

  • initial placement – VMDKs together or separate
  • load balancing
  • Add disk to VM? – prerequisite moves, entire collection is moved
  • maintenance mode – mandatory action (marching orders – have to evacuate VMDKs) – generates faults (you can override affinity/anti-affinity rules)
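
As an illustration of how affinity and anti-affinity rules shape initial placement, here is a minimal sketch. It is not how SDRS is implemented; the function name `place_vmdks`, the greedy "most free space first" heuristic and the example datastore names are all assumptions made for this example.

```python
def place_vmdks(vmdk_sizes_gb, free_gb, keep_together):
    """Return {vmdk_index: datastore_name} honouring a simple placement rule.

    keep_together=True  -> affinity: all VMDKs on one datastore.
    keep_together=False -> anti-affinity: every VMDK on a different datastore.
    Hypothetical greedy heuristic, for illustration only.
    """
    if keep_together:
        total = sum(vmdk_sizes_gb)
        candidates = [n for n, free in free_gb.items() if free >= total]
        if not candidates:
            raise ValueError("no datastore can hold the whole collection")
        target = max(candidates, key=lambda n: free_gb[n])
        return {i: target for i in range(len(vmdk_sizes_gb))}

    placement = {}
    used = set()
    # Place the largest disks first, each on the emptiest unused datastore.
    order = sorted(range(len(vmdk_sizes_gb)), key=lambda i: -vmdk_sizes_gb[i])
    for i in order:
        candidates = [
            n for n, free in free_gb.items()
            if n not in used and free >= vmdk_sizes_gb[i]
        ]
        if not candidates:
            raise ValueError(f"cannot separate VMDK {i}")
        target = max(candidates, key=lambda n: free_gb[n])
        placement[i] = target
        used.add(target)
    return placement


free = {"ds1": 500, "ds2": 300, "ds3": 120}
print(place_vmdks([100, 50], free, keep_together=True))
print(place_vmdks([100, 50], free, keep_together=False))
```

With the affinity rule the whole collection has to fit on one datastore, which is why adding a disk can trigger prerequisite moves of the entire collection.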

Growth rate and data store correlation

  • SDRS constantly tracks VMDK growth
  • leverages it for both initial placement and load balancing
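
To show why growth rate matters for placement, here is a rough sketch that projects how long a datastore stays under the space threshold. The `Datastore` fields, the linear growth model and the function names are assumptions for illustration; the session did not describe the actual model SDRS uses.

```python
from dataclasses import dataclass


@dataclass
class Datastore:
    name: str
    capacity_gb: float
    used_gb: float
    growth_gb_per_day: float  # assumed: observed aggregate VMDK growth


def days_until_threshold(ds, threshold=0.80):
    """Days until used space reaches threshold * capacity (linear growth).

    Returns 0.0 if already at or over the threshold, None if not growing.
    """
    limit_gb = ds.capacity_gb * threshold
    if ds.used_gb >= limit_gb:
        return 0.0
    if ds.growth_gb_per_day <= 0:
        return None
    return (limit_gb - ds.used_gb) / ds.growth_gb_per_day


def pick_for_placement(datastores, vmdk_gb, threshold=0.80):
    """Pick the datastore that stays under the threshold longest after placement."""
    def runway(ds):
        after = Datastore(ds.name, ds.capacity_gb,
                          ds.used_gb + vmdk_gb, ds.growth_gb_per_day)
        days = days_until_threshold(after, threshold)
        return float("inf") if days is None else days
    return max(datastores, key=runway).name


a = Datastore("ds-a", 1000, 700, 10)  # more free space, but growing fast
b = Datastore("ds-b", 1000, 750, 1)   # less free space, growing slowly
print(pick_for_placement([a, b], vmdk_gb=20))
```

In this example the slower-growing datastore wins even though it has less free space today, which is the point of tracking growth rather than looking at current utilisation alone.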

Correlation

  • SDRS figures out if 2 or more datastores share the same spindles
  • 2 approaches – IO modelling, VASA

Storage DRS is now aware of storage capabilities through VASA 2.0

  • array-based thin-provisioning
  • array-based deduplication
  • array-based auto-tiering
  • array-based snapshot

Thin provisioned datastores

  • visibility into back-end pool utilisation (capacity)

Deduplication capability

  • dedupe pool spans across multiple datastores
  • datastore appears to store more data than capacity
  • SDRS uses free space in datastore rather than just computing sum of VMDK capacity
  • VASA integration allows SDRS to know the mapping of datastores to dedupe pools
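
A small sketch of why summing VMDK capacity is misleading on deduplicated datastores. The dictionary layout, the field names and the numbers are made up for this example; in practice the free space and the datastore-to-pool mapping would be reported by the array through VASA.

```python
from collections import defaultdict

# Hypothetical inventory: reported_free_gb and dedupe_pool stand in for
# values the array would publish.
datastores = [
    {"name": "ds1", "capacity_gb": 1000, "vmdk_gb": [400, 500, 300],
     "reported_free_gb": 350, "dedupe_pool": "pool-a"},
    {"name": "ds2", "capacity_gb": 1000, "vmdk_gb": [200, 300],
     "reported_free_gb": 600, "dedupe_pool": "pool-a"},
    {"name": "ds3", "capacity_gb": 2000, "vmdk_gb": [800],
     "reported_free_gb": 1200, "dedupe_pool": "pool-b"},
]


def naive_free_gb(ds):
    """Capacity minus the sum of VMDK sizes; can go negative with dedupe."""
    return ds["capacity_gb"] - sum(ds["vmdk_gb"])


def group_by_pool(datastores):
    """Map each dedupe pool to the datastores it backs."""
    pools = defaultdict(list)
    for ds in datastores:
        pools[ds["dedupe_pool"]].append(ds["name"])
    return dict(pools)


for ds in datastores:
    print(ds["name"], naive_free_gb(ds), ds["reported_free_gb"])
print(group_by_pool(datastores))
```

Here ds1 holds 1,200 GB of VMDKs on 1,000 GB of capacity, so the naive calculation goes negative while the datastore still reports free space.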

FAST Arrays

  • multiple storage tiers
  • VM across tiers
  • tier use changes with workload

SDRS becomes very conservative – lets the array deal with SLA guarantees

  • array threshold < SDRS threshold < SIOC threshold
  • space load balancing
  • rule enforcement
  • maintenance mode

 

What are we working on?

Today

SIOC Performance controls – reservations, shares and limits

  • reservations – minimum guaranteed IOPS per VM
  • limit – maximum IOPS allowed per VM
  • shares – relative importance of VMs during contention
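
To make the interplay of the three controls concrete, here is a simplified allocation sketch. It is not the SIOC scheduler: the function name, the VM names and the strategy (grant reservations first, then split spare IOPS by shares, capped at each limit) are assumptions chosen to keep the example short.

```python
def allocate_iops(vms, capacity):
    """Split `capacity` IOPS across VMs under contention.

    vms: {name: {"shares": int, "reservation": int, "limit": int or None}}
    Simplified model: reservations first, then spare capacity by shares,
    never exceeding a VM's limit. Illustration only.
    """
    alloc = {name: float(vm["reservation"]) for name, vm in vms.items()}
    remaining = capacity - sum(alloc.values())
    if remaining < 0:
        raise ValueError("reservations exceed capacity")

    # VMs that can still accept more IOPS.
    active = {n for n, vm in vms.items()
              if vm["limit"] is None or alloc[n] < vm["limit"]}

    while remaining > 1e-9 and active:
        total_shares = sum(vms[n]["shares"] for n in active)
        if total_shares == 0:
            break
        distributed = 0.0
        for n in sorted(active):
            offer = remaining * vms[n]["shares"] / total_shares
            limit = vms[n]["limit"]
            headroom = float("inf") if limit is None else limit - alloc[n]
            grant = min(offer, headroom)
            alloc[n] += grant
            distributed += grant
            if grant >= headroom:  # limit reached, stop offering to this VM
                active.discard(n)
        remaining -= distributed
    return alloc


vms = {
    "db":    {"shares": 2000, "reservation": 500, "limit": None},
    "web":   {"shares": 1000, "reservation": 0,   "limit": 1000},
    "batch": {"shares": 1000, "reservation": 0,   "limit": 200},
}
print(allocate_iops(vms, capacity=2500))
```

In this run the batch VM is capped at its limit, and the IOPS it cannot use are redistributed to the other two VMs in proportion to their shares.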

Storage policy overview

A way to describe storage requirements

  • user-defined tags, default VMware system tags, VASA – storage vendor published capabilities

They can be associated at the VM level or the VMDK level: define the policy and associate it with the VM or VMDK.

Need more IOPS for an app? Go to your policy and change it as required.

Adding SDRS policy (composition)

  • one policy to rule them all
  • Tag datastore (AFA, Gold, Silver)
  • Then use those tags in your policy

SDRS with Policy

  • initial placement
  • compliance alert
  • remediation
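
A rough sketch of tag-based policy matching covering the three steps above. The tag names come from the notes, but the data layout, function names, and VM and datastore names are assumptions for illustration and do not reflect the vSphere storage policy API.

```python
# Hypothetical inventory: datastore tags and policies expressed as tag sets.
datastore_tags = {
    "ds-afa-01": {"AFA", "Gold"},
    "ds-afa-02": {"AFA", "Gold"},
    "ds-hyb-01": {"Silver"},
}
policies = {
    "gold-afa": {"AFA", "Gold"},
    "silver": {"Silver"},
}


def compliant_datastores(policy_name):
    """Initial placement: datastores carrying every tag the policy requires."""
    required = policies[policy_name]
    return sorted(n for n, tags in datastore_tags.items() if required <= tags)


def find_violations(vm_policy, vm_location):
    """Compliance alert: map each non-compliant VM to remediation targets."""
    violations = {}
    for vm, policy_name in vm_policy.items():
        allowed = compliant_datastores(policy_name)
        if vm_location[vm] not in allowed:
            violations[vm] = allowed  # candidate datastores for remediation
    return violations


vm_policy = {"sql01": "gold-afa", "web01": "silver"}
vm_location = {"sql01": "ds-hyb-01", "web01": "ds-hyb-01"}
print(compliant_datastores("gold-afa"))
print(find_violations(vm_policy, vm_location))
```

The same matching function serves both initial placement and the compliance check, which is one way to read "one policy to rule them all".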

 

Best Practice Guidelines

Homogeneous datastores in the POD

  • type (VMFS/NFS)
  • performance characteristics

Full connectivity between hosts and datastores

  • maximum flexibility for initial placement
  • optimal for Mmode operations

Do not mix virtualised and non-virtualised IO workload

  • SDRS/SIOC IO-modelling will be inaccurate
  • hard to ensure SLA guarantees

latency vs throughput

  • setting very low latency will impact IOPS
  • strike a right balance between throughput needs and latency expectations

SIOC and vMSC

  • vMSC and SIOC interop is very deployment specific
  • latency between sites and LUN config (RW/RO) impacts IO modelling
  • throttling IO queue may or may not help due to interference from WAN latency

 

Advanced Config Options

EnforceCorrelationForAffinity

Use datastore correlation while enforcing/fixing anti-affinity rules

  • 0 – disabled
  • 1 – soft enforcement
  • 2 – hard enforcement

EnforceStorageProfiles

Storage profile requirements are enforced during initial placement and load-balancing

  • false – best effort, relaxed enforcement
  • true – strict enforcement both during IP and LB

MaxConcurrentIOMoves

number of concurrent svMotions per DS during LB and Mmode

adjust only if the storage controller is not able to keep up

  • default – 3
  • min value – 1
  • max value – 8
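
If you keep these settings in a script or config file, a small sanity check along these lines can catch out-of-range values before they are applied. The option names and ranges are taken from the notes above; the validator itself, and representing the values as Python ints and bools, are assumptions for this sketch.

```python
def validate_sdrs_options(options):
    """Return a list of problems found in a dict of SDRS advanced options."""
    errors = []
    for key, value in options.items():
        if key == "EnforceCorrelationForAffinity":
            # 0 = disabled, 1 = soft enforcement, 2 = hard enforcement
            ok = type(value) is int and value in (0, 1, 2)
        elif key == "EnforceStorageProfiles":
            # False = best effort, True = strict enforcement
            ok = type(value) is bool
        elif key == "MaxConcurrentIOMoves":
            # default 3, allowed range 1 to 8
            ok = type(value) is int and 1 <= value <= 8
        else:
            errors.append(f"{key}: unknown option")
            continue
        if not ok:
            errors.append(f"{key}: invalid value {value!r}")
    return errors


print(validate_sdrs_options({
    "EnforceCorrelationForAffinity": 1,
    "EnforceStorageProfiles": True,
    "MaxConcurrentIOMoves": 9,
}))
```

The strict `type(...) is int` checks are deliberate: in Python `True == 1`, so a boolean would otherwise pass as a valid integer setting.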

 

And that’s about it. Solid session. 3.5 stars.