VMware – VMworld 2014 – STO1424 – Massively Scaling Virtual SAN Implementations

Disclaimer: I recently attended VMworld 2014 – SF.  I paid for my own flights and accommodation; however, VMware provided me with a free pass to the conference and various bits of swag. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.


STO1424 – Massively Scaling Virtual SAN Implementations


This session was presented by

Adobe Marketing Cloud Background

  • Massively scaled SaaS base infrastructure
  • Globally distributed
  • 10s of thousands of servers
  • World class operations – Techops
  • Supports operations for multiple product teams that form the Adobe Marketing Cloud

How do they do massive scale?

Frans covered some pretty simple stuff here, but it’s worth listing as a reminder.

  • Standardisation – build on standard commodity hardware
  • Configuration management – CMDB to track and manage device services, roles, etc
  • Automation – Cobbler and Puppet are used to deploy a large number of machines in a short period of time
  • Self service – provision resources based on workload and product requirements

They don’t want to build “snowflakes”.
VSAN is just another tool in their toolbox. VSAN isn’t going to replace their current storage; it’s complementary. It’s not going to solve every problem you have, so you need to know your workload.

First Use Case: Core

  • A simple solution to provide core services in every DC – DNS, mail, monitoring, authentication, kickstart, etc
  • Beach Head – DC standup tool.
  • Highly available
  • Not dependent on SAN
  • Standard hardware
  • Took a 1RU configuration, added memory, NICs and reconfigured disk setup to produce “Core” platform.
  • Becomes the building block used to build and manage other services from (Cloud, vCache)

Cache to vCache (It was a Journey)

  • Cache: a server role in a digital marketing group with a large server footprint (approx 8000)
  • Processes billions of hits a day
  • Very sensitive to MTTR
  • Hardware only, mostly blades
  • Actual servers have a small footprint – 16GB RAM, 146GB HDD, low CPU usage
  • CentOS – custom monitoring and mgmt tools
  • Widely distributed
  • Current hardware was up for refresh
  • Software wasn’t able to take advantage of the hardware

Requirements: Enter vCache

  • Keep the hardware close to the original platform
  • Do not change server configs
  • Better MTTR
  • NO SAN
  • 4:1 consolidation ratio, starting with 3:1
  • Solution for in-depth monitoring and anomaly detection
  • Automate deployment
  • Deploy 3500 hosts and 14000 VMs globally

vCache version 0.1 (PoC)
Step 1

  • Needed to see if Cache could even run as a VM
  • Used William Lam’s post on virtuallyghetto for SSD emulation on existing hardware
  • Kickstarting a lot of hosts (7) at once went badly; 1 at a time was OK. There wasn’t enough IO to do several in parallel.
  • Did ok with 10 million hits per day – but had problems with vMotion and HA.
  • Result: sort of worked, but you really need SSDs to do it properly.

vCache Version 0.5
Step 2 – Meeting the requirements

  • Blade chassis is NOT the best platform for VSAN deployment. For them it works because they had low disk requirements and a 4:1 consolidation ratio
  • Selected MLC SSD – This was down to Cost for them.
  • Set up a VSAN cluster per chassis (16 nodes)
  • vCenter resides on Core
  • HA enabled and DRS fully automated

Lessons learned from 0.1 – 0.5

  • Use real world traffic to understand the load capability
  • Use VSAN Observer
  • Test as many scenarios as possible – Chaos Monkey
  • With no memory reservation, they filled disks quicker than expected (each VM’s swap file is sized to its unreserved memory)
  • Stick to the HCL or lose data
  • There’s a KB on enabling VSAN Observer

The Final design

  • Management cluster – Core runs the vCenter appliance
  • Multiple vCenters for segmented impact when failure occurs
  • Set up Auto Deploy
  • Build host profiles
  • Establish a standard server naming strategy
  • 6 clusters per vCenter, 16 hosts per cluster, 4 VMs per host
  • VSAN spans a chassis but no more (they don’t always have 10Gbps in their DCs)
  • VMs: 16GB, 146GB, 8vCPU and Memory reservation set to 16GB
  • Blade: 96GB, …
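
The figures in this list can be checked with some quick arithmetic. Here is a minimal sketch in Python that uses only the numbers quoted in these notes (6 clusters per vCenter, 16 hosts per cluster, 4 VMs per host, 16GB reserved per VM, 96GB per blade, and the 3500-host target from the requirements). The function name and the assumption that every vCenter is filled to its limit are mine, not the presenters’.

```python
import math

# Figures quoted in the session notes
CLUSTERS_PER_VCENTER = 6
HOSTS_PER_CLUSTER = 16
VMS_PER_HOST = 4
VM_MEMORY_GB = 16        # fully reserved per VM
BLADE_MEMORY_GB = 96
TARGET_HOSTS = 3500      # global deployment target from the requirements


def sizing():
    """Derive per-cluster, per-vCenter and global totals from the quoted figures."""
    hosts_per_vcenter = CLUSTERS_PER_VCENTER * HOSTS_PER_CLUSTER
    reserved_gb_per_blade = VMS_PER_HOST * VM_MEMORY_GB
    return {
        "hosts_per_vcenter": hosts_per_vcenter,
        "vms_per_cluster": HOSTS_PER_CLUSTER * VMS_PER_HOST,
        "vms_per_vcenter": hosts_per_vcenter * VMS_PER_HOST,
        "reserved_gb_per_blade": reserved_gb_per_blade,
        "unreserved_gb_per_blade": BLADE_MEMORY_GB - reserved_gb_per_blade,
        # Assumption: every vCenter is filled to its 6 x 16 host limit.
        "vcenters_for_target": math.ceil(TARGET_HOSTS / hosts_per_vcenter),
        "vms_at_target": TARGET_HOSTS * VMS_PER_HOST,
    }


if __name__ == "__main__":
    for key, value in sizing().items():
        print(f"{key}: {value}")
```

At 4 VMs per host the 3500-host target works out to the 14000 VMs listed in the requirements, and the full 16GB reservation per VM accounts for 64GB of each 96GB blade.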

Use Adobe SKMS / CMDB as the automation platform

  • SKMS – a portal for device management
  • CMDB – configuration management database
  • Custom build that has tools for deployment (virtual / physical)
  • Tracks device states
  • Contains device information
  • Provides API access to other services to consume
  • Services include Cobbler, Puppet, DNS and a self service portal
  • Used a lot of concepts from Project Zombie

Automation of vCache

  • Deploy vCenter appliance via Puppet
  • https://forge.puppetlabs.com/vmware/vcsa

Auto deploy

  • Does a lot of the work
  • It has shortcomings – can only deploy to one vCenter in a DC
  • Alan Renouf has a workaround

Chassis Setup

  • DC receives, racks and cables, sets up the management IP, sets to “Racked pending deployment”
  • Chassis configuration script goes out
  • Blades boot via iPXE chaining; the boot script checks whether the blade is configured, runs a firmware update if required, runs the vCache disk configuration script and then chains to Auto Deploy.
  • Configured blades boot via Auto Deploy to vCenter for the configured subnet
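
The boot logic described above can be sketched as a small decision function. This is my reading of the flow rather than Adobe’s actual script: the `Blade` fields and the step names are hypothetical, and I’m assuming the firmware update and disk configuration only run on blades that aren’t configured yet.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Blade:
    configured: bool        # has the vCache disk configuration been applied?
    firmware_current: bool  # is the firmware at the expected level?


def boot_steps(blade: Blade) -> List[str]:
    """Return the ordered steps a blade goes through on one iPXE boot."""
    steps = ["ipxe_chain"]
    if not blade.configured:
        # Assumption: firmware is only checked while the blade is unconfigured.
        if not blade.firmware_current:
            steps.append("firmware_update")
        steps.append("vcache_disk_config")
    # Every blade ends by chaining to Auto Deploy.
    steps.append("auto_deploy")
    return steps


if __name__ == "__main__":
    print(boot_steps(Blade(configured=False, firmware_current=False)))
    print(boot_steps(Blade(configured=True, firmware_current=True)))
```

A blade that is already configured skips straight through to Auto Deploy, which matches the last bullet in the list.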

Blade Setup

  • Cluster gets created in vCenter via a script.

VM Setup

  • Creates an empty vCache template
  • Clone 48 VMs via template
  • MAC addresses, device names, etc get added to CMDB
  • Set to “Pending Config”
  • Cobbler set to “Ready to PXE”
  • VMs power on at this point
  • VMs kickstart and the Puppet manifest is applied
  • Machines marked as “Image complete”
  • They are then added to monitoring and moved to the cache spare pool ready for use
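
The VM setup steps above amount to a simple linear state machine. Here’s a rough sketch of how that tracking might look; the three quoted state names come from the session, while “Spare pool”, the function names and the VM names are placeholders I’ve made up. The real workflow lives in Adobe’s SKMS / CMDB tooling, which wasn’t shown in detail.

```python
# Ordered lifecycle of a vCache VM, as described in the notes.
PIPELINE = [
    "Pending Config",  # VM cloned from the template and recorded in the CMDB
    "Ready to PXE",    # Cobbler is ready to serve the kickstart
    "Image complete",  # VM kickstarted and Puppet manifest applied
    "Spare pool",      # hypothetical label: monitored and ready for use
]


def advance(state):
    """Return the state that follows `state`, or raise if there is none."""
    idx = PIPELINE.index(state)  # raises ValueError for unknown states
    if idx == len(PIPELINE) - 1:
        raise ValueError(f"{state!r} is already the final state")
    return PIPELINE[idx + 1]


def provision(vm_names):
    """Walk every VM from the first state to the last and return the records."""
    records = {name: PIPELINE[0] for name in vm_names}
    for name in records:
        while records[name] != PIPELINE[-1]:
            records[name] = advance(records[name])
    return records


if __name__ == "__main__":
    # 48 clones per batch, as in the notes; the naming scheme is invented.
    vms = [f"vcache-{i:02d}" for i in range(1, 49)]
    done = provision(vms)
    print(len(done), set(done.values()))
```

Keeping the states in one ordered list means a VM can’t skip a step, which is roughly what you’d want from a CMDB that tracks device states.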

Final steps

  • Standard Operating Procedure (SOP) Design
  • Additional Testing – finding the upper limits of what they can do with this design
  • Incident simulations
  • Alert Tooling – keeping an eye on trends in the environment

What’s Next?

  • Move away from big blade chassis to something smaller
  • Look at Puppet Razor as a deployment option
  • Testing Mixed workloads on VSAN
  • All Flash VSAN
  • Using OpenStack as the CML
  • Looking at Python for provisioning

Andrew then came on and spoke about getting into the Experiment – Prototype – Scale Cycle as a way to get what you need done.

VSAN Automation Building Blocks

VSAN Observer with vCOps

VSAN and OpenStack

Workloads on VSAN

  • Design policy per workload type
  • IO dependent? CPU? or RAM?
  • Core management services: vCenter, Log Insight, vCenter Operations Manager
  • Scale-out services: Storm, Cassandra, Zookeeper cluster
  • What would you like to run? Anything you can virtualise.

And that’s it. Very useful session. I always enjoy customer presentations that don’t just cover the marketing side of things but actually talk about real-world use cases. 4.5 stars.