Random Short Take #56

Welcome to Random Short Take #56. Only three players have worn 56 in the NBA. I may need to come up with a new bit of trivia. Let’s get random.

  • Are we nearing the end of blade servers? I’d hoped the answer was yes, but it’s not that simple, sadly. It’s not that I hate them, exactly. I bought blade servers from Dell when they first sold them. But they can present challenges.
  • 22dot6 emerged from stealth mode recently. I had the opportunity to talk to them and I’ll post something soon about that. In the meantime, this post from Mellor covers it pretty well.
  • It may be a Northern Hemisphere reference that I don’t quite understand, but Retrospect is running a “Dads and Grads” promotion offering 90 days of free backup subscriptions. Worth checking out if you don’t have something in place to protect your desktop.
  • Running VMware Cloud Foundation and want to stretch your vSAN cluster across two sites? Tony has you covered.
  • The site name in VMware Cloud Director can look a bit ugly. Steve O gives you the skinny on how to change it.
  • Pure//Accelerate happened recently / is still happening, and there was a bit of news from the event, including the new and improved Pure1 Digital Experience. As a former Pure1 user I can say this was a big part of the reason why I liked using Pure Storage.
  • Speaking of press releases, this one from PDI and its investment intentions caught my eye. It’s always good to see companies willing to spend a bit of cash to make progress.
  • I stumbled across Oxide on Twitter and fell for the aesthetic and design principles. Then I read some of the articles on the blog and got even more interested. Worth checking out. And I’ll be keen to see just how it goes for the company.

*Bonus Round*

I was recently on the Restore it All podcast with W. Curtis Preston and Prasanna Malaiyandi. It was a lot of fun as always, despite the fact that we talked about something that’s a pretty scary subject (data (centre) loss). No, I’m not a DC manager in real life, but I do have responsibility for what goes into our DC so I sort of am. Don’t forget there’s a discount code for the book in the podcast too.

Random Short Take #54

Welcome to Random Short Take #54. A few players have worn 54 in the NBA, but my favourite was Horace Grant. Let’s get random.

  • This project looked like an enjoyable, and relatively accessible, home project – building your own NVMe-based storage server.
  • When I was younger I had nightmares based on horror movies and falling out of bed (sometimes with both happening at the same time). Now this is the kind of thing that keeps me awake at night.
  • Speaking of disastrous situations, the OVH problem was a real problem for a lot of people. I wish them all the best with the recovery.
  • Tony has been doing things with vSAN in his lab and in production – worth checking out.
  • The folks at StorageOS have been hard at work improving their Kubernetes storage platform. You can read more about that here.
  • DH2i has a webinar coming up on SQL Server resilience that’s worth checking out. Details here.
  • We’re talking more about burnout in the tech industry, but probably not enough still. This article from Tom was insightful.

VMware – VMworld 2017 – STO1179BU – Understanding the Availability Features of vSAN

Disclaimer: I recently attended VMworld 2017 – US.  My flights were paid for by ActualTech Media, VMware provided me with a free pass to the conference and various bits of swag, and Tech Field Day picked up my hotel costs. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Here are my rough notes from “STO1179BU – Understanding the Availability Features of vSAN”, presented by GS Khalsa (@gurusimran) and Jeff Hunter (@jhuntervmware). You can grab a PDF of the notes from here. Note that these posts don’t provide much in the way of opinion, analysis, or opinionalysis. They’re really just a way of providing you with a snapshot of what I saw. Death by bullet point if you will.

 

Components and Failure

vSAN Objects Consist of Components

VM

  • VM Home – multiple components
  • Virtual Disk – multiple components
  • Swap File – multiple components

vSAN has a cache tier and a capacity tier (objects are stored in the capacity tier)

 

Quorum

Greater than 50% of votes must be online to achieve quorum (a simple sketch of this rule follows the list below)

  • Each component has one vote by default
  • Odd number of votes required to break tie – preserves data integrity
  • Greater than 50% of components (votes) must be online
  • Components can have more than one vote
  • Votes added by vSAN, if needed, to ensure odd number

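The voting rule above is easier to see with a toy example. Here's a minimal sketch (my own illustration, not vSAN code) of a majority-of-votes quorum check for a single object:

    # Illustrative only - not vSAN code. Shows the ">50% of votes" rule described
    # above, where each component (including any witness) carries one or more votes.
    def has_quorum(components):
        """components: list of (is_online, votes) tuples for one object."""
        total_votes = sum(votes for _, votes in components)
        online_votes = sum(votes for online, votes in components if online)
        # Strictly greater than 50% of all votes must be online.
        return online_votes > total_votes / 2

    # Two replicas plus a witness component, one vote each.
    print(has_quorum([(True, 1), (False, 1), (True, 1)]))   # True - 2 of 3 votes online
    print(has_quorum([(True, 1), (False, 1), (False, 1)]))  # False - only 1 of 3 votes online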
 

Component Vote Counts Are Visible Using RVC CLI

/<vcenter>/datacenter/vms> vsan_vm_object_info <vm>

 

Storage Policy Determines Component Number and Placement

  • Primary level of failures to tolerate
  • Failure Tolerance Method

Primary level of failures to tolerate = 0 means only one copy

  • Maximum component size is 255GB
  • vSAN will split larger VMDKs into multiple smaller components (illustrated in the sketch after this list)
  • RAID-5/6 Erasure Coding uses stripes and parity (requires all-flash)
  • Consumes less raw capacity
  • Number of stripes also affects component counts

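As a rough, back-of-the-envelope illustration of how those settings drive component counts (my own sketch, assuming the RAID-1 mirroring method and the 255GB maximum component size noted above, and ignoring witness components):

    import math

    # Rough illustration only - real vSAN placement logic is more involved.
    MAX_COMPONENT_GB = 255

    def data_components(vmdk_gb, failures_to_tolerate=1, stripe_width=1):
        """Estimate data component count for a RAID-1 (mirroring) policy.
        Witness components are ignored for simplicity."""
        replicas = failures_to_tolerate + 1  # FTT=1 means two full copies
        # Each replica is striped, and any chunk over 255GB is split further.
        chunks_per_stripe = math.ceil(vmdk_gb / stripe_width / MAX_COMPONENT_GB)
        return replicas * stripe_width * chunks_per_stripe

    print(data_components(200))                  # 2 - two 200GB mirror copies
    print(data_components(600))                  # 6 - each mirror split into 3 components
    print(data_components(200, stripe_width=2))  # 4 - two mirrors, two stripes each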
 

Each Host is an Implicit Fault Domain

  • Multiple components can end up in the same rack
  • Configure Fault Domains in the UI
  • Add at least one more host or fault domain for rebuilds

 

Component States Change as a Result of a Failure

  • Active
  • Absent
  • Degraded

vSAN selects the most efficient recovery method.

Which is more efficient, repair or rebuild? It depends. Partial repairs are performed if a full repair is not possible.

 

vSAN Maintenance Mode

Three vSAN Options for Host Maintenance Mode

  • Evacuate all data to other hosts
  • Ensure data accessibility from other hosts
  • No data evacuation

 

Degraded Device Handling (DDH) in vSAN 6.6

  • vSAN 6.6 is more “intelligent”, builds on previous versions of DDH
  • When a device is degraded, its components are evaluated (decision logic sketched after this list) …
  • If component does not belong to last replica, mark as absent – “Lazy” evacuation since another replica of the object exists
  • If component belongs to last replica, start evacuation
  • Degraded devices will not be used for new component placement
  • Evacuation failures reported in UI

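The evacuation decision above boils down to a simple branch. Here's a sketch of the logic as I read it (illustrative only, not VMware's implementation):

    # Illustrative sketch of the DDH behaviour described above - not VMware code.
    def handle_degraded_component(is_last_replica):
        """Decide what happens to a component sitting on a degraded device."""
        if is_last_replica:
            # Only remaining copy of the object's data: evacuate it proactively
            # before the device fails completely.
            return "evacuate"
        # Another replica exists elsewhere, so mark the component absent and let
        # the normal repair process rebuild it later ("lazy" evacuation).
        return "mark_absent"

    print(handle_degraded_component(is_last_replica=False))  # mark_absent
    print(handle_degraded_component(is_last_replica=True))   # evacuate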
 

DDH and S.M.A.R.T.

The following items are logged in vmkernel.log when a drive is identified as unhealthy:

  • Sectors successfully reallocated 0x05
  • Reported uncorrectable sectors 0xBB
  • Disk command timeouts 0xBC
  • Sector reallocation events 0xC4
  • Pending sector reallocations 0xC5
  • Uncorrectable sectors 0xC6

This helps GSS determine what to do with the drive after evacuation
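
Those hex values are standard S.M.A.R.T. attribute IDs. A trivial lookup table for the IDs listed above (my own mapping, for reference only):

    # Simple lookup of the S.M.A.R.T. attribute IDs listed above (reference only).
    SMART_ATTRIBUTES = {
        0x05: "Sectors successfully reallocated",
        0xBB: "Reported uncorrectable sectors",
        0xBC: "Disk command timeouts",
        0xC4: "Sector reallocation events",
        0xC5: "Pending sector reallocations",
        0xC6: "Uncorrectable sectors",
    }

    def describe(attr_id):
        return SMART_ATTRIBUTES.get(attr_id, "Unknown attribute")

    print(hex(0xBB), describe(0xBB))  # 0xbb Reported uncorrectable sectors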

 

Stretched Clusters

Stretched Cluster Failure Scenarios

  • Extend the idea of fault domains from racks to sites
  • Witness component (tertiary site) – witness host
  • 5ms RTT (around 60 miles)
  • VM will have a preferred and secondary site
  • When a component fails, rebuilding starts at the preferred site

 

Stretched Cluster Local Failure Protection – new in vSAN 6.6

  • Redundancy against host failure and site failure
  • If site fails, vSAN maintains local redundancy in surviving site
  • No change in stretched cluster configuration steps
  • Optimised logic to minimise I/O traffic across sites
    • Local read, local resync
    • Single inter-site write for multiple replicas
  • RAID-1 between the sites, and then RAID-5 within each site (rough capacity arithmetic in the sketch below)
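
To put rough numbers on the capacity cost of that layered protection, here's some back-of-the-envelope arithmetic (my own assumptions: 100GB of VM data, RAID-1 across the two sites, and either RAID-1 or RAID-5 locally):

    # My own capacity arithmetic, not figures from the session.
    vm_gb = 100
    sites = 2                      # RAID-1 across sites = a full copy at each site

    raid1_local_multiplier = 2.0   # two local copies per site (FTT=1 mirroring)
    raid5_local_multiplier = 4/3   # 3 data + 1 parity per site

    print(vm_gb * sites * raid1_local_multiplier)  # 400.0 GB raw with local RAID-1
    print(vm_gb * sites * raid5_local_multiplier)  # ~266.7 GB raw with local RAID-5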

What happens during network partition or site failure?

  • HA Restart

Inter-site network disconnected (split brain)

  • HA Power-off

Witness Network Disconnected

  • Witness leaves cluster

VMs continue to operate normally, and it's very simple to redeploy a new witness. The recommended host isolation response in a stretched cluster is power off.

Witness Host Offline

  • Recover or redeploy witness host

New in 6.6 – change witness host

 

vSAN Backup, Replication and DR

Data Protection

  • vSphere APIs – Data Protection
  • Same as any other datastore type (VMFS, etc.)
  • Verify support with backup vendor
  • Production and backup data on vSAN
    • Pros: Simple, rapid restore
    • Cons: Both copies lost if vSAN datastore is lost, can consume considerable capacity

 

Solutions …

  • Store backup data on another datastore
    • SAN or NAS
    • Another vSAN cluster
    • Local drives
  • Dell EMC Avamar and NetWorker
  • Veeam Backup and Replication
  • Cohesity
  • Rubrik
  • Others …

vSphere Replication is included with the Essentials Plus Kit and higher. With this you get per-VM RPOs as low as 5 minutes.

 

Automated DR with Site Recovery Manager

  • HA with Stretched Cluster, Automated DR with SRM
  • SRM at the tertiary site

Useful session. 4 stars.

EMC Announces VxRail

Yes, yes, I know it was a little while ago now. I’ve been occupied by other things and wanted to let the dust settle on the announcement before I covered it off here. And it was really a VCE announcement. But anyway. I’ve been doing work internally around all things hyperconverged and, as I work for a big EMC partner, people have been asking me about VxRail. So I thought I’d cover some of the more interesting bits.

So, let’s start with the reasonably useful summary links:

  • The VxRail datasheet (PDF) is here;
  • The VCE landing page for VxRail is here;
  • Chad’s take (worth the read!) can be found here; and
  • Simon from El Reg did a write-up here.

 

So what is it?

Well it’s a re-envisioning of VMware’s EVO:RAIL hyperconverged infrastructure in a way. But it’s a bit better than that, a bit more flexible, and potentially more cost effective. Here’s a box shot, because it’s what you want to see.

VxRail_002

Basically it’s a 2RU appliance housing 4 nodes. You can scale these nodes out in increments as required. There’s a range of hybrid configurations available.

VxRail_006

As well as some all flash versions.

VxRail_007

By default the initial configuration must be fully populated with 4 nodes, with the ability to scale up to 64 nodes (with qualification from VCE). Here are a few other notes on clusters:

  • You can’t mix All Flash and Hybrid nodes in the same cluster (this messes up performance);
  • All nodes within the cluster must have the same license type (Full License or BYO/ELA); and
  • First generation VSPEX BLUE appliances can be used in the same cluster with second generation appliances but EVC must be set to align with the G1 appliances for the whole cluster.

 

On VMware Virtual SAN

I haven’t used VSAN/Virtual SAN enough in production to have really firm opinions on it, but I’ve always enjoyed tracking its progress in the marketplace. VMware claim that the use of Virtual SAN over other approaches has the following advantages:

  • No need to install Virtual Storage Appliances (VSA);
  • CPU utilization <10%;
  • No reserved memory required;
  • Provides the shortest path for I/O; and
  • Seamlessly handles VM migrations.

If that sounds a bit like some marketing stuff, it sort of is. But that doesn’t mean they’re necessarily wrong either. VMware state that the placement of Virtual SAN directly in the hypervisor kernel allows it to “be fast, highly efficient, and be able to scale with flash and modern CPU architectures”.

While I can’t comment on this one way or another, I’d like to point out that this appliance is really a VMware play. The focus here is on the benefit of using an established hypervisor (vSphere), an established management solution (vCenter), and a (soon-to-be) established software-defined storage solution (Virtual SAN). If you’re looking for the flexibility of multiple hypervisors or incorporating other storage solutions this really isn’t for you.

 

Further Reading and Final Thoughts

Enrico has a good write-up on El Reg about Virtual SAN 6.2 that I think is worth a look. You might also be keen to try something that’s NSX-ready. This is as close as you’ll get to that (although I can’t comment on the reality of one of those configurations). You’ve probably noticed there have been a tonne of pissing matches on the Twitters recently between VMware and Nutanix about their HCI offerings and the relative merits (or lack thereof) of their respective architectures. I’m not telling you to go one way or another. The HCI market is reasonably young, and I think there’s still plenty of change to come before the market has determined whether this really is the future of data centre infrastructure. In the meantime though, if you’re already slow-dancing with EMC or VCE and get all fluttery when people mention VMware, then the VxRail is worth a look if you’re HCI-curious but looking to stay with your current partner. It may not be for the adventurous amongst you, but you already know where to get your kicks. In any case, have a look at the datasheet and talk to your local EMC and VCE folk to see if this is the right choice for you.

Storage Field Day 7 – Day 2 – VMware

Disclaimer: I recently attended Storage Field Day 7.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

For each of the presentations I attended at SFD7, there are a few things I want to include in the post. Firstly, you can see video footage of the VMware presentation here. You can also download my raw notes from the presentation here. Finally, here’s a link to the VMware website that covers some of what they presented.

 

Overview

I’d like to say a few things about the presentation. Firstly, it was held in the “Rubber Chicken” Room at VMware HQ.

Secondly, Rawlinson was there, but we ran out of time to hear him present. This seems to happen each time I see him in real life. Still, it’s not everyday you get to hear Christos Karamanolis (@XtosK) talk about this stuff, so I’ll put my somewhat weird @PunchingClouds fanboy thing to the side for the moment.

SFD7_Day2_VMware_XtosK_HA

Thirdly, and I’ll be upfront about this, I was a bit disappointed that VMware didn’t go outside some fairly fixed parameters as far as what they could and couldn’t talk about with regards to Virtual SAN. I understand that mega software companies have to be a bit careful about what they can say publicly, but I had hoped for something fresher in this presentation. In any case, I’ve included my notes on Christos’s view on the VSAN architecture – I hope it’s useful.

 

Architecture

VMware adopted the following principles when designing VSAN.

Hyper-converged

  • Compute + storage scalability
  • Unobtrusive to existing data centre architecture
  • Distributed software running on every host
  • Pools local storage (flash + HDD) on hosts (virtual shared datastore)
  • Symmetric architecture – no single point of failure, no bottleneck

The hypervisor opens up new opportunities, with the virtualisation platform providing:

  • Visibility to individual VMs and application storage
  • Manages all applications’ resource requirements
  • Sits directly in the I/O path
  • A global view of underlying infrastructure
  • Supports an extensive hardware compatibility list (HCL)

Critical paths in ESX kernel

The cluster service allows for

  • Fast failure detection
  • High performance (especially for writes)

The data path provides

  • Low latency
  • Minimal CPU per IO
  • Minimal Mem consumption
  • Physical access to devices

This equals minimal impact on consolidation rates. This is a Good Thing™.

Optimized internet protocol

As ESXi is both the “consumer” and “producer” of data there is no need for a standard data access protocol.

Per-object coordinator = client

  • Distributed “metadata server”
  • Transactions span only object distribution

Efficient reliable data transport (RDT)

  • Protocol agnostic (now TCP/IP)
  • RDMA friendly

Standard protocol for external access?

Two tiers of storage: Hybrid

Optimise the cost of physical storage resources

  • HDDs: cheap capacity, expensive IOPS
  • Flash: expensive capacity, cheap IOPS

Combine best of both worlds

  • Performance from flash (read cache + write back)
  • Capacity from HDD (capacity tier)

Optimise workload per tier

  • Random IO to flash (high IOPS)
  • Sequential IO to HDD (high throughput)

Storage is organised in disk groups (a flash device plus magnetic disks) – up to 5 disk groups of 1 SSD + 7 HDDs – this is the fault domain. 70% of flash is read cache, 30% is write buffer. Writes are accumulated, then staged in a magnetic disk-friendly fashion. Proximal IO – writing blocks within a certain number of cylinders. The filesystem on the magnetic disks is slightly different to the one on the SSDs. It uses the back-end of the Virsto filesystem, but doesn’t use the log-structured filesystem component.
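
Working through that 70/30 split for a single disk group, with example numbers of my own (not from the session):

    # Example cache sizing for one hybrid disk group - my numbers, not the session's.
    ssd_gb = 400                       # assumed flash device
    hdd_count, hdd_gb = 7, 1200        # assumed 7 magnetic disks in the disk group

    read_cache_gb = ssd_gb * 0.70      # 280GB read cache
    write_buffer_gb = ssd_gb * 0.30    # 120GB write-back buffer
    capacity_gb = hdd_count * hdd_gb   # 8400GB raw capacity tier

    print(read_cache_gb, write_buffer_gb, capacity_gb)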

Distributed caching

Flash device: cache of disk group (70% read cache, 30% write-back buffer)

No caching on “local” flash where VM runs

  • Flash latencies 100x network latencies
  • No data transfers, no perf hit during VM migration
  • Better overall flash utilisation (most expensive resource)

Use local cache when it matters

  • In-memory CBRC (RAM << Network latency)
  • Lots of block sharing (VDI)
  • More options in the future …

Deduplicated RAM-based caching

Object-based storage

  • VM consists of a number of objects – each object individually distributed
  • VSAN doesn’t know about VMs and VMDKs
  • Up to 62TB useable
  • Single namespace, multiple mount points
  • VMFS created in sub-namespace

The VM Home directory object is formatted with VMFS to allow a VM’s config files to be stored on it. Mounted under the root dir vsanDatastore.

  • Availability policy reflected on number of replicas
  • Performance policy may include a stripe width per replica
  • Object “components” may reside in different disks and / or hosts

VSAN cluster = vSphere cluster

Ease of management

  • Piggyback on vSphere management workflow, e.g. EMM
  • Ensure coherent configuration of hosts in vSphere cluster

Adapt to the customer’s data centre architecture while working with network topology constraints.

Maintenance mode – planned downtime.

Three options:

  • Ensure accessibility;
  • Full data migration; and
  • No data migration.

HA Integration

VM-centric monitoring and troubleshooting

VMODL APIs

  • Configure, manage, monitor

Policy compliance reporting

Combination of tools for monitoring in 5.5

  • CLI commands
  • Ruby vSphere console
  • VSAN observer

More to come soon …

Real *software* defined storage

Software + hardware – component based (individual components), Virtual SAN ready node (40 OEM validated server configurations are ready for VSAN deployment)

VMware EVO:RAIL = Hyper-converged infrastructure

It’s a big task to get all of this working with everything (supporting the entire vSphere HCL).

 

Closing Thoughts and Further Reading

I like VSAN. And I like that VMware are working so hard at getting it right. I don’t like some of the bs that goes with their marketing of the product, but I think it has its place in the enterprise and is only going to go from strength to strength with the amount of resources VMware is throwing at it. In the meantime, check out Keith’s background post on VMware here. In my opinion, you can’t go past Cormac’s posts on VSAN if you want a technical deep dive. Also, buy his book.

EMC Announces VSPEX BLUE

EMC today announced their VSPEX BLUE offering and I thought I’d share some pictures and words from the briefing I received recently. PowerPoint presentations always look worse when I distil them down to a long series of dot points, so I’ll try and add some colour commentary along the way. Please note that I’m only going off EMC’s presentation, and haven’t had an opportunity to try the solution for myself. Nor do I know what the pricing is like. Get in touch with your local EMC representative or partner if you want to know more about that kind of thing.

EMC describe VSPEX BLUE as “an all-inclusive, Hyper-Converged Infrastructure Appliance, powered by VMware EVO:RAIL software”. Which seems like a nice thing to have in the DC. With VSPEX BLUE, the key EMC message is simplicity:

  • Simple to order – purchase with a single SKU
  • Simple to configure – through an automated, wizard driven interface
  • Simple to manage – with the new VSPEX BLUE Manager
  • Simple to scale – with automatic scale-out, where new appliances are automatically discovered and easily added to a cluster with a few mouse clicks
  • Simple to support  – with EMC 24 x 7 Global support offering a single point of accountability for all hardware and software, including all VMware software

It also “eliminates the need for advanced infrastructure planning” by letting you “start with one 2U/4-node appliance and scale up to four”.

Awwww, this guy seems sad. Maybe he doesn’t believe that the hyperconverged unicorn warriors of the data centre are here to save us all from ourselves.

BLUE_man

I imagine the marketing line would be something like “IT is hard, but you don’t need to be blue with VSPEX BLUE”.

Foundation

VSPEX BLUE Software

  • VSPEX BLUE Manager extends hardware monitoring, and integrates with EMC Connect Home and online support facilities.
  • VSPEX BLUE Market offers value-add EMC software products included with VSPEX BLUE.

VMware EVO:RAIL Engine

  • Automates cluster deployment and configuration, as well as scale-out and non-disruptive updates
  • Simple design with a clean interface, pre-sized VM templates and single-click policies

Resilient Cluster Architecture

  • VSAN distributed datastore provides consistent and resilient fault tolerance
  • VMotion provides system availability during maintenance and DRS load balances workloads

Software-defined data center (SDDC) building block

  • Combines compute, storage, network and management resources into a single virtualized software stack with vSphere and VSAN

Hardware

While we live in a software-defined world, the hardware is still somewhat important. EMC is offering 2 basic configurations to keep ordering and buying simple. You getting it yet? It’s all very simple.

  • VSPEX BLUE Standard, which comes with 128GB of memory per node; or
  • VSPEX BLUE Performance, which comes with 192GB of memory per node.

Each configuration has a choice of a 1GbE copper or 10GbE fibre network interface. Here’re some pretty pictures of what the chassis looks like, sans EMC bezel. Note the similarities with EMC’s ECS offering.

BLUE_front

BLUE_rear

Processors (per node)

  • Intel Ivy Bridge (up to 130W)
  • Dual processor

Memory/processors (per node)

  • Four channels of Native DDR3 (1333)
  • Up to eight DDR3 ECC R-DIMMS per server node

Inputs/outputs (I/Os) (per node)

  • Dual GbE ports
  • Optional IB QDR/FDR or 10GbE integrated
  • 1 x 8 PCIe Gen3 I/O Mezz Option (Quad GbE or Dual 10GbE)
  • 1 x 16 PCIe Gen3 HBA slots
  • Integrated BMC with RMM4 support

Chassis

  • 2U chassis supporting four hot swap nodes with half-width MBs
  • 2 x 1200W (80+ & CS Platinum) redundant hot-swap PS
  • Dedicated cooling/node (no SPoF) – 3 x 40mm dual rotor fans
  • Front panel with separate power control per node
  • 17.24” x 30.35” x 3.46”

Disk

  • Integrated 4-Port SATA/SAS controller (SW RAID)
  • Up to 16 (four per node) 2.5” HDD

BLUE12

The VSPEX BLUE Standard configuration consists of four independent nodes, each consisting of the following:

  • 2 x Intel Ivy Bridge E5-2620 v2 (12 cores, 2.1GHz)
  • 8 x 16GB (128GB) 1666MHz DIMMs
  • 3 x 1.2TB 2.5” 10K RPM SAS HDD
  • 1 x 400GB 2.5” SAS SSD (VSAN Cache)
  • 1 x 32GB SLC SATADOM (ESXi Boot Image)
  • 2 x 10GbE BaseT or SFP+

The Performance configuration only differs from the standard in the amount of memory it contains, going from 128GB in the standard configuration to 192GB in the performance model, ideal for applications such as VDI.

VSPEX BLUE Manager

EMC had a number of design goals for the VSPEX BLUE Manager product, including:

  • A simplified support experience
  • Embedded ESRS/VE
  • Seamless integration with EVO:RAIL and its facilities
  • The implementation of a management framework that allows driving EMC value-add software as services
  • Extended management orchestration for other use cases
  • Enablement of the VSPEX partner ecosystem

Software Inventory Management

  • Displays installed software versions
  • Discovers and downloads software updates
  • Automated, non-disruptive software upgrades

VB_market

Hardware Awareness

In my mind, this is the key bit of value-add that EMC offer with VSPEX BLUE – seeing what else is going on outside of EVO:RAIL.

  • Provides  information not available in EVO:RAIL
  • Maps alerts to graphical representation of hardware configuration
  • Displays detailed configuration of hardware parts for field services
  • Aggregates health monitoring from vCenter and hardware BMC IPMI
  • Integrates with ESRS Connect Home for proactive notification and problem resolution
  • Integrates with eServices online support resources
  • Automatically collects diagnostic logs and integrates with vRealize Log Insight

RecoverPoint for VMs

I’m a bit of a fan of RecoverPoint for VMs. The VSPEX BLUE appliance includes an EMC RecoverPoint for VMs license entitling you to protect 15 VMs, with support, for free. The version shipping with this solution also no longer requires storage external to VMware VSAN to store replica and journal volumes.

  • Protect VMs at VM-level granularity
  • Asynchronous and synchronous replication
  • Consistency group for application-consistent recovery
  • vCenter plug-in integration
  • Discovery, provisioning, and orchestration of DR workflow management
  • WAN compression and deduplication to optimize bandwidth utilization

Conclusion

One final thing to note – VMware ELAs not supported. VSPEX BLUE is an all-inclusive SKU, so you can’t modify support options, licensing, etc. But the EVO:RAIL thing was never really a good option for people who want that kind of ability to tinker with configurations.

Based on the briefing I received, the on-paper specs, and the general thought that seems to have gone into the overall delivery of this product, it all looks pretty solid. I’ll be interested to see if any of my customers will be deploying this in the wild. If you’re hyperconverged-curious and want to look into this kind of thing then the EMC VSPEX BLUE may well be just the thing for you.

VMware – VMworld 2014 – STO1424 – Massively Scaling Virtual SAN Implementations

Disclaimer: I recently attended VMworld 2014 – SF.  My flights and accommodation were paid for by myself, however VMware provided me with a free pass to the conference and various bits of swag. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.


STO1424 – Massively Scaling Virtual SAN Implementations


This session was presented by Frans and Andrew.

Adobe Marketing Cloud Background

  • Massively scaled SaaS base infrastructure
  • Globally distributed
  • 10s of thousands of servers
  • World class operations – Techops
  • Supports operations for multiple product teams that form the Adobe Marketing Cloud

How do they do massive scale?

Frans covered some pretty simple stuff here, but it’s worth listing as a reminder.

  • Standardisation – build on standard commodity hardware
  • Configuration management – CMDB to track and manage device services, roles, etc
  • Automation – Cobbler and Puppet are used to deploy a large amount of machines in a short period of time
  • Self service – provision resources based on workload and product requirements

They don’t want to build “snowflakes”.
VSAN is just another tool in their toolbox. VSAN isn’t going to replace their current storage; it’s complementary. It’s not going to solve every problem you have, so you need to know your workload.

First Use Case: Core

  • A simple solution to provide core services in every DC – DNS, mail, monitoring, authentication, kickstart, etc
  • Beach Head – DC standup tool.
  • Highly available
  • Not dependent on SAN
  • Standard hardware
  • Took a 1RU configuration, added memory, NICs and reconfigured disk setup to produce “Core” platform.
  • Becomes the building block used to build and manage other services from (Cloud, vCache)

Cache to vCache (It was a Journey)

  • Cache: a server role in a digital marketing group with a large server footprint (approx 8000)
  • Processes billions of hits a day
  • Very sensitive to MTTR
  • Hardware only, mostly blades
  • Actual servers have a small footprint – 16GB RAM, 146GB HDD, low CPU usage
  • CentOS – custom monitoring and mgmt tools
  • Widely distributed
  • Current hardware was up for refresh
  • Software wasn’t able to take advantage of the hardware

Requirements: Enter vCache

  • Keep the hardware close to the original platform
  • Do not change server configs
  • Better MTTR
  • NO SAN
  • 4:1 consolidation ratio, starting with 3:1
  • Solution for in-depth monitoring and anomaly detection
  • Automate deployment
  • Deploy 3500 hosts and 14000 VMs globally

vCache version 0.1 (PoC)
Step 1

  • Needed to see if Cache could even run as a VM
  • Used William Lam’s post on virtuallyghetto for SSD emulation on existing hardware
  • Kicked a lot of hosts (7) at once – not happy (not enough IO to do it). One at a time was OK.
  • Did ok with 10 million hits per day – but had problems with vMotion and HA.
  • Result: sort of worked, but you really need SSDs to do it properly.

vCache Version 0.5
Step 2 – Meeting the requirements

  • Blade chassis is NOT the best platform for VSAN deployment. For them it works because they had low disk requirements and a 4:1 consolidation ratio
  • Selected MLC SSD – this came down to cost for them.
  • Setup a VSAN cluster chassis (16 nodes)
  • vCenter resides on Core
  • HA enabled and DRS fully automated

Lessons learned from 0.1 – 0.5

  • Use real world traffic to understand the load capability
  • Use VSAN Observer
  • Test as many scenarios as possible – Chaos Monkey
  • With no memory reservation, they filled disks quicker than expected
  • Stick to the HCL or lose data
  • There’s a KB on enabling VSAN Observer

The Final design

  • Management cluster – Core runs the vCenter appliance
  • Multiple vCenters for segmented impact when failure occurs
  • Setup auto deploy
  • Build host profiles
  • Establish a standard server naming strategy
  • 6 clusters per vCenter, 16 hosts per cluster, 4 VMs per host (quick arithmetic on these numbers in the sketch after this list)
  • VSAN spans a chassis but no more (they don’t always have 10Gbps in their DCs)
  • VMs: 16GB, 146GB, 8vCPU and Memory reservation set to 16GB
  • Blade: 96GB, …

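A quick sanity check on those numbers (my arithmetic, not Adobe's):

    # My own arithmetic against the design numbers above - not Adobe's figures.
    clusters_per_vc, hosts_per_cluster, vms_per_host = 6, 16, 4

    hosts_per_vc = clusters_per_vc * hosts_per_cluster  # 96 hosts per vCenter
    vms_per_vc = hosts_per_vc * vms_per_host            # 384 VMs per vCenter

    target_hosts, target_vms = 3500, 14000              # from the requirements above
    print(hosts_per_vc, vms_per_vc)
    print(target_vms / target_hosts)                    # 4.0 - the stated 4:1 ratio
    print(target_hosts / hosts_per_vc)                  # ~36 vCenters to reach the target
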
Use Adobe SKMS / CMDB as the automation platform

  • SKMS – a portal for device management
  • CMDB – configuration management database
  • Custom build that has tools for deployment (virtual / physical)
  • Tracks device states
  • Contains device information
  • Provides API access to other services to consume
  • Some services including: Cobbler, Puppet, DNS, self service portal
  • Used a lot of concepts from Project Zombie

Automation of vCache

  • Deploy vCenter appliance via Puppet
  • https://forge.puppetlabs.com/vmware/vcsa

Auto deploy

  • Does a lot of the work
  • It has shortcomings – can only deploy to one vCenter in a DC
  • Alan Renouf has a workaround

Chassis Setup

  • DC receives, racks and cables, sets up the management IP, sets to “Racked pending deployment”
  • Chassis configuration script goes out
  • Blades boot via iPXE chaining, which checks whether the blade is configured, runs a firmware update if required and the vCache disk configuration script, then chains to Auto Deploy.
  • Configured blades boot via Auto Deploy to vCenter for the configured subnet

Blade Setup

  • Cluster gets created in vCenter via a script.

VM Setup

  • Creates an empty vCache template
  • Clone 48 VMs via template
  • MAC addresses, devices names, etc get added to CMDB
  • Set to “Pending Config”
  • Cobbler set to “Ready to PXE”
  • VMs power on at this point
  • VMs kick and the Puppet manifest is applied
  • Machines marked as “Image complete”
  • They are then added to monitoring and moved to the cache spare pool, ready for use (the overall state flow is sketched below)

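The build states above lend themselves to a simple pipeline. Here's how I'd sketch the flow (all state and function names are hypothetical, not Adobe's tooling):

    # Rough sketch of the VM provisioning states described above.
    # All names here are hypothetical - this is not Adobe's tooling.
    PIPELINE = [
        "cloned_from_template",  # 48 VMs cloned from the empty vCache template
        "pending_config",        # MAC addresses / device names recorded in the CMDB
        "ready_to_pxe",          # Cobbler record created
        "kicked",                # VM powered on, kickstarted, Puppet manifest applied
        "image_complete",        # build finished
        "in_spare_pool",         # monitored and waiting in the cache spare pool
    ]

    def advance(vm_state):
        """Return the next state in the pipeline (or the last state if finished)."""
        idx = PIPELINE.index(vm_state)
        return PIPELINE[min(idx + 1, len(PIPELINE) - 1)]

    state = PIPELINE[0]
    while state != "in_spare_pool":
        state = advance(state)
        print(state)
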
Final steps

  • Standard Operating Procedure (SOP) Design
  • Additional Testing – finding the upper limits of what they can do with this design
  • Incident simulations
  • Alert Tooling – keeping an eye on trends in the environment

What’s Next?

  • Move away from big blade chassis to something smaller
  • Look at Puppet Razor as a deployment option
  • Testing Mixed workloads on VSAN
  • All Flash VSAN
  • Using OpenStack as the CML
  • Looking at Python for provisioning

Andrew then came on and spoke about getting into the Experiment – Prototype – Scale Cycle as a way to get what you need done.

VSAN Automation Building Blocks

VSAN Observer with vCOps

VSAN and OpenStack

Workloads on VSAN

  • Design policy per workload type
  • IO dependent? CPU? or RAM?
  • Core management services: vCenter, Log Insight, vCenter Operations Manager
  • Scale-out services: Storm, Cassandra, Zookeeper cluster
  • What would you like to run? Anything you can virtualise.

And that’s it. Very useful session. I always enjoy customer presentations that don’t just cover the marketing side of things but actually talk about real-world use cases. 4.5 stars.

Brisbane VMUG – May 2014

The Brisbane VMUG meeting on May 8 promises to be a ripper. Highlights include:

  • Disruptive Thinking with Virtual SAN;
  • Virtual SAN Deep Dive from VMware;
  • Impact on Storage Market Today and Architecture Decisions; and
  • Live Demo.

It will be held at the EMC office in Brisbane on Thursday, May 8 from 4pm – 6pm. You can get more details about the event (including registration) from here. I look forward to seeing you there.

VMware – VMworld 2013 announcements

I’m not there, but a few of the announcements coming out of VMworld looked pretty neat. Here’s a few links that shed some light on things. I hope to put some thoughts together on VMware Virtual SAN in the next week or so.

VMware Unveils Next-Generation Products and Services to Further Enable the Software-Defined Data Center – VMware’s announcement.

VSAN Part 1 – A first look at VSAN – A typically great post by Cormac on Virtual SAN.

VMware Virtual SAN (Beta) – VMware’s product page for Virtual SAN.

What’s new in vSphere 5.5?? – An excellent summary by Aravind Sivaraman.

Updated articles page

I’ve added another document to my articles page. This one covers the creation of port-channels between Cisco MDS 9513 switches. I was clueless about a lot of this until a friend from EMC took me through the steps. So I’ve created this document as a way to capture those steps for future reference. Hopefully you’ll find it of use.