VMware – VMworld 2016 – STO7549 – Achieving Agility, Flexibility, Scalability and Performance with VMware SDS and VVOLs for Business critical databases

Disclaimer: I recently attended VMworld 2016 – US.  My flights were paid for by myself, VMware provided me with a free pass to the conference and various bits of swag, and Tech Field Day picked up my hotel costs. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.


Here are my notes on “STO7549 – Achieving Agility, Flexibility, Scalability and Performance with VMware Software-Defined Storage (SDS) and VVOLs for Business critical databases” presented by Sudhir Balasubramanian and Potheri Mohan, with support from Zeehan Khan.


This was the third-last session of the week for me, and the first one talking about VVOLs. Possibly also the longest title of any session I’ve attended this week.

 

Key takeaways

Primary Day 1 Operation Challenges for Oracle workloads

  • Provisioning storage – how VMware Storage Policy Based Management (SPBM) helps storage provisioning

Primary Day 2 Operation Challenges for Oracle databases

  • Backup and recovery, cloning and data refresh from production
    • Oracle backup & cloning process at different levels – pros and cons of each approach
  • How VMware Virtual Volumes helps overcome those challenges
  • Use case – Backup and recovery with VVOLs

 

Check out Oracle in a Virtual World.

How many of you have heard of EMC Unity? How many of you are using or evaluating it? Not so many.

Download the UnityVSA

 

Traditional challenges for virtualised business critical databases

Common concerns

  • Day 1 operations of the DB have to meet business SLAs
    • provisioning new production DBs to meet performance
    • different DBs have different IO characteristics and capability/criticality
  • Day 2 operations of the production DB have to meet business SLAs
    • production backup has to complete in set window
    • cloning needs to complete fast without affecting production performance
    • DB refresh from production has to complete in set window

 

Provisioning storage for an Oracle DB – A detailed view

To meet DB performance demands, you

  • need to understand DB requirements
    • understand DB workload IO profile / characteristics
    • criticality of the application accessing the DB (SLAs)
  • need to understand current infrastructure constraints
    • do the data stores have the capability to sustain current workload?
    • are the data stores able to support the workload if the IO demand scales up?
    • are there storage policies that can be leveraged to meet DB storage requirements? (storage capabilities based on array features and data services)
  • not feasible to do all that in a short time
    • study all data stores’ storage policies
    • choose the right one for placement
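
The placement questions above can be sketched as a simple headroom filter. This is only an illustration of the reasoning, not how SPBM evaluates anything; the `Datastore` fields, the example numbers and the `growth_factor` parameter are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datastore:
    # Hypothetical fields; real figures would come from array or vCenter metrics.
    name: str
    max_iops: int       # IOPS the data store can sustain
    current_iops: int   # IOPS already consumed by existing workloads


def candidates(datastores, db_iops, growth_factor=1.0):
    """Return names of data stores with enough IOPS headroom for the DB.

    growth_factor > 1.0 models the IO demand scaling up.
    """
    needed = db_iops * growth_factor
    return [
        ds.name
        for ds in datastores
        if ds.max_iops - ds.current_iops >= needed
    ]


stores = [
    Datastore("ds-a", max_iops=20000, current_iops=15000),
    Datastore("ds-b", max_iops=50000, current_iops=10000),
]

# Both data stores fit the workload today; only ds-b still fits if demand doubles.
today = candidates(stores, db_iops=5000)
after_growth = candidates(stores, db_iops=5000, growth_factor=2.0)
```

Doing this by hand across every data store is the part that does not scale, which is where storage policies come in.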

 

VMDK vs RDM – the battle rages on :)

 

Oracle backup – Day 2 Operations

  • Requirement of DB backup and recovery
    • Short backup windows – with least production impact, and recoverable and repeatable
  • 3 levels of triggering DB backup on vSphere
    • application level backup (Oracle RMAN / Data pump)
    • vSphere level backup via VMware snapshots
    • storage level backup (storage snapshots)
  • For large DBs, DBAs traditionally prefer
    • DB level backup
    • storage based snapshots

 

Backup methods – Pros and Cons

  • For multi-TB DBs with a high rate of change
    • Oracle RMAN offers a fine level of granularity but is not always the fastest
    • A VM-level snapshot would be ideal, but snapshot removal can stun a VM for a long time (KB1002836)
    • Storage-based snapshots would be the fastest at the LUN / data store level, but there is no VMDK-level granularity (VMDKs of other apps are also part of the backup, so the backup takes longer)

 

What if we could?

  • Trigger backups / clones from a VM level with VMDK level granularity (ideal)
  • Do a storage-based snapshot / clone at the VM level (the fastest of the three approaches)

 

Introducing VVOLs

Challenges in Legacy Shared Storage Architectures

  • Create fixed-size, uniform LUNs
  • Lack of granular control
  • Complex provisioning cycles
  • LUN-centric storage configurations
  • Extensive manual bookkeeping to match VMs to LUNs
  • LUN granularity hinders per-VM SLAs
  • Over provisioning (better safe than sorry)
  • Wasted resources, wasted time, high costs
  • Frequent data migration
  • Every tier requires a different array
  • Providing multiple levels of service is hard

 

Instead – an app-centric model drives agility and QoS

  • Dynamic delivery of storage service levels when needed
  • Fine control of data services at the VM level
  • Common management across heterogeneous devices
  • Rapid provisioning
  • No over provisioning of resources
  • QoS automation
  • Simple change management

 

The SDS Model

  • Goal is to leverage SDS architecture to bring about storage efficiencies
  • Storage services are dynamically created and delivered on a per VM basis
  • Aligns storage requirements with those of the DB
  • Storage policies are leveraged to precisely meet application requirements
  • Reduces storage over provisioning, IT management cycles, and cost

 

The Policy Driven Control Plane

  • New management layer for SDS
  • Provides orchestration and automation of storage consumption
  • SPBM is VMware’s implementation of the control plane
  • SPBM maps application requirements dynamically to storage services

 

VMware vSphere Virtual Volumes – Integration framework for VM-aware storage

  • Virtual disks are natively represented on arrays
  • Enables VM granular storage operations using array-based data services
  • Extends vSphere SPBM to the storage ecosystem
  • Supports existing storage IO protocols (FC, iSCSI, NFS)
  • Based on T10 industry standards
  • industry-wide initiative supported by major storage vendors
  • included with vSphere

 

High Level Architecture

STO7549_vvol-architecture

  • storage containers
  • protocol endpoint
  • VASA provider

 

Storage Container

  • Logical storage constructs for grouping of VVOLs
  • Typically defined & setup by storage administrators in order to define storage capacity allocations and restrictions
  • capacity based on physical storage capacity
  • logically partition or isolate VMs with diverse storage needs and requirements
  • storage policy settings based on data service capabilities
  • minimum one storage container per array
  • maximum depends on the array

 

Protocol Endpoints (PE)

Access point that enables communication between ESXi hosts and storage array systems

  • part of the physical storage fabric
  • created by storage administrators

 

Scope of PEs

  • Compatible with all SAN and NAS protocols (iSCSI, NFS v3, FC and FCoE)
  • Can support any one of the above protocols at a given time
  • Existing multi-path policies and NFS topology requirements can be applied to the PE

 

VASA Provider (VP)

  • Software component developed by storage array vendors
  • ESX and vCenter Server connect to VASA provider
  • Provides storage awareness services
  • Single VASA provider can manage multiple arrays
  • Supports VASA APIs exported by ESX
  • VASA Provider can be implemented within the array’s management server or firmware
  • Responsible for creating VVOLs

 

VVOLs created for a VM’s VMDKs

For any VM on a VVOL-enabled data store, there are 5 different types of recognised VVOLs (KB2113013)

  • Config-VVOL – Metadata (includes VM home, vmx file, descriptor files for virtual disks, log files, etc)
  • Data-VVOL – VMDKs
  • Mem-VVOL – Swap files
  • Snap-VVOL – Snapshots
  • Other-VVOL – Vendor solution specific

 

Storage Policy Based Management (SPBM) – Array Capabilities

  • Publish capabilities
    Array-based features and data services
    Defines what an array can offer
    Advertised to ESX through VASA APIs

 

Example vCenter storage policy

 

SPBM Rule set 2

  • Tags
  • Policy based on storage tiering
  • datastore choice based on storage policy
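
A tag-based rule set boils down to a subset check: a data store is a valid choice if it carries every tag the policy asks for. A minimal sketch of that idea follows, assuming invented tag values (the data store names are borrowed from the demo setup in these notes); it is not the SPBM API.

```python
def compliant_datastores(policy_tags, datastore_tags):
    """Return the data stores whose tags include every tag the policy requires."""
    required = set(policy_tags)
    return sorted(
        name
        for name, tags in datastore_tags.items()
        if required <= set(tags)
    )


# Tag values are made up for illustration.
datastore_tags = {
    "DS-VVOL-Performance": {"tier:performance", "protocol:iscsi"},
    "DS-VVOL-ExPerformance": {"tier:extreme-performance", "protocol:iscsi"},
    "DS-VVOL-Capacity": {"tier:capacity", "protocol:nfs"},
}

oltp_policy = ["tier:performance"]
matches = compliant_datastores(oltp_policy, datastore_tags)
```

With these example tags, only `DS-VVOL-Performance` satisfies `oltp_policy`.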

 

Provision VM from templates using VMware level storage policies. 39 partners in the program

 

Yet another deep dive on VVOL?

  • A while since vSphere 6 and VVOLs were released
  • But customers still have questions
    • How do the various VVOL components interact?
    • What does the storage container look like?
    • How are VM files stored on a VVOL enabled data store?

 

OLTP DB Workload requirements – Example Use case

 

EMC UnityVSA 4.0 VM Setup (Import OVA)

  • VM UnityVSA
  • 2 vCPU / 12GB RAM
  • 6 NICs
  • 9 VMDKs (3 internal and 6 VMDKs for 4 different pools of 100GB each)

 

VSA VVOL – high level steps

  1. Add vCenter to UnityVSA to discover ESXi hosts
  2. Create VASA provider in vSphere
  3. Create Pools with Capability Profile
  4. Add software iSCSI interface to UnityVSA and ESXi hosts
  5. Add NAS server for NFS to UnityVSA
  6. Create iSCSI Storage container (DS-VVOL-Performance, DS-VVOL-ExPerformance, DS-VVOL-MultiTier)
  7. Create NFS Storage Container (DS-VVOL-Capacity)
  8. Check Protocol Endpoints (iSCSI & NFS)
  9. Create iSCSI and NFS data stores on vSphere
  10. Create VVOL enabled VMDK for Oracle DB

 

  • EMC Unity VVOL Deployment Guide
  • VASA Provider details
  • Create pools with capability profile
  • UnityVSA – storage containers map to a vSphere VVOL Datastore
  • PEs are IO access points from ESXi host to Unity system

 

Oracle DB VM setup

  • VM Name – Oracle-Unity-VVOL
  • OS – Oracle Enterprise Linux 7.2
  • 8 vCPUs / 16GB RAM
  • DB Name – ORAVVOL
  • Oracle 12.1.0.2.0 single instance DB with Grid Infrastructure
  • Install vCLI package on GOS

 

No VMDK is set to independent-persistent mode (which would disallow snapshots)

 

Anatomy of a VM (files)

  • Check this link for a good overview.
  • The VMDK is stored on the VM datastore, other files stored on the VVOL datastore

 

Oracle on VVOLs – Use cases

  • Use case 1 – scenario
    • Application is business critical – major application code change
    • QA team want to perform complete system testing
    • Unfortunately production DB size is 5TB and rising – a lot of time is taken to perform DB cloning
  • Use case here
    • Clone the DB to test application code change, DB software patch, OS patch
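
To put a rough number on why a full copy of a 5TB database hurts, here is a back-of-the-envelope sketch. The 500 MB/s throughput is an arbitrary assumption of mine, not a figure from the session, and the function name is made up.

```python
def full_copy_hours(size_tb, throughput_mb_s):
    """Hours needed to copy size_tb terabytes at throughput_mb_s MB/s.

    Uses decimal units (1 TB = 1,000,000 MB).
    """
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    size_mb = size_tb * 1_000_000
    return size_mb / throughput_mb_s / 3600


# 5 TB at an assumed sustained 500 MB/s is a little under 3 hours.
hours = full_copy_hours(5, 500)
```

That is per clone, and the time grows with the database, which is the motivation for snapshot-based cloning at the array level.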

 

Backup and recovery – DB Consistent Backup Snapshot

Step 1

  • Create script “hot_backup”
  • Script places DB in Begin Backup Mode
  • Create a VMware snapshot
  • End backup mode for Oracle DB
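
The ordering in Step 1 matters: the snapshot has to be taken while the database is in backup mode, and backup mode should be ended even if the snapshot fails. A minimal sketch of that ordering is below. I don't have the presenters' “hot_backup” script, so this is my own illustration; `run_sql` and `take_snapshot` are hypothetical hooks (for example, wrappers around sqlplus and a vSphere snapshot call).

```python
BEGIN_BACKUP = "ALTER DATABASE BEGIN BACKUP;"
END_BACKUP = "ALTER DATABASE END BACKUP;"


def db_consistent_snapshot(run_sql, take_snapshot):
    """Call take_snapshot() while the database is in backup mode.

    run_sql and take_snapshot are caller-supplied callables (hypothetical
    hooks, not part of any VMware or Oracle library).
    """
    run_sql(BEGIN_BACKUP)
    try:
        return take_snapshot()
    finally:
        # Always leave backup mode, even if the snapshot fails.
        run_sql(END_BACKUP)


# Dry run that only records the order of operations.
log = []
result = db_consistent_snapshot(
    run_sql=log.append,
    take_snapshot=lambda: log.append("SNAPSHOT") or "snap-1",
)
```

After the dry run, `log` holds the begin-backup statement, the snapshot marker and the end-backup statement, in that order.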

Step 2

  • Run cloning script “New-VMFromSnapshot.ps1”
  • Clones a VM from the DB-consistent snapshot

Step 3

  • Startup cloned DB VM via script “db_consistent_recovery”
  • Perform DB recovery manually using “recover database” command

 

Check out this whitepaper.

 

Conclusion

  • Oracle VVOL – A game changer for virtualised DBs and applications
  • Trustworthy and seamless backup and recovery
  • Simplified cloning and refresh operations
  • Effective and consistent storage based policy management

 

Some more info about Unity

 

Useful VVOLs links

 

Informative session. 4 stars.