Disclaimer: I recently attended VMworld 2017 – US. My flights were paid for by ActualTech Media, VMware provided me with a free pass to the conference and various bits of swag, and Tech Field Day picked up my hotel costs. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event. Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.
A quick post to provide some closing thoughts on VMworld 2017 and link to the posts I did during the event. Not in that order. I’ll add to this as I come across interesting posts from other people too.
Here are a few event-related articles I found interesting. You should also get along to the newly launched Blog Beat for some great coverage by a range of bloggers.
This was my third VMworld US event, and I had a lot of fun. I’d like to thank all the people who helped me out with getting there, the people who stopped and chatted to me at the event, and VMware for putting on a great show. I’m looking forward to (hopefully) getting along to it next year (August 26 – 30).
You can view the video of Kingston‘s presentation at Tech Field Day Extra VMworld US 2017 here, and download a PDF copy of my rough notes from here.
It’s A Protocol, Not Media
NVMe has been around for a few years now, and some people confuse it with a new kind of media to plug into their servers. It isn't; it's a standard specification for accessing Flash media via the PCI Express bus. There are a bunch of reasons why you might choose NVMe over SAS, including lower latency and less CPU overhead. My favourite thing about it, though, is the plethora of form factors available. Kingston touched on these in their presentation at Tech Field Day Extra recently. You can get them in half-height, half-length (HHHL) add-in cards (AIC), U.2 (2.5″) and M.2 sizes. To give you an idea of the use cases for each of these, Kingston suggested the following applications:
HHHL (AIC) card
Server / DC applications
High-end workstations
U.2 (2.5″)
Direct-attached, server backplane, just a bunch of flash (JBOF)
White box and OEM-branded
M.2
Client applications
Notebooks, desktops, workstations
Specialised systems
It’s Pretty Fast
NVMe has proven to be pretty fast, and a number of companies are developing products that leverage the protocol in an extremely efficient manner. Couple that with the rise of NVMe-oF solutions and you've got some pretty cool stuff coming to market. The price is also becoming a lot more reasonable, with Kingston telling us that their DCP1000 NVMe HHHL comes in at around “$0.85 – $0.90 per GB at the moment”. It's obviously not as cheap as things that spin at 7200RPM, but the speed is mighty fine. Kingston also noted that the 2.5″ form factor would be hanging around for some time yet, as customers appreciate the serviceability of the form factor.
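To put Kingston's quoted figure in context, here's some trivial arithmetic. The drive capacities below are examples for illustration, not Kingston's actual SKUs:

```python
# Rough $/GB arithmetic using the ~$0.85 - $0.90 per GB figure quoted
# by Kingston for the DCP1000. Capacities are made-up examples.
def drive_cost(capacity_gb: float, dollars_per_gb: float) -> float:
    """Approximate street price of a drive at a given $/GB rate."""
    return capacity_gb * dollars_per_gb

for capacity in (800, 1600, 3200):
    low = drive_cost(capacity, 0.85)
    high = drive_cost(capacity, 0.90)
    print(f"{capacity} GB: ${low:,.0f} - ${high:,.0f}")
```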
Flash media has been slowly but surely taking over the world for a little while now. The cost per GB is reducing (slowly, but surely), and the range of form factors means there’s something for everyone’s needs. Protocol advancements such as NVMe make things even easier, particularly at the high end of town. It’s also been interesting to see these “high end” solutions trickle down to affordable form factors such as PCIe add-in cards. With the relative ubiquity of operating system driver support, NVMe has become super accessible. The interesting thing to watch now is how we effectively leverage these advancements in protocol technologies. Will we use them to make interesting advances in platforms and data access? Or will we keep using the same software architectures we fell in love with 15 years ago (albeit with dramatically improved performance specifications)?
Conclusion and Further Reading
I’ll admit it took me a little while to come up with something to write about after the Kingston presentation. Not because I don’t like them or didn’t find their content interesting. Rather, I felt like I was heading down the path of delivering another corporate backgrounder coupled with speeds and feeds and I know they have better qualified people to deliver that messaging to you (if that’s what you’re into). Kingston do a whole range of memory-related products across a variety of focus areas. That’s all well and good but you probably already knew that. Instead, I thought I could focus a little on the magic behind the magic. The Flash era of storage has been absolutely fascinating to witness, and I think it’s only going to get more interesting over the next few years. If you’re into this kind of thing but need a more comprehensive primer on NVMe, I recommend you check out J Metz’s article on the Cisco blog. It’s a cracking yarn and enlightening to boot. Data Centre Journal also provide a thorough overview here.
You can view the video of Druva‘s presentation here, and you can download a PDF copy of my rough notes from here.
DMaaS
Druva have been around for a while, and I recently had the opportunity to hear from them at a Tech Field Day Extra event. They have combined their Phoenix and inSync products into a single platform, yielding Druva Cloud Platform. This is being positioned as a “Data Management-as-a-Service” offering.
According to Druva, the solution takes into account all the good stuff, such as:
Protection;
Governance; and
Intelligence.
It works with both:
Local data sources (end points, branch offices, and DCs); and
Cloud data sources (such as IaaS, Cloud Applications, and PaaS).
The Druva cloud is powered by AWS, and provides, amongst other things:
Auto-tiering in the cloud (S3/S3IA/Glacier); and
Easy recovery to any location (servers or the cloud).
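Druva haven't published the mechanics of their tiering, but auto-tiering across S3, S3-IA and Glacier is commonly expressed as an S3 lifecycle policy. The sketch below just builds such a policy document; the prefix and day thresholds are invented for illustration:

```python
# Illustration only: auto-tiering of backup data expressed as an S3
# lifecycle policy (S3 -> S3-IA -> Glacier). Druva's actual mechanism
# isn't public; prefix and thresholds here are assumptions.
def tiering_policy(prefix: str, ia_after_days: int, glacier_after_days: int) -> dict:
    """Build a lifecycle policy that tiers objects down over time."""
    return {
        "Rules": [{
            "ID": f"tier-{prefix}",
            "Filter": {"Prefix": prefix},
            "Status": "Enabled",
            "Transitions": [
                {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                {"Days": glacier_after_days, "StorageClass": "GLACIER"},
            ],
        }]
    }

policy = tiering_policy("backups/", ia_after_days=30, glacier_after_days=90)
```

A dict like this is the shape AWS expects when you apply a lifecycle configuration to a bucket (e.g. via boto3's `put_bucket_lifecycle_configuration`).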
Just Because You Can Put A Cat …
With everything there’s a right way and a wrong way to do it. Sometimes you might do something and think that you’re doing it right, but you’re not. Wesley Snipes’s line in White Men Can’t Jump may not be appropriate for this post, but Druva came up with one that is: “A VCR in the cloud doesn’t give you Netflix”. When you’re looking at cloud-based data protection solutions, you need to think carefully about just what’s on offer. Druva have worked through a lot of these requirements and claim their solution:
Is fully managed (no need to deploy, manage, support software);
Offers predictable lower costs;
Delivers linear and infinite (!) scalability;
Provides automatic upgrades and patching; and
Offers seamless data services.
I’m a fan of the idea that cloud services can offer a somewhat predictable cost model to customers. One of the biggest concerns faced by the C-level folk I talk to is the variability of cost when it comes to consuming off-premises services. The platform also offers source-side global deduplication, with:
Application-aware block-level deduplication;
Only unique blocks being sent; and
Forever incremental and efficient backups.
The advantage of this approach is that, as Druva charge based on “post-globally deduped storage consumed”, chances are you can keep your costs under control.
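The "only unique blocks being sent" idea can be sketched in a few lines. This is a deliberately minimal toy, not Druva's implementation, which is application-aware and globally deduplicated:

```python
import hashlib

# Minimal sketch of source-side block-level dedupe: split data into
# fixed-size blocks, hash each, and "send" only blocks the target
# hasn't seen before. Druva's real system is far more sophisticated;
# this just shows why only unique blocks cross the wire.
BLOCK_SIZE = 4096

def backup(data: bytes, seen: set) -> list:
    """Return only the blocks that actually need to be transmitted."""
    to_send = []
    for off in range(0, len(data), BLOCK_SIZE):
        block = data[off:off + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in seen:      # unique block: send and remember it
            seen.add(digest)
            to_send.append(block)
    return to_send

seen: set = set()
first = backup(b"A" * 8192 + b"B" * 4096, seen)  # "A" block repeats
second = backup(b"A" * 8192, seen)               # all seen already
```

With the hash index persisted across runs, every subsequent backup is "forever incremental": identical blocks cost nothing to protect again.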
It Feels Proper Cloudy
I know a lot of people who are in the midst of the great cloud migration. A lot of them are only now (!) starting to think about how exactly they’re going to protect all of this data in the cloud. Some of them are taking their existing on-premises solutions and adapting them to deal with hybrid or public cloud workloads. Others are dabbling with various services that are primarily cloud-based. Worse still are the ones assuming that the SaaS provider is somehow magically taking care of their data protection needs. Architecting your apps for multiple geos is a step in the right direction towards availability, but you still need to think about data protection in terms of integrity, not just availability. The impression I got from Druva is that they’ve taken some of the best elements of their on-premises and cloud offerings, sprinkled some decent security in the mix, and come up with a solution that could prove remarkably effective.
Here are my rough notes from “STO3194BU – Protecting Virtual Machines in VMware Cloud on AWS”, presented by Brian Young and Anita Thomas. You can grab a PDF copy of my notes from here.
VMware on AWS Backup Overview
VMware Cloud on AWS
VMware is enabling the VADP backup partner ecosystem on VMC
Access to native AWS storage for backup target
Leverages high performance network between Virtual Private Clouds
VMware Certified – VMware provides highest level of product endorsement
Product certification with VMware Compatibility Guide Listing
Predictable Life Cycle Management
VMware maintains continuous testing of VADP APIs on VMC releases
Customer Deployed – Same solution components for both on-premises and VMC deployments
Operational Consistency
Choice of backup methods – image-level, in-guest
Choice of backup targets – S3, EBS, EFS
Partner Supported – Partner provides primary support
Same support model as on-premises
VADP / ENI / Storage Targets
VADP
New VDDK supports both on-premises and VMC
VMware backup partners are updating existing products to use new VDDK to enable backup of VMC based VMs
Elastic Network Interface (ENI)
Provide access to high speed, low latency network between VMC and AWS Virtual Private Clouds
No ingress or egress charges within the same availability zone
Backup Storage Targets
EC2 based backup appliance – EBS and S3 storage
Direct to S3
Example Backup Topology
Some partners will support in-guest and image level backups direct to S3
Deduplicates, compresses and encrypts on EC2 backup appliance
Store or cache backups on EBS
Some partners will support vaulting older backups to S3
Summary
VADP based backup products for VMC are available now
Elastic Network Interface connection to native AWS services is available now
Dell EMC Data Protection Suite is the first VADP data protection product available on VMC
Additional VADP backup solutions will be available in the coming months
Dell EMC Data Protection for VMware Cloud on AWS
Data Protection Continuum – Where you need it, how you want it
Dell EMC Data Protection is a Launch Partner for VMware Cloud on AWS. Data Protection Suite protects VMs and enterprise workloads whether on-premises or in VMware Cloud
Same data protection policies
Leveraging best-in-class Data Domain Virtual Edition
AWS S3 integration for cost efficient data protection
Dell EMC Data Domain and DP Suite
Data Protection Suite
Protects across the continuum – replication, snapshot, backup and archive
Covers all consumption models
Broadest application and platform support
Tightest integration with Data Domain
Data Domain Virtual Edition
Deduplication ratios up to 55x
Supports on-premises and cloud
Data encryption at rest
Data Invulnerability Architecture – best-in-class reliability
Includes DD Boost, DD Replicator
Dell EMC Solution Highlights
Unified
Single solution for enterprise applications and virtual machines
Works across on-premises and cloud deployments
Efficient
Direct application backup to S3
Minimal compute costs in cloud
Storage-efficient: deduplication up to 55x to DD/VE
Scalable
Highly scalable solution using lightweight stateless proxies
Virtual synthetic full backups – lightning fast daily backups, faster restores
Uses CBT for faster VM-image backup and restore
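Changed Block Tracking is the reason image-level incrementals are fast: rather than reading the whole virtual disk, the backup asks vSphere which blocks changed since the last snapshot and copies only those. The disk and change map below are toy stand-ins for the real vSphere APIs:

```python
# Sketch of why CBT speeds up VM-image backups: copy only the blocks
# flagged as changed since the last backup, not the whole disk.
# The "disk" and changed-block set here are invented for illustration.
def incremental_backup(disk: dict, changed_blocks: set) -> dict:
    """Copy only the blocks flagged as changed since the last backup."""
    return {block_id: disk[block_id] for block_id in changed_blocks}

disk = {0: b"boot", 1: b"app", 2: b"logs", 3: b"data"}
changed = {2, 3}                       # CBT reports only these changed
delta = incremental_backup(disk, changed)
```

Combined with virtual synthetic fulls, these small deltas can be stitched into a full restore point without ever transferring a full image again.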
Solution Detail
Backup of VMs and applications in VMC to a DD/VE or AWS S3. The solution supports:
VM image backup and restore
In-guest backup and restore of applications using agents for consistency
Application direct to S3
ESG InstaGraphic
ESG Lab has confirmed that the efficiency of the Dell EMC architecture can be used to reduce monthly in-cloud data protection costs by 50% or more
ESG Research has confirmed that public cloud adoption is on the rise. More than 75% of IT organisations report they are using the public cloud and 41% are using it for production applications
There is a common misconception that an application, server, or data moved to the cloud is automatically backed up the same way it was on-premises
Architecture matters when choosing a public cloud data protection solution
Source – ESG White Paper – Cost-efficient Data Protection for Your Cloud – to be published.
Manage Backups Using a Familiar Interface
Consistent user experience in cloud and on-premises
Manage backups using familiar data protection UI
Extend data protection policies to cloud
Detailed reporting and monitoring
Software Defined Data Protection Policies
Dynamic Policies – Keeping up with VM data growth and smart policies
Supported Attributes
DS Clusters
Data Center
Tags
VMname
Data Store
VMfolder
VM resource group
vApp
Technology Preview
The Vision we are building towards (screenshot demos).
Further Reading
You can read more in Chad’s post on the solution. Dell EMC put out a press release that you can see here. There’s a blog post from Dell EMC that also provides some useful information. I found this to be a pretty useful overview of what’s available and what’s coming in the future. 4 stars.
You can view the video of NetApp‘s presentation here, and download a copy of my rough notes from here.
What’s In A Name?
There’s been some amount of debate about whether NetApp’s HCI offering is really HCI or CI. I’m not going to pick sides in this argument. I appreciate that words mean things and definitions are important, but I’d like to focus more on what NetApp’s offering delivers, rather than whether someone in Tech Marketing made the right decision to call this HCI. Let’s just say they’re closer to HCI than WD is to cloud.
Ye Olde Architectures (The HCI Tax)
NetApp spent some time talking about the “HCI Tax” – the overhead of providing various data services with first generation HCI appliances. Gabe touched on the impact of running various iterations of controller VMs, along with the increased memory requirements for services such as deduplication, erasure coding, compression, and encryption. The model for first generation HCI is simple – grow your storage and compute in lockstep as your performance requirements increase. The great thing with this approach is that you can start small and grow your environment as required. The problem with this approach is that you may only need to grow your storage, or you may only need to grow your compute requirement, but not necessarily both. Granted, a number of HCI vendors now offer storage-only nodes to accommodate this requirement, but NetApp don’t think the approach is as polished as it could be. The requirement to add compute as you add storage can also have a financial impact in terms of the money you’ll spend in licensing for CPUs. Whilst one size fits all has its benefits for linear workloads, this approach still has some problems.
The New Style?
NetApp suggest that their solution offers the ability to “scale on your terms”. With this you can:
Optimise and protect existing investments;
Scale storage and compute together or independently; and
Eliminate the “HCI Tax”.
Note that only the storage nodes have disks; the compute nodes get blanks. The disks are on the front of the unit and the nodes are stateless. You can’t have different tiers of storage nodes as it’s all one cluster. It’s also BYO switch for connectivity, supporting 10/25Gbps. In terms of scalability, from a storage perspective you can scale as much as SolidFire can nowadays (around 100 nodes), and your compute nodes are limited by vSphere’s maximum configuration.
There are “T-shirt sizes” for implementation, and you can start small with as little as two blocks (2 compute nodes and 4 storage nodes). I don’t believe you mix t-shirt sizes in the same cluster. Makes sense if you think about it for more than a second.
Thoughts
Converged and hyper-converged are different things, and I think this post from Nick Howell (in the context of Cohesity as HCI) sums up the differences nicely. However, what was interesting for me during this presentation wasn’t whether or not this qualifies as HCI or not. Rather, it was about NetApp building on the strengths of SolidFire’s storage offering (guaranteed performance with QoS and good scale) coupled with storage / compute independence to provide customers with a solution that seems to tick a lot of boxes for the discerning punter.
Unless you’ve been living under a rock for the last few years, you’ll know that NetApp are quite a different beast to the company first founded 25 years ago. The great thing about them (and the other major vendors) entering the already crowded HCI market is that they offer choices that extend beyond the HCI play. For the next few years at least, there are going to be workloads that just may not go so well with HCI. If you’re already a fan of NetApp, chances are they’ll have an alternative solution that will allow you to leverage their capability and still get the outcome you need. Gabe made the excellent point that “[y]ou can’t go from traditional to cloud overnight, you need to evaluate your apps to see where they fit”. This is exactly the same with HCI. I’m looking forward to seeing how they go against the more established HCI vendors in the marketplace, and whether the market responds positively to some of the approaches they’ve taken with the solution.
Here are my rough notes on “STO3331BUS – Cohesity Hyperconverged Secondary Storage: Simple Data Protection for VMware and vSAN” presented by Gaetan Castelein of Cohesity and Shawn Long, CEO of viLogics. You can grab a PDF of my notes from here.
Secondary Storage Problem
SDS has changed for the better.
Primary storage has improved dramatically
Moving from:
High CapEx costs
Device-centric silos
Complex processes
To:
Policy-based management
Cost-efficient performance
Modern storage architectures
But secondary storage is still problematic
Rapidly growing data
6ZB in 2016
93ZB in 2025
80% unstructured
Too many copies
45% – 60% of capacity for copy data
10 – 12 copies on average
$50B problem
Legacy storage can’t keep up
Doesn’t scale
Fragmented silos
Inefficient
Cohesity Hyperconverged Secondary Storage
You can use this for a number of different applications, including:
File shares
Archiving
Test / Dev
Analytics
Backups
It also offers native integration with the public cloud and Cohesity have been clear that you shouldn’t consider it to be just another backup appliance.
You can read more about Cohesity’s cloud integration here.
Use Cases
Simple Data Protection
Distributed File Services
Object Services
Multicloud Mobility
Test / Dev Copies
Analytics
You can use Cohesity with existing backup products if required or you can use Cohesity DataProtect.
Always-Ready Snapshots for Instant Restores
Sub-5 minute RPOs
Fully hydrated images (linked clones)
Catalogue of always-ready images
Instant recoveries (near-zero RTOs)
Integration with Pure Storage
Tight Integration with VMware
vCenter Integration
VADP for snap-based CBT backups
vRA plugin for self-service, policy-based management
CloudArchive
Policy-based archival
Dedupe, compression, encryption
Everything is indexed before it goes to the cloud – search files and VMs
Individual file recovery
Recover to a different Cohesity cluster
CloudReplicate
Replicate backup data to cloud
Deploy Cohesity to the cloud (available on Azure currently, other platforms soon).
Reduce TCO
You can move from “Legacy backup”, where you’re paying maintenance on backup software and deduplication appliances, to paying just for Cohesity.
Testimonial
Shawn Long from viLogics then took the stage to talk about their experiences with Cohesity.
People want to consume IT
“Product’s only as good as the support behind it”
Conclusion
This was a useful session. I do enjoy the sponsored sessions at VMworld. It’s a useful way for the vendors to get their message across in a way that needs to tie back to VMware. There’s often a bit of a sales pitch, but there’s usually also enough information in them to get you looking further into the solution. I’ve been keeping an eye on Cohesity since I first encountered them a few years ago at Storage Field Day, and their story has improved in clarity and coherence since then. If you’re looking at secondary storage solutions it’s worth checking them out. You’ll find some handy resources here. 3.5 stars.
Here are my rough notes from “PBO3334BUS – State of the Union: Everything multi-cloud, converged, hyper-converged and more!” presented by Chad Sakac. You can grab a copy of the PDF here.
Confusion
“There’s a lot of confusion inside the marketplace” and people are struggling to see the pattern.
The IT universe is $2.7 trillion and it’s growing at 2% CAGR (growth has slowed to GDP). The spending on on-premises infrastructure is around $1T and this figure is shrinking. The primary movements are towards:
SaaS (+60% CAGR); and
Cloud Native, AWS, Azure (+60% CAGR)
On-premises is comprised of:
Servers $100B
Network $100B
Storage $70B
Servers have been negative as a whole in terms of revenue. Funnily enough, blades are cold, and rack mounts are hot – SDS / SDN / SDDC is driving this. From a networking perspective, parts of Cisco are growing (wireless) and declining (switching and routing). Switch hardware is all mainly the same merchant silicon. Storage has been -9% CAGR for the last 12 (?) quarters.
This industry will consolidate. CI is growing a little bit, while HCI is on fire, with the HCI market being worth around $2.5B – $4B today.
The “Easy buttons” are growing but people still want to know “What the hell do I put where?”. If I have a cloud-first strategy – what does that actually mean? It’s also a financial decision – CapEx and OpEx.
CapEx vs OpEx
Where?
Should it be on or off-premises (not managed service, but multi-tenant, public cloud)? You need to consider:
Data gravity (stuff has to live somewhere);
Governance (so many people don’t understand this); and
What you have / don’t have (sometimes, there are constraints on what you can do).
Value
You running it vs Someone else running it. Does the act of doing “it” differentiate you? Remember that there’s no one right answer for this with any given customer.
We have to start from the top of the pyramid, remember that “[c]loud is an operating model, not a place”. It provides you with:
Single control point
Automation
Metering
Self-service
Capacity management
Monitoring and reporting
Built-in security
Service-level choice
Hybrid Cloud Platforms
Top concerns among public cloud users
41% cost predictability and cost of data transfer
28% latency performance
41% compliance
Top concerns among private cloud users
34% lack of access to value added cloud services – like databases and analytics platforms
Public and Private Cloud complement each other perfectly, and an integrated catalogue of services can serve organizations well – provided they can be managed and used in a unified manner.
Driven by workload variations – emergence of “clouds built for purpose”
Mission-critical applications
General purpose applications
Cloud-native applications
Complexity = Build your own = Hard. Platform, Orchestration, Virtualisation, Servers, Storage and Network <- all hard to get right. >70% of IT resources and budgets go to “snowflake” solutions.
Dell EMC Enterprise Hybrid Cloud
You can potentially achieve:
67% total savings over three years vs build your own
2x faster ITaaS delivery vs build your own
100% savings for level 2-3 platform support vs build your own
42% upgrade savings over three years vs build your own
74% faster time to upgrade vs build your own
New: EHC on VxRack SDDC
Turnkey cloud experience at rack-scale.
*Demo – EHC 4.1.2
EHC on VxRail – What’s New
Multi-site support for increased scale
Application and VM-level disaster recovery
Improved automation-driven install and upgrade
Customers needed the same thing, but for cloud-native apps. What if you want to start at the infrastructure layer?
“Anyone who says something is simple and flexible is a salesperson.”
Dell EMC CI and HCI in Simple Terms
The simplest, most powerful, most integrated HCI appliance … for customers standardized on VMware – VxRail is where to start.
If considering VxRail – but are ready for network and SDN transformation – VxRack SDDC is for you.
For a scalable and flexible HCI system with hypervisor of choice or bare metal … can start small and scale out – VxRack FLEX is for you.
A flexible HCI Appliance that can start small for customers who want hypervisor choice – we have the XC Series.
For workloads with specific capacity, performance, or data service needs – VxBlock is for you.
VxRail 4.5
Latest VMware technology
vSphere 6.5 U1
vSAN 6.6
Enhanced security options
More enterprise capabilities
10x faster expansion, 50% better IOPS.
VxRack Update
Latest and greatest VMware software stack including VMware Cloud Foundation 2.2
vSphere 6.5 U1
vSAN 6.6.1
NSX 6.3.3
Single Management Cluster and Management Domain for easier scaling up to 8 racks
Now 40 Dell EMC PowerEdge configurations for both expanded high performance and entry level options in cores, memory and CPUs
PowerEdge 14G Servers
The bedrock of the modern DC. Hardware is low margin, software is high margin. To succeed in the hyperconverged world – you’ll need to be a mega server vendor.
New: Cloud Flex – cloud economics for HCI and HCO
Cloud-like economic model – eliminate acquisition costs and move to a straightforward OpEx cost structure
No obligation – Customers experience the benefits of HCI without a long term commitment
Price drops over time – Ensures monthly rate is competitive with decreasing price of technology
Cost advantages over Public Cloud
VDI Workloads – Up to 32% 1st year savings, Up to 62% in 4th year
General purpose virtualized server workloads – Up to 47% 1st year savings, Up to 67% in 5th year
Microsoft SQL Server workloads – Up to 41% 1st year savings, Up to 63% in 5th year
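The Cloud Flex pitch is essentially a declining subscription versus a flat public-cloud bill. The sketch below models that shape; all dollar figures and decline rates are invented for illustration, and the percentage savings quoted above come from Dell EMC's own comparisons:

```python
# Hypothetical model of the Cloud Flex economics: a monthly HCI rate
# that steps down each year ("price drops over time"), compared with a
# flat public-cloud spend. All numbers here are made up.
def cumulative_cost(monthly_rate: float, annual_decline: float, years: int) -> float:
    """Total spend over N years with the rate declining annually."""
    total = 0.0
    rate = monthly_rate
    for _ in range(years):
        total += rate * 12
        rate *= (1 - annual_decline)   # rate drops for the next year
    return total

hci = cumulative_cost(10_000, 0.10, 4)    # declining HCI subscription
cloud = cumulative_cost(12_000, 0.0, 4)   # flat public-cloud spend
```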
Builders to Buyers
More and more dollars are shifting towards buy, but the majority of the world is still in the traditional “best of breed / I want to build it myself”. Have to keep innovating in the point technologies, such as Dell EMC Data Protection Suite for Applications. This is also built in to VMware Cloud on AWS.
At this point Chad was really running out of time, but here are a few other things to check out:
Dell EMC Ready Solutions – Designed to reduce risk, and accelerate the rate at which people can do stuff. You can be spending your time better. These are available as:
Here are my rough notes from “SER1166BU – Housekeeping Strategies for Platform Services Controller-Expert Talk”, presented by Jishnu Surendran Thankamani and Agnes James, both of whom are GSS employees with VMware. You can grab a PDF copy of them here.
Know more about PSC
Infrastructure Services offered by PSC
VMDir – internally developed LDAP service
Single Sign-on (IDMD, STS, SSOAdmin, LookupService)
VMware Certificate Authority
Licensing
Certificates
VMware Endpoint Certificate Manager
Each node has one Machine Endpoint Certificate
Solution User Certificates
machine
vsphere-webclient
vpxd
vpxd-extensions
Right Decisions at the Right Time
Topology Based Best Practices
Embedded PSC
Expected to be simple topology with easy maintenance
Availability management is a matter of protecting a single machine (vCenter HA)
External PSC
Expected to be used with multiple vCenters involved
Availability management based on load balancer options
When more than one PSC is involved replication becomes a point of interest
Maintain same build of PSCs
Use sites to group PSCs in multiple HA groups – PSCs behind a load balancer
Latency between PSCs – as low as possible
Configuration Maximums
Maximum number of PSCs supported in replication – 8 (6.0), 10 (6.5)
Maximum number of PSCs behind load balancer – 4
Maximum vCenters in single SSO domain – 10 (6.0 and 6.5), 15 (6.5 U1)
Group membership per user for best performance – 1015
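The maximums above lend themselves to a simple sanity check when sketching a topology. This toy validator just encodes the numbers from the session (verify them against VMware's official configuration maximums before relying on them):

```python
# Toy validator for the PSC configuration maximums listed above.
# Values are as presented in the session; check VMware's ConfigMax
# documentation for the authoritative numbers.
MAXIMUMS = {
    "6.0":   {"pscs_replicating": 8,  "pscs_behind_lb": 4, "vcenters": 10},
    "6.5":   {"pscs_replicating": 10, "pscs_behind_lb": 4, "vcenters": 10},
    "6.5U1": {"pscs_replicating": 10, "pscs_behind_lb": 4, "vcenters": 15},
}

def validate(version: str, pscs: int, pscs_behind_lb: int, vcenters: int) -> list:
    """Return a list of maximums the proposed design exceeds."""
    limits = MAXIMUMS[version]
    problems = []
    if pscs > limits["pscs_replicating"]:
        problems.append(f"{pscs} PSCs exceeds replication maximum "
                        f"of {limits['pscs_replicating']}")
    if pscs_behind_lb > limits["pscs_behind_lb"]:
        problems.append(f"{pscs_behind_lb} PSCs behind a load balancer "
                        f"exceeds maximum of {limits['pscs_behind_lb']}")
    if vcenters > limits["vcenters"]:
        problems.append(f"{vcenters} vCenters exceeds SSO domain "
                        f"maximum of {limits['vcenters']}")
    return problems
```

For example, nine replicating PSCs is fine on 6.5 but over the limit on 6.0.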
Factors for Design Decisions (each area lists the available choices, with the justification for choosing it and the implication of doing so)
Deployment Topology
Embedded
Justification: Reduced resource utilisation for management; VCHA availability needed on the PSC as well
Implication: VCs in Linked Mode is not a supported topology
External
Justification: Multi-VC and single management access
Implication: More VMs to manage
SSO Domain
One
Justification: Share authentication and license data across components and regions / “disposable” PSC
More than one
Justification: Embedded PSCs / replication requirements are not met
Implication: Separate availability / management practice
Replication Topology
Linear
Justification: No manual intervention; agreements made in deployment order
Implication: SPoF possible in the more-than-two PSC case
Ring
Justification: Each PSC with two replication partners
Implication: CLI must be used
PSC HA
Standby PSC without load balancer
Justification: Load balancer management overhead is a constraint / manual failover acceptable
Here are my rough notes from “STO2063BU – Architecting Site Recovery Manager to Meet Your Recovery Goals” presented by GS Khalsa. You can grab a PDF version of them from here. It’s mainly bullet points, but I’m sure you know the drill.
Terminology
RPO – Last viable restore point
RTO – How long it will take before all functionality is recovered
You should break these down to an application or service-tier level.
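As a toy illustration of the distinction — RPO measures how much data you lose back to the last viable restore point, RTO measures how long you're down — here's a Python sketch comparing per-tier measurements against targets. All timestamps, tier names, and targets are invented examples:

```python
# Toy RPO/RTO illustration, broken down per application tier as the session
# suggests. All values here are made up for the example.
from datetime import datetime, timedelta

failure = datetime(2017, 8, 30, 10, 0)

apps = {
    # app: (last viable restore point, time full functionality was recovered)
    "tier1-db":  (datetime(2017, 8, 30, 9, 55), datetime(2017, 8, 30, 10, 30)),
    "tier3-dev": (datetime(2017, 8, 30, 4, 0),  datetime(2017, 8, 30, 18, 0)),
}

targets = {
    # app: (RPO target, RTO target)
    "tier1-db":  (timedelta(minutes=15), timedelta(hours=1)),
    "tier3-dev": (timedelta(hours=24),   timedelta(hours=24)),
}

for app, (restore_point, recovered) in apps.items():
    rpo = failure - restore_point   # data you lose
    rto = recovered - failure       # time you are down
    rpo_ok = rpo <= targets[app][0]
    rto_ok = rto <= targets[app][1]
    print(f"{app}: RPO {rpo} (ok={rpo_ok}), RTO {rto} (ok={rto_ok})")
```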
Protection Groups and Recovery Plans
What is a Protection Group?
Group of VMs that will be recovered together
Application
Department
System Type
?
Different depending on replication type
A VM can only belong to one Protection Group
How do protection groups fit into Recovery Plans?
[Image via https://blogs.vmware.com/vsphere/2015/05/srm-protection-group-design.html]
vSphere Replication Protection Groups
Group VMs as desired into Protection Groups
What storage they are located on doesn’t matter
Array Based Protection Groups
If you want your protection groups to align to your applications – you’ll need to shuffle storage around
Policy Driven Protection
New style Protection Group leveraging storage profiles
High level of automation compared to traditional protection groups
Policy based approach reduces OpEx
Simpler integration of VM provisioning, migration, and decommissioning
This was introduced in SRM 6.1
How Should You Organise Your Protection Groups?
More Protection Groups
Higher RTO
Easier testing
Only what is needed
More granular and complex
Fewer Protection Groups
Lower RTO
Less granular, complex and flexible
This varies by customer and will be dictated by the appropriate combination of complexity and flexibility
Topologies
SRM Supports Multiple DR Topologies
Active-Passive Failover
Dedicated resources for recovery
Active-Active Failover
Run low priority apps on recovery infrastructure
Bi-directional Failover
Production applications at both sites
Each site acts as the recovery site for the other
Multi-site
Many-to-one failover
Useful for Remote Office / Branch Office
Enhanced Topology Support
There are a few different topologies that are supported.
[Images via https://blogs.vmware.com/virtualblocks/2016/07/28/srm-multisite/]
10 SRM pairs per vCenter
A VM can only be protected once
SRM & Stretched Storage
[Image via https://blogs.vmware.com/virtualblocks/2015/09/01/srm-6-1-whats-new/]
Supported as of SRM 6.1
SRM and vSAN Stretched Cluster
[Image via https://blogs.vmware.com/virtualblocks/2015/08/31/whats-new-vmware-virtual-san-6-1/]
Failover to the third site (not the 2 sites comprising the cluster)
Enhanced Linked Mode
You can find more information on Enhanced Linked Mode here. It makes it easier to manage your environment and was introduced in vSphere 6.0.
Impacts to RTO
Decision Time
How long does it take to decide to failover?
IP Customisation
Workflow without customisation
Power on VM and wait for VMware Tools heartbeats
Workflow with IP customisation
Power on VM with network disconnected
Customise IP utilising VMware Tools
Power off VM
Power on VM and wait for VMware Tools heartbeats
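A rough way to see the cost of IP customisation is to count the extra power cycle it adds per VM. This Python sketch stubs out the workflow steps above with invented timings (real numbers vary wildly by environment and guest OS):

```python
# Sketch of the two SRM power-on workflows, using stub step timings to show
# why IP customisation adds a full extra boot cycle per VM. The timings below
# are invented for illustration only.

BOOT = 60        # assumed seconds to power on and get VMware Tools heartbeats
CUSTOMISE = 20   # assumed seconds to inject IP settings via VMware Tools
POWER_OFF = 10   # assumed seconds for guest shutdown

def recovery_time(customise_ip):
    """Return (steps, total seconds) for one VM under either workflow."""
    steps = []
    if customise_ip:
        steps += [("power on (network disconnected)", BOOT),
                  ("customise IP via VMware Tools", CUSTOMISE),
                  ("power off", POWER_OFF)]
    steps += [("power on, wait for Tools heartbeats", BOOT)]
    return steps, sum(t for _, t in steps)

for mode in (False, True):
    steps, total = recovery_time(mode)
    print(f"customise_ip={mode}: {total}s over {len(steps)} steps")
```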
Alternatives
Stretched Layer 2
Move VLAN / Subnet
It’s going to take some time to do when you fail over a guest
Priorities and Dependencies vs Priorities Only
Organisation for lower RTO
Fewer / larger NFS datastores / LUNs
Fewer protection groups
Don’t replicate VM swap files
Fewer recovery plans
VM Configuration
VMware Tools installed in all VMs
Suspend VMs on Recovery vs Power Off VMs
Array-based replication vs vSphere Replication
Recovery Site Sizing
vCenter sizing – it works harder than you think
Number of hosts – more is better
Enable DRS – why wouldn’t you?
Different recovery plans target different clusters
Recommendations
Be Clear with the Business
What is / are their
RPOs?
RTOs?
Cost of downtime?
Application priorities?
Units of failover?
Externalities?
Do you have Executive buy-in?
Risk with Infrequent DR Plan Testing
Parallel and cutover tests provide the best verification, but are very resource intensive and time consuming
Cutover tests are disruptive, may take days to complete, and leave the business at risk
Frequent DR Testing Reduces Risk
Increased confidence that the plan will work
Recovery can be tested at any time without impact to production
Test Network
Use VLAN or isolated network for test environment
Default “auto” setting does not allow VM communication between hosts
Different PortGroup can be specified in SRM for test vs actual run
Specified in Network Mapping and / or Recovery Plan
Test Network – Multiple Options
Two Options
Disconnect NSX Uplink (this can be easily scripted)
Use NSX to create duplicate “Test” networks
RTO = dollars
*Demos
Conclusion and Further Reading
I enjoy these kinds of sessions, as they provide a nice overview of the product capabilities that ties in well with business requirements. SRM is a pretty neat solution, and something you might consider using if you need to move workload from one DC to another. If you’re after a technical overview of Site Recovery Manager 6.5, this site is pretty good too. 4.5 stars.
Here are my notes on gifts, etc, that I received as an attendee at VMworld US 2017. Apologies if it’s a bit dry but I’m just trying to make it clear what I received during this event to ensure that we’re all on the same page as far as what I’m being influenced by. I’m going to do this in chronological order, as that was the easiest way for me to take notes during the week. While every attendee’s situation is different, I took 3 days of holiday time and 3 days of training time to be at this event.
Friday
My employer paid for my taxi to the airport. I flew Qantas economy class from BNE – LAX – LAS courtesy of ActualTech Media. In LA I had a 4 hour layover so I had a breakfast burger at one of the over-priced eateries in Terminal 4. This was paid for by my employer. My taxi to the hotel was also covered by my employer. I stayed at New York New York. The cost of this was very kindly covered by Tech Field Day. On Friday night I did non-VMworld related things (like going to an awesome Kevin Seconds / The Selecter / Dropkick Murphys / Rancid concert) at my own expense.
Saturday
I had a few meals over the weekend – these were covered by my employer.
Sunday
On Sunday I went to the conference venue and picked up my VMworld backpack (containing a notepad, pen, t-shirt and water bottle). That night there was an attendee welcome reception in the Solutions Exchange. I had 3 Ballast Point beers. I also picked up:
A Veeam stubbie cooler (they call them Koozies in the US) and a carry bag;
A Cohesity vExpert backpack (containing a 6000mAh / 3.7V Giga Charger, h2go arc water bottle, socks, chill copper vacuum 20oz drink container).
Mark Browne very kindly covered my entry to the VMunderground party where I had 3 Lagunitas Aunt Sally beers and some finger food.
Monday
I started the conference with the classic “Continental Breakfast”, which consisted of a range of fruit and some orange juice. For lunch I had a caesar salad and some reasonably tasty salmon.
I did another whip around the Solutions Exchange and picked up:
Tuesday
Most of my Tuesday was spent at Tech Field Day Extra. This was held at the Delano. In the suite I had some coffee, 2 Krispy Kreme donuts and some water. Kingston gave each of us a 64GB USB stick. I had pasta and a bread stick for lunch. Druva gave each of us socks and a vTrail map book.
I took an Uber to Lotus of Siam. This was paid for by Tom Hollingsworth. At dinner I had 3 large Singha beers and various dishes, including the garlic prawns and deep-fried banana. It was, as always, very tasty. The meal was paid for by people from Scale Computing, Datrium, Druva, Silicon Valley PR, and Tech Field Day. Transport back to the strip was courtesy of Cody Bunch and his people mover.
Wednesday
I grabbed some fruit (the “classic continental”) for breakfast and some coffee. I picked up a vExpert swag bag (containing an “I heart vSphere” sticker, a VMware-branded Tritan water bottle, VMware-branded 4000mAh battery, a $5 discount off the Host Resources Deep Dive book, a vSAN sticker and a USB-C / Lightning cable) from the VMTN & Community area, along with a VMTN Network Member t-shirt.
I did a whip around the Solutions Exchange and also picked up:
I had briefings and sessions throughout the official lunch period, so I bought a Rocket burger, fries and a chocolate shake from the Johnny Rockets at the conference centre. This was covered by my employer. I did a little time at the VMUG booth and picked up a VMUG stubbie cooler and sticker. I then grabbed an IBM WebCam cover on my way out the door. At the appreciation party I had 3 “hot dogs”, a pretzel, a Shock Top Belgian White beer and 3 Goose IPA beers. The hot dogs are in quotation marks because I’m not sure what they were but given an appropriate amount of mustard and tomato sauce they were close to edible. At the end of the event I retreated to New York New York for 2 Heinekens and a slice of pizza.
Thursday
For breakfast I had the classic continental, consisting of a fruit cup and orange juice. I randomly encountered David Glynn and he gave me a VMworld backpack. For the last lunch of the conference I had some ham and turkey sandwiches, salad and a coffee. Once the conference finished up I shared a Lyft to the airport with a friend and his wife. Please now enjoy this photo of a baseball card with my likeness on it.