VMware – VMworld 2016 – STO7903 – An Industry Roadmap: From storage to data management

Disclaimer: I recently attended VMworld 2016 – US.  My flights were paid for by myself, VMware provided me with a free pass to the conference and various bits of swag, and Tech Field Day picked up my hotel costs. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.


Here are my very rough notes on “STO7903 – An Industry Roadmap: From storage to data management” presented by Christos Karamanolis. Note that this is a high-level, forward-looking presentation, so the standard disclaimer (none of this may come to pass, etc) applies.

STO7903

Data: tremendous business value potential. Information growth (from now until 2020, the size of the digital universe will about double every 2 yrs). Driverless cars produce approximately 4TB of data per day – Where does all the data go?

 

The decline of the disk array

STO7903_Disk_Decline

“Enterprise storage market sees continued decline … total capacity shipments down 4% year over year” – IDC June 2016

 

The rise of SDS

The world of IT is changing rapidly:

  • infrastructure sprawl
  • hybrid clouds
  • rapid app evolution
  • regulatory “spaghetti”

Traditional IT is still obsessed with infrastructure

Enter the software-defined data centre. HCI overcomes legacy limitations with a seamlessly integrated architecture:

  • Hyperconverged software
  • compute, storage and networking
  • tightly integrated software stack
  • Industry-standard hardware
  • convergence of physical storage on x86 hardware
  • building block approach

 

It’s a brave new IT world. With SDDC, HCI and VSAN, VMware helps customers approximate the operational efficiencies of mega clouds

1. infrastructure management

2. application lifecycle management

Now, we are extending these two dimensions to a hybrid IT world

 

Open challenge: Data Lifecycle Management

Data is in chains – storage products come with data services that are tightly coupled with physical infrastructure

  • snapshots / clones
  • replication
  • deduplication
  • checksums
  • encryption

It doesn’t have to be this way!

 

Decouple data services from

  • physical infrastructure boundaries
  • storage (persistence) layer

VMware is already offering data protection products that are decoupled from storage. Current focus: disaster recovery

 

vSphere Replication

  • per-VM, host-based replication
  • network-efficient by replicating only changed data
  • included with vSphere Essentials Plus and higher editions

 

VMware Site Recovery Manager

  • recovery plans for 1000s of VMs
  • Non-disruptive recovery testing
  • automated DR workflows
  • integrated with the VMware stack
  • eliminates complexity and risk
  • fast and highly predictable RTOs
  • policy-driven DR control

 

Data storage disruptors

  • Public clouds: compelling storage options
    • cost-effective archival storage
    • range of availability options
    • not meeting all requirements for Enterprise primary storage
  • New compute consumption models
    • pay as you go infrastructure
    • software as a service

Data is escaping the boundaries of IT organisations.

 

VMware vCloud Availability: DR to the cloud

STO7903_DR_Cloud

“Today’s storage products do not meet the requirements of the evolving IT industry”

 

Long-term example: the “portable snapshot”

Imagine snapshots:

  • decoupled from storage
  • mobile across physical locations
  • archived anywhere
  • recovered anywhere

Portable snapshot use case: data protection

 

STO7903_DP1

With this technology, your workloads become a heck of a lot more portable.

STO7903_DP2

 

Scalable data lifecycle management

Multiple Use cases

  • Near line protection
  • offline protection
  • disaster recovery
  • global catalog

With this, you can have:

  • data protection by policy
  • across locations, storage
  • global catalog of data
  • diverse recovery workflows – locations, storage
  • granular recovery options
    • files and folder
    • VM and / or VMDK
    • application

 

Portable snapshot use case: Application Mobility

Instantaneous app provisioning

  • examples – test and dev in cloud, production on premises, application recovery natively on public cloud, elastic expansion of application components
  • cloud orchestration tools – discovery, monitoring, reporting

Challenges

  • image management, conversion
  • archival -> primary, instantly!

 

Applicability to new generation applications?

  • application awareness is critical
    • group crash consistency
    • semantic consistency (application quiescing)

But what about distributed applications using micro-services?

  • requirements: elasticity, mobility, fast deployment

 

A distributed File System for Cloud-native Apps

  • Hyperconverged distributed FS
  • Relies on block storage (VSAN)
  • Scalable data volume sharing
  • Efficient image management
  • Group-Consistent snapshots / clones for stateful distributed applications

STO7903_DFS

Exaclones!

 

How do you track and control all that data?

Data governance

  • define policies and standards for data handling
  • quantify cost vs risk
  • demonstrate legal compliance
  • protect confidentiality, integrity, availability of information assets

 

Other “decoupled” data services

  • end-to-end data integrity: checksums
  • space efficiency: dedupe / compression
  • multi-tenancy: access control, isolation
  • security: encryption, attribution

 

Focus on data management, not infrastructure

  • data repair and recovery
  • test and development
  • file or DB recovery
  • data archival
  • data analytics

 

I love that we’re moving away from infrastructure to an application focus – this is a good sign for the industry. I look forward to seeing what develops over the next few years. Duncan did a much better write-up on this session than me – you can read it here. Great session. 4 stars.

 

 

 

VMware – VMworld 2016 – CTO7516 – Ask the Experts – Titans of Tech


Here’re my rough notes from “CTO7516 – Ask the Experts – Titans of Tech”. The panel-style session was moderated by Rick Scherer, Lead Global Architect at EMC. The participants were:

CTO7516_2

The content of this session is driven by the audience questions, so if my notes are a little weird, that’s my excuse.

 

What’s the future of the composable (yes, I know it’s not a real word) infrastructure market?

Chad – it has a fit with on-premises variations of cloud-native stacks, creating highly programmable, low-level infrastructure APIs. “Right now, composable is more buzzword than reality.”

Matt – this has been around for a long time. You need to be down in the deep optimisation bowels – this happens with the hyper scale providers

Jason – hyper scale vs traditional IT.

Matt – modularity of composability

 

Storage and network has gone down cost-wise, but not compute? Is this hindering cloud?

Chad – AWS and Azure are profitable. The pricing war ($/GB) has tapered off.

Matt – compute price has dropped the fastest the furthest over the last few decades. Cost of cloud is rental economics. For the same reason we don’t furnish our houses with rental furniture (except maybe if you need it short-term, like when you’re selling your house).

Chad – the cost of WAN has been the slowest to drop. Latency is still a problem, so you still end up with compute co-located with storage/data.

All the different public cloud options feels like going to different operating systems in the old days.

Kit – a lot of people have very specific functions for things. Some people are fine with going all in, knowing they’ll be “locked in” (everything requires effort to move on and off). Other people have rules for developers around what they can use from a public cloud service perspective.

Matt – we haven’t done a good job of providing a framework for evaluating what we need. Portability’s the holy grail to which we’ve never really gotten.

Jason – traditional view of IaaS and PaaS is gone. We’re in an expansion phase.

Chad – it’s all about benefit relative to complexity to move. How long does an app live in an enterprise? 20 years. What about your kid? 20 years. You just had a “cloud baby”. Prophylactic (Cloud Foundry) forces you to not bind to a specific service.

Rick – originally trying to solve a problem that they might not actually have.

Jason – move towards more application-centric view of the world. People are thinking about what the application needs, and having the infrastructure adapt to that.

Matt – altitudes of developers have shifted in the last 15 years in terms of where they slot in to the infrastructure.

Kit – evolution and standardisation of the hardware stack. Linux is becoming a more standard way of doing the OS. With containers – the important thing is the application and the required libraries.

Chad – “OpenStack has lost a little of its mojo.”

Matt – it was all about running your infrastructure, not about applications, etc.

Kit – what does OpenStack want to be when it grows up? An on-premises version of AWS? A cheaper version of VMware? Trying to be everything to everyone.

 

Gartner’s IaaS Magic Quadrant refers to Google as also-ran, SoftLayer as also-ran/niche? Are they on track?

Matt – it’s all about sourcing models. What you source from whom and where. Buzzwords are bad for clarity in the industry.

Jason – MQ says more about Gartner than it does the industry. People are doing more than just IaaS though – so what did they evaluate?

Chad – Gartner do not believe in degrees of hybridity – it’s all going to public. Chad disagrees with this, the workloads will go where they’re going to go.

Matt – It’s a useful tool to provoke debate.

Chad – the taxonomy that Gartner and IDC have built forced vendors to change the way they counted their products (including SKUs) for nonsensical reasons.

Rick – Now that they’re Dell they’re leaders in 21 MQs :)

 

What’s the next big thing?

Chad – 3-5 years talking about maturity of tech that is fringe today. Containers, eg. NoSQL. Probably still be trapped by Larry 5 years from now.

Matt – CIOs getting closer to CEOs. Seeing interesting combinations of apps, tech and business processes. IoT in fishing, big data in farming. Starting to get over the hump of being overwrought with IaaS.

Kit – NVRAM – persistent storage with RAM latency. FPGAs. GPGPUs, e.g. Nvidia. As these technologies converge, what does that mean? How do we make it accessible for developers and application teams?

Jason – the whole AI space – the impact of that will be huge.

Chad – Check out D-Wave – Canadian startup – starting to reach commercial application.

Kit – Quantum crypto.

Jason – IBM has a quantum computer they put on the cloud – check out Bluemix.

 

Culture eats strategy for breakfast.

Chad – it is the largest inhibitor to tech adoption. Can give organisations near-death experiences. Individuals who are empathetic but passionate and driven, they can effect change ridiculously fast. We want to get here, but our culture stops us. But why? You need the right person at the right level to make that change happen.

Kit – no company is special in this regard. It takes the “heroic” acts of some people to get changes done. Fail whale in the early days of Twitter. There was a guy who fixed it up – needed a big change though.

Chad – people underestimate the power of the individual even inside giant corporate machines. People rally to good ideas.

Jason – micro-services is a people problem. Getting people to work together in a different way through technology.

 

SDDC and the military. What is the future of “industry certifications” (particularly in security) now that we’re all about hyper converged infrastructure?

Chad – it’s a matter of time and effort. It’s not a technology problem – it’s policy and governance.

Kit – we saw this with virtualisation as well. It takes time.

Chad – There are people whose job it is to lobby the govt on this kind of thing. “I don’t know how they jump out of bed every day”. “You want to buy two for twice the price?”

 

Interesting session. I’d like to revisit some of these topics in the near future. 4 stars.

 

VMware – VMworld 2016 – STO8718-SPO – Building Next-Gen Data Protection for VMware Environments with Rubrik


Here are my rough notes from “STO8718-SPO – Building Next-Gen Data Protection for VMware Environments with Rubrik” presented by Chris Wahl and Chris Gurley.

STO8718-SPO_Intro

 

Agenda

  • What is Cloud Data Management?
  • Protection via policy-driven SLAs
  • Restoring Services

 

Cloud Data Management

What are customers doing? Looking to build more cloud-like environments for legacy and next-generation applications.
So what about Rubrik? They “deliver killer applications to democratise public cloud for enterprises with an easy button to protect, manage, and secure data everywhere”.
Wrapping data in an intelligent software fabric

  • All your data. Activated.
  • Assign policies
  • Layer on security
  • Track compliance
  • Introduce automation
  • Define user access
  • Instantly search.

Topology agnostic – manage data everywhere.
Accelerate data for lifecycle usage

  • Backup & Recovery
  • Search & Analytics
  • Copy Data Management (so hot right now)
  • Disaster Recovery
  • Archival & Compliance
  • Cloud Instantiation

Rubrik is a programmatic software fabric

  • API-first architecture – Rubrik consumes the same APIs
  • Automation – Create, select, execute. Repeat.
  • Extensible – APIs designed to be resilient to change.

Automation Real-world use cases

  • Post-script automation for Linux files protection
  • Automated management in vCenter for objects and tags
  • PowerShell automation for DSC
  • Self-service via vRealize Suite
  • New workload provisioning for DevOps shops with Chef and Puppet

Orchestrating data across clouds

An intelligent software fabric to orchestrate data retention across public and private clouds

  • Security
  • SLA-based tiering
  • Global deduplication
  • Global search

Data Platform Security

  • Management Plane
    • Role Access – granular control of user access to data
    • Compliance Reporting – centralised compliance reporting
    • Log Monitoring and Audit – Monitor system events, operational tasks, capacity, logs and user events

  • Data Plane
    • Data encryption – FIPS-140 Level 2 HDDs/SSDs protect against even physical theft or breach
    • Data encryption in-flight – data encryption before and after leaving appliance
    • Key management – cryptographic keys protected by Trusted Platform Modules
    • Reference Point-in-time – revert to point-in-time to determine breach or for recovery

Chris G – Demo

OH “Don’t knock on the projector, they’re here now”. “It’s okay, I don’t think he owns it”
Chris W – “Is this kind of fun? Like Asteroids for data protection?”

 

Protection via Policy-driven SLAs

Provide the information (RPO, RTO, etc) and the policy will make it so.
Users consume services and data.

How do you recover an application?

  • One VM?
  • Tier of VMs?
  • A section of the DC?
  • An entire DC?

IO scales linearly (20000 IOPS per box).

Traditionally there’s been a large focus on data ingest.

STO8718-SPO_Data_Ingest

But can I quickly / easily restore the data?
Add Archival Location (S3 / Object Store / NFS) – I like when they can answer a question by jumping into the product and doing a demo.
This was a top session with some great demos. It’s a real treat to sit in sessions where the presenters can answer questions quickly through a demo. 5 stars.

 

 

Paessler have been doing this for a while now


Paessler recently presented at Tech Field Day Extra – VMworld US 2016. My rough notes can be found here. You can see videos from the presentation here and here.

 

What’s a Paessler?

Benjamin Day, Senior Systems Engineer with Paessler took us through some of the background on the company. Founded in 1997 in Nuremberg, Germany, they are 100% owned by founders and employees. The US is their largest market and they tell us that over 70% of Fortune 100 enterprises worldwide use PRTG.

 

What’s a Sensor?

PRTG is often referred to as “MRTG for Windows”. When I say often, I mean it was mentioned by Paessler yesterday. But they also say it on their website. You can get a product overview from here. You can also check out a demo here.

So what are sensors? PRTG is defined (built and licensed) at the sensor level. Pretty much anything you would monitor is a sensor (you can read more on that here). Note also that it’s one sensor, but not one metric (these are known as channels). Generally speaking you can count on using 5-10 sensors per device. Here’s an image I swiped from the Paessler website that kind of shows what sensors look like.

TFDx - Paessler - PRTG Sensor_web

Licences come in lots of 500, 1000, 2500, 5000, and Unlimited. The good thing is that they’re not named, so Christopher doesn’t have to monitor those printers if he really doesn’t want to.
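To get a feel for how the sensor maths plays out, here’s a rough sizing sketch in Python. It only uses the numbers from these notes (5-10 sensors per device, the licence lots above, and the 100-sensor free licence mentioned in the conclusion). The function names are mine, not anything from PRTG, and real sizing will depend on what you actually monitor.

```python
# Licence lots quoted in the notes; None stands in for "Unlimited".
LICENCE_TIERS = [500, 1000, 2500, 5000, None]
FREE_SENSOR_LIMIT = 100  # free licence limit quoted in the conclusion


def estimate_sensors(devices, sensors_per_device=10):
    """Rough sensor count using the 5-10 sensors per device rule of thumb."""
    return devices * sensors_per_device


def smallest_tier(sensor_count):
    """Return the smallest licence lot that covers sensor_count (hypothetical helper)."""
    if sensor_count <= FREE_SENSOR_LIMIT:
        return "Free (100)"
    for tier in LICENCE_TIERS:
        if tier is None:
            return "Unlimited"
        if sensor_count <= tier:
            return str(tier)


if __name__ == "__main__":
    for devices in (8, 120, 700):
        sensors = estimate_sensors(devices)
        print(devices, "devices ->", sensors, "sensors ->", smallest_tier(sensors))
```

Because licences aren’t named, the estimate only has to cover the sensors you choose to deploy, not every device on the network.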
From a notification perspective, there are a bunch of options to get the message out, and you can send things via:

  • Email;
  • SMS (through third-party or IP-enabled SMS gateways);
  • PRTG-enabled smart devices (there’s a mobile app);
  • syslog; and
  • SNMP traps.

There are also options for auto remediation, and you can do things via a script (PowerShell, shell, etc) or, amongst other things, kick off a web action (handy for ticketing systems).

 

Thresholds and Notifications

There are all sorts of things you can do in terms of actions when you exceed thresholds, including:

  • Sending email
  • Sending push notifications (to a user or group, and you can customise the message)

You can modify the format – HTML, text, text with custom content – and customise the priority. You can add an entry to the event log and send an Amazon Simple Notification Service (SNS) message. You might want to assign a ticket as well.

Note also that PRTG is multi-tenant capable, making it an interesting choice for service providers. There’s also an option to “white box” it with your own logo if you’re into that kind of thing. Note that MSP licensing is done in a different fashion to normal licensing.

My favourite thing (besides what seems like a pretty comprehensive monitoring capability and lightweight deployment requirement) is that every sensor has a QR code. And the PRTG app has a QR code scanner (you see where I’m going with this?). You can print out the device QR codes and they’ll come up in PRTG. There’s no longer a requirement to faff about with long labels on hosts. If you’re using per port sensors on your switches, you can put a QR code on the cable.

 

Conclusion

Paessler have been doing this for almost 20 years now. It strikes me that the product seems easy to deploy and use while being fairly powerful and feature-rich. If you’d like to try PRTG out there’s a free license you can use for both personal and commercial use. This is limited to 100 sensors.

If you can monitor it with SNMP (their preference) or WMI, and are happy to use a Windows platform, then PRTG could be the tool for you. I recommend checking them out.

 

VMware – VMworld 2016 – Tuesday General Session Notes


Here’re my rough notes from the VMworld 2016 – General Session – Tuesday. And if you want some blurry photos as well – I can help out with that too.

Sanjay Poonen takes the stage. The world is going digital. This is the fourth industrial revolution (I hadn’t been keeping count, but it seems right). Both high-tech and low-tech (using the tea industry as an example) are being transformed.
Transform the DC. Transform the EUC experience. “Any cloud, any app, any device”.

Tuesday_General_VMware_Vision
50% of the world is still client-server, Windows-based.

Workspace ONE – closing the divide between IT and users.

  • Consumer meets enterprise.
  • “Cloud-first” solutions – for the DC, Horizon, AirWatch, etc.
  • Identity management is key

People pull their phones out of their pockets 90-100 times per day, and use them on average 90-100s.

“Live” Demo overview of Workspace ONE.
Stephanie Buscemi – @sbuscemi (EVP, Product and Solutions Marketing at Salesforce) takes the stage, talks about Salesforce One.

Demo
VMware Horizon

  • Next-Generation user experience
  • true stateless desktops
  • real-time app delivery
  • desktop-as-a-Service
  • Hybrid cloud architecture

Simplicity and security
Customer testimonial video – American Red Cross, Sprint, Mecklenburg County, The Coca-Cola Company.

  • Unified endpoint management is a focus for VMware
  • Aiming to lower cost of ownership for Windows 10 (15-30% lower from $7000 per user)

VMware TrustPoint (partnership with Tanium) – Integration of TrustPoint with AirWatch.
Consumer wants it simple, enterprise wants secure – do it with Workspace ONE
Ray O’Farrell – EVP and CTO – takes the stage and starts talking about the partnership between VMware and customers and partners.

Tuesday_General_VMware_and_You
It’s a balancing act. Cloud native apps require that you embrace a bunch of new technologies.
Kit Colbert then takes the stage to get all cloudy.

  • Modern apps built using containers. Hipsters!
  • Containers are moving to the enterprise

Containers in production can be a bit different to containers in development.

Containers in Development

  • Laptop + Docker

Containers in Production

  • Networking
  • Monitoring
  • Accounting
  • Storage
  • Security
  • Portability
  • Diagnosis
  • Availability
  • Backup
  • Repeatable Deployments
  • Disaster recovery

VMware are focussed on enterprise container infrastructure, so you can run in production with confidence.
vSphere integrated containers

  • for developers
  • docker-compatible solution
  • reuse of native Docker client and existing Docker ecosystem tools
  • Seamless integration into SDLC

For operators

  • familiarity of vSphere
  • No new tooling, training, or technologies
  • full enterprise-class power of the software-defined DC

New features

  • Container engine – Docker Remote API-compatible engine deeply integrated into vSphere, instantiating container images in VMs
  • Container Registry (new) – Enterprise registry for securely storing container images, with built-in RBAC and image replication
  • Container Management Portal (new) – Portal for app teams to manage the container repositories, images, hosts and running container instances

Use vRealize Automation to deliver Container hosts via a catalogue
Customer testimonial video – Otto Group, Banca Popolare di Sondrio

  • available now as OSS – github.com/vmware/vic-product
  • Beta program – vmware.com/vicbeta

Photon Platform

  • supports agile and dynamic DevOps
  • Deep integration into popular app frameworks
  • strong security and multi-tenancy
  • simple, out of the box experience
  • thin layer of API-driven, fully automated management
  • high-scale compute, network, and storage fabric

Photon platform

  • Free tier now available – vmware.github.io/photon-controller
  • coming soon: photon platform with Kubernetes and NSX

Ray O now talking about SDDC

  • Customer testimonial – Nike
  • VMware Integrated OpenStack

SDDC key elements – compute, storage, network

  • vSphere is key to this
  • vRealize
  • NSX and VSAN

Rajiv Ramaswami takes the stage.

  • Networking heading from hardware to software
  • challenges – security, automation, application continuity
  • fun fact: average cost of a data breach – $4 million
  • automation – need to be able to turn apps on instantly (rather than provisioning in weeks or months)

Customer examples (including Citibank)
Jacob from the technical marketing team with NSX on stage

  • Assess
  • Plan
  • Enforce
  • Monitor

Demo
The goals haven’t changed – we’re still after security, availability, speed. Infrastructure, architecture and user behaviour are changing though.
Customer testimonial – Amadeus Data Processing
Yanbing Li takes the stage, talking about HCI powered by VSAN.

  • +5000 unique customers
  • Over 60% using VSAN for business critical apps

Customer testimonial – Amway
Why HCI with VSAN?

  • increase operational efficiency with native vSphere storage
  • achieve 50% savings with all-flash optimisations
  • eliminate IT silos with broader choice: 15 server vendors and 100+ cloud partners

In March they released VSAN 6.2.
The best storage for cross-cloud architecture

  • cloud – containers / big data
  • management – performance analytics and policies
  • security – fully integrated, software-defined encryption

 

Cross-cloud services – data management and governance

  • data governance
  • mobility
  • disaster recovery
  • data protection

Ray back on stage to wrap up

  • VMware Vision (see photo above)
  • learn, engage, embrace
  • “Ready to be tomorrow”

Solid session. 3 stars.

 

VMware – VMworld 2016 – STO7875 – A Day in the Life of a VSAN I/O


Here are my rough notes from “STO7875 – A Day in the Life of a VSAN I/O”, presented by Duncan Epping and John Nicholson (Senior Technical Marketing Manager).

STO7875

 

Agenda

  • Introduction
  • Virtual SAN, what is it?
  • Virtual SAN, a bit of a deeper dive
  • What about failures? (John)
  • etc

 

Introduction

The SDDC – two of the big challenges we’ve had have been networking challenges and storage.

  • “hardware evolution started the infrastructure revolution” – this is a pretty good point, and one that I think is too often overlooked.
  • a lazy admin is the best admin
  • simplicity – operational/management
  • the hypervisor is the strategic high ground (VMware vSphere)

Storage policy-based management provides application-centric automation. The cool thing is this gives you:

  • Intelligent placement
  • Fine control of services at VM level
  • Automation at scale through policy
  • Need new services for VM? change current policy on the fly, attach new policy on the fly

 

Virtual SAN Primer

  • HCI
  • SDS
  • Distributed, scale-out architecture
  • Integrated with vSphere platform
  • Ready for today’s vSphere use cases

Comprised of:

  • Generic x86 hardware
  • Integrated with hypervisor
  • Leveraging local storage resources
  • Exposing a single shared datastore

There are currently over 5000 customers using VSAN. The use cases are increasing as well:

  • Business critical apps
  • End user computing
  • DMZ – I think this is a great idea, given the struggles I’ve had with InfoSec teams and their requirement to keep workloads on discrete storage.
  • DR/DA
  • Test/Dev
  • Management
  • Staging
  • ROBO

VSAN comes in tiered hybrid and all-flash varieties.
all writes and the vast majority of reads are served by flash storage

1. write back buffer (30%) (or 100% in all-flash)

  • writes acknowledged as soon as they are persisted on flash (on all replicas)

2. Read cache (70%)

  • active data set always in flash, hot data replace cold data
  • cache miss – read data from HDD and put in cache

 

VSAN (Deeper)

The VM is treated as a set of objects on VSAN

  • Define a policy first
  • Each object has multiple components – allows you  to meet availability and performance requirements
  • Data is distributed on storage policy
  • Number of failures to tolerate
  • Number of disk stripes per object
  • Fault domains, increasing availability through rack awareness

 

What about failures?

1 host isolated – HA restart

select response as power off the VM, not shutdown

2 hosts partitioned – HA restart

4 hosts partitioned – HA restart

 

VSAN IO flow – write acknowledgement

VSAN mirrors write IOs to all active mirrors

these are acknowledged when they hit the write buffer

[..]

 

Anatomy of a hybrid read

1. guest issues a read on virtual disk

2. owner chooses replica to read from

  • load balance across replicas
  • not necessarily local replica (if one)
  • a block is always read from the same replica

3. At chosen replica, read data from flash read cache or client cache, if present

4. otherwise, read from HDD and place data in flash Read cache – replace cold data

5. return data to owner

6. complete read and return data to VM
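The steps above can be sketched as a toy model in Python. This is very much my own simplification: the class and function names are made up, the "same block, same replica" rule is modelled with a simple modulo, and the client cache and cold-data eviction are left out entirely. It is not how VSAN is implemented, just a way to see the flow.

```python
class Replica:
    """One copy of an object: magnetic disk plus a flash read cache."""

    def __init__(self, name, hdd_blocks):
        self.name = name
        self.hdd = dict(hdd_blocks)  # block number -> data on HDD
        self.read_cache = {}         # block number -> data held in flash
        self.cache_hits = 0
        self.cache_misses = 0

    def read(self, block):
        # Step 3: serve from the flash read cache if the block is present.
        if block in self.read_cache:
            self.cache_hits += 1
            return self.read_cache[block]
        # Step 4: otherwise read from HDD and place the data in the cache.
        # (Evicting cold data is omitted from this sketch.)
        self.cache_misses += 1
        data = self.hdd[block]
        self.read_cache[block] = data
        return data


def choose_replica(replicas, block):
    # Step 2: balance across replicas, but keep a given block on the same
    # replica every time. The modulo rule here is an assumption.
    return replicas[block % len(replicas)]


def owner_read(replicas, block):
    # Steps 5 and 6: the chosen replica returns data to the owner,
    # which completes the read for the VM.
    return choose_replica(replicas, block).read(block)


if __name__ == "__main__":
    blocks = {n: f"data-{n}".encode() for n in range(8)}
    replicas = [Replica("host-a", blocks), Replica("host-b", blocks)]
    owner_read(replicas, 3)  # miss, populates host-b's cache
    owner_read(replicas, 3)  # hit on host-b
    print(replicas[1].cache_misses, replicas[1].cache_hits)
```

Pinning a block to one replica means each block is cached in flash once rather than on every replica, which is presumably why the read isn’t always served locally.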

 

Anatomy of an all-flash read

1. guest OS issues a read on virtual disk

2. Owner chooses replica to read from

  • load balance across replicas
  • not necessarily local replica (if one)
  • a block is always read from the same replica

3. at chosen replica, read data from (write) flash Cache or client cache, if present

4. Otherwise, read from capacity flash device

5. return data to owner

6. complete read and return data to VM

 

Client cache

  • Always local
  • Up to 1GB of memory per host
  • Memory latency < network latency
  • Horizon 7 testing – 75% fewer read IOPS, 25% better latency
  • Complements Content Based Read Cache (CBRC)
  • Enabled by default in 6.2

 

Anatomy of checksum

1. guest OS issues a write on virtual disk

2. host generates checksum before it leaves host

3. transferred over network

4. checksum verified on host where it will write to disk

5. ACK is returned to the VM

6. on read the checksum is verified by the host with the VM. If any component fails it is repaired from the other copy or parity.

7. scrubs of cold data are performed once a year (this is adjustable)
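Here’s a minimal Python sketch of that checksum flow: generate at the source, verify at the destination, and repair from the other copy on read. I’m using zlib.crc32 purely as a stand-in checksum, and modelling replicas as plain dictionaries; both are my assumptions, as are the function names.

```python
import zlib


def checksum(data):
    """Stand-in checksum; the algorithm VSAN actually uses is not modelled here."""
    return zlib.crc32(data)


def write_block(data, replicas):
    """Steps 1-5: checksum at the source, verify at each destination, then ACK."""
    crc = checksum(data)  # generated before the data leaves the host
    for replica in replicas:
        # In this in-process sketch the "network transfer" cannot corrupt
        # anything, so the verification always passes.
        if checksum(data) != crc:
            raise IOError("checksum mismatch on arrival")
        replica["data"] = data
        replica["crc"] = crc
    return True  # ACK returned to the VM


def read_block(replicas):
    """Step 6: verify on read and repair any bad copy from a good one."""
    for replica in replicas:
        if checksum(replica["data"]) == replica["crc"]:
            good = replica["data"]
            for other in replicas:
                if checksum(other["data"]) != other["crc"]:
                    other["data"] = good  # repair from the other copy
            return good
    raise IOError("no valid copy available")


if __name__ == "__main__":
    copies = [{}, {}]
    write_block(b"hello vsan", copies)
    copies[0]["data"] = b"jello vsan"  # simulate silent corruption
    print(read_block(copies), copies[0]["data"])
```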

 

Deduplication and compression for space efficiency

  • deduplication and compression per disk group level
    • enabled on a cluster level
    • fixed block length deduplication (4KB blocks)
  • compression after deduplication
    • LZ4 is used, low CPU
    • single feature, no schedules required
    • file system stripes all IO across disk group

 

Deduplication and compression disk group stripes

  • deduplication and compression per disk group level
    • data stripes across the disk group
  • Fault domain isolated to disk group
    • fault of device leads to rebuild of disk group
    • stripes reduce hotspots
    • endurance/throughput impact

 

Deduplication and compression (IO path)

  • avoids inline or post process downsides
  • performed at disk group level
  • 4KB fixed block
  • LZ4 compression after deduplication

1. VM issues write

2. Write acknowledged by cache

3. cold data to memory

4. deduplication

5. compression

6. data written to capacity

 

Deduplication process (all-flash only)

  • SHA-1 is Fast
  • Hash map not fully in memory
  • Avoids fragmentation

1. VM issues write

2. write acknowledged by cache

3. cold data to memory

4. deduplication

5. compression

6. data written to capacity

Compression process (all-flash only)

  • LZ4 is fast
  • avoids compressing duplicate data
  • uncompressible data?
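
To make the ordering concrete, here is a rough sketch of destaging with fixed 4KB blocks, SHA-1 for the deduplication hash, and compression applied only to unique blocks. LZ4 isn't in the Python standard library, so `zlib` stands in for it here. The `destage` and `read_back` names and the in-memory `store` dict are assumptions for illustration.

```python
from __future__ import annotations

import hashlib
import zlib

BLOCK_SIZE = 4096  # 4KB fixed blocks, as in the notes


def destage(data: bytes, store: dict) -> list[str]:
    """Deduplicate, then compress, data leaving the cache tier.

    `store` maps SHA-1 digest -> compressed block and stands in for the
    capacity tier. Returns the list of digests needed to rebuild `data`.
    """
    recipe = []
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha1(block).hexdigest()
        if digest not in store:
            # Only unique blocks reach the compressor, so duplicates cost no extra CPU.
            store[digest] = zlib.compress(block)
        recipe.append(digest)
    return recipe


def read_back(recipe: list[str], store: dict) -> bytes:
    """Rebuild the original data from the stored unique blocks."""
    return b"".join(zlib.decompress(store[digest]) for digest in recipe)
```

Four blocks where three are identical end up as two stored blocks, and `read_back` returns the original bytes.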

 

RAID 5/6

  • can be turned on on a per component basis

 

RAID 5 inline erasure coding

  • When Number of Failures to Tolerate = 1 and failure Tolerance Method = Capacity -> RAID 5
    • 3+1 (4 host minimum)
    • 1.33x  instead of 2x overhead (20GB disk consumes 40GB with RAID 1, now consumes ~27GB with RAID 5)
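
The overhead figures are easy to check. A small worked example, where the function name and parameters are my own:

```python
def raw_capacity_gb(vmdk_gb: float, data_parts: int, redundancy_parts: int) -> float:
    """Raw capacity consumed by a layout of data_parts + redundancy_parts."""
    return vmdk_gb * (data_parts + redundancy_parts) / data_parts


raid1 = raw_capacity_gb(20, data_parts=1, redundancy_parts=1)  # full mirror, 2x
raid5 = raw_capacity_gb(20, data_parts=3, redundancy_parts=1)  # 3+1, about 1.33x

print(f"RAID 1: {raid1:.0f}GB, RAID 5: {raid5:.1f}GB")
```

The 3+1 layout stores one parity part for every three data parts, which is where the 4/3 multiplier and the 4 host minimum come from.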

 

Swap placement

(new in 6.2)

  • Sparse Swap
  • reclaim space used by memory swap
  • host advanced option enables setting

How to set it?

  • esxcfg-advcfg -s 1 /VSAN/SwapThickProvisionDisabled (use -g in place of -s 1 to check the current value)
  • https://github.com/jasemccarty/SparseSwap

 

Snapshots for VSAN

(new in 6.0)

  • Not using VMFS redo logs
  • writes allocated into 4MB allocations
  • snapshot metadata cache (avoids read amplification)
  • Performs pre-fetch of metadata cache
  • Maximum 31

 

Wrapping Up

VMware recently launched a new portal for Storage and Availability technical documents. You should also check out the Virtually Speaking podcast.

Good session – 4.5 stars

 

 

VMware – VMworld 2016 – STO7914 – Revamped vSphere Storage DRS and SIOC for automating the Data Centers

Disclaimer: I recently attended VMworld 2016 – US.  My flights were paid for by myself, VMware provided me with a free pass to the conference and various bits of swag, and Tech Field Day picked up my hotel costs. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Here’re my rough notes from “STO7914 – Revamped vSphere Storage DRS and SIOC for automating the Data Centers”, presented by Naveen Nagaraj and Ben Meadowcroft.

 

Agenda

  • Typical storage management related questions
  • SDRS/SIOC by the numbers
  • technical deep dive
  • What are we working on?
  • best practice guidelines
  • advanced options
  • Q&A

 

Typical Questions

  • Backup jobs – what’s the impact on production?
  • which is efficient – thin over thin? thick over thin?
  • SDRS and FAST?
  • Datastore maintenance mode – how to avoid overwhelming the array controller
  • SIOC – does it throttle svMotion? lower latency with AFA? vMSC?

 

SDRS/SIOC by the numbers

  • 96% svMotion
  • 58% SDRS
  • 38% SIOC – tedious, painstaking, error-prone – working to make this easier to manage through policies

SDRS – 68% fully automated, 32% manual

This is similar to where DRS was at the same stage of its product lifecycle in terms of maturity and automation.

Popular features

  • 100% space load balancing
  • 92% datastore maintenance mode
  • 77% IO Load balancing

Cluster sizes

  • Maximum supported? 64
  • 50% 32-148 datastores per POD
  • interoperability tested up to 64, not higher
  • 30% 16-31
  • 20% 4-15

Default threshold space utilisation

95% stick with 80% (default)

 

Technical Deep Dive

Affinity and anti-affinity rules

  • initial placement – VMDKs together or separate
  • load balancing
  • Add disk to VM? – prerequisite moves, entire collection is moved
  • maintenance mode – mandatory action (marching orders – have to evacuate VMDKs) – generates faults (you can override affinity/anti-affinity rules)

Growth rate and data store correlation

  • SDRS constantly tracks VMDK growth
  • leverages it for both initial placement and load balancing

Correlation

  • SDRS figures out if 2 or more datastores share the same spindles
  • 2 approaches – IO modelling, VASA

Storage DRS is now aware of storage capabilities through VASA 2.0

  • array-based thin-provisioning
  • array-based deduplication
  • array-based auto-tiering
  • array-based snapshot

Thin provisioned datastores

  • visibility into back-end pool utilisation (capacity)

Deduplication capability

  • dedupe pool spans across multiple datastores
  • datastore appears to store more data than capacity
  • SDRS uses free space in datastore rather than just computing sum of VMDK capacity
  • VASA integration allows SDRS to know the mapping of datastores to dedupe pools

FAST Arrays

  • multiple storage tiers
  • VM across tiers
  • tier use changes with workload

SDRS becomes very conservative – lets the array deal with SLA guarantees

  • array threshold < SDRS threshold < SIOC threshold
  • space load balancing
  • rule enforcement
  • maintenance mode

 

What are we working on?

Today

SIOC Performance controls – reservations, shares and limits

  • reservations – minimum guaranteed IOPS per VM
  • limit – maximum IOPS allowed per VM
  • shares – relative importance of VMs during contention
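
To show how the three controls interact under contention, here is a simplified sketch. It hands out a datastore's IOPS in proportion to shares, never going below a VM's reservation or above its limit. This is not VMware's scheduler. The `VmIoPolicy` class, `allocate_iops` and the bisection approach are all assumptions, and the sketch expects every VM to have shares > 0 and limit >= reservation.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class VmIoPolicy:
    name: str
    shares: int         # relative importance during contention
    reservation: float  # minimum guaranteed IOPS
    limit: float        # maximum IOPS allowed


def _clamp(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


def allocate_iops(vms: list[VmIoPolicy], capacity: float) -> dict[str, float]:
    """Split `capacity` IOPS by shares, clamped to [reservation, limit] per VM."""
    if sum(vm.reservation for vm in vms) > capacity:
        raise ValueError("reservations exceed available IOPS")
    # There is no point handing out more than the sum of all limits.
    target = min(capacity, sum(vm.limit for vm in vms))

    def handed_out(level: float) -> float:
        return sum(_clamp(level * vm.shares, vm.reservation, vm.limit) for vm in vms)

    # Bisect on "IOPS per share" until the clamped allocations add up to the target.
    low, high = 0.0, max(vm.limit / vm.shares for vm in vms)
    for _ in range(100):
        mid = (low + high) / 2
        if handed_out(mid) < target:
            low = mid
        else:
            high = mid
    return {vm.name: _clamp(high * vm.shares, vm.reservation, vm.limit) for vm in vms}
```

With 10,000 IOPS available and three VMs holding 2000, 1000 and 1000 shares, capping the third VM at 1,500 IOPS leaves the surplus to be split 2:1 between the other two.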

Storage policy overview

A way to describe storage requirements

  • user-defined tags, default VMware system tags, VASA – storage vendor published capabilities

They can be associated at the VM level or the VMDK level: define the policy, then associate it with the VM or VMDK.

Need more IOPS for an app? Go to your policy and change it as required.

Adding SDRS policy (composition)

  • one policy to rule them all
  • Tag datastore (AFA, Gold, Silver)
  • Then use those tags in your policy

SDRS with Policy

  • initial placement
  • compliance alert
  • remediation
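
A rough sketch of how tag-based policies could drive placement, compliance checks and remediation. Everything here is hypothetical: the function names, the data shapes and the pick-first-candidate rule. The real SDRS would also weigh space and IO load when choosing a target.

```python
from __future__ import annotations


def candidates(datastores: dict[str, set[str]], required: set[str]) -> list[str]:
    """Initial placement: datastores whose tags satisfy the policy."""
    return sorted(name for name, tags in datastores.items() if required <= tags)


def non_compliant(placements: dict[str, str], policies: dict[str, set[str]],
                  datastores: dict[str, set[str]]) -> list[str]:
    """Compliance alert: VMs sitting on a datastore that lacks a required tag."""
    return sorted(vm for vm, ds in placements.items()
                  if not policies[vm] <= datastores[ds])


def remediation_plan(placements: dict[str, str], policies: dict[str, set[str]],
                     datastores: dict[str, set[str]]) -> dict[str, str]:
    """Remediation: map each non-compliant VM to a compliant target, if one exists."""
    plan = {}
    for vm in non_compliant(placements, policies, datastores):
        targets = candidates(datastores, policies[vm])
        if targets:
            plan[vm] = targets[0]
    return plan
```

For example, a VM whose policy requires the tags AFA and Gold but which lives on a Silver datastore would be flagged, then mapped to the first datastore carrying both tags.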

 

Best Practice Guidelines

Homogeneous datastores in the POD

  • type (VMFS/NFS)
  • performance characteristics

Full connectivity between hosts and datastores

  • maximum flexibility for initial placement
  • optimal for Mmode operations

Do not mix virtualised and non-virtualised IO workload

  • SDRS/SIOC IO-modelling will be inaccurate
  • hard to ensure SLA guarantees

latency vs throughput

  • setting very low latency will impact IOPS
  • strike the right balance between throughput needs and latency expectations

SIOC and vMSC

  • vMSC and SIOC interop is very deployment specific
  • latency between sites and LUN config (RW/RO) impacts IO modelling
  • throttling IO queue may or may not help due to interference from WAN latency

 

Advanced Config Options

EnforceCorrelationForAffinity

Use datastore correlation while enforcing/fixing anti-affinity rules

  • 0 – disabled
  • 1 – soft enforcement
  • 2 – hard enforcement

EnforceStorageProfiles

Storage profile requirements are enforced during initial placement and load-balancing

  • false – best effort, relaxed enforcement
  • true – strict enforcement both during IP and LB

MaxConcurrentIOMoves

number of concurrent svMotions per DS during LB and Mmode

adjust only if the storage controller is not able to keep up

  • default – 3
  • min value – 1
  • max value – 8
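
If you script these settings, it helps to validate the values before pushing them anywhere. A minimal sketch using the ranges from the session. The class and field names are my own, and how the options actually get applied to a datastore cluster is out of scope here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SdrsAdvancedOptions:
    enforce_correlation_for_affinity: int  # 0 disabled, 1 soft, 2 hard
    enforce_storage_profiles: bool         # False = best effort, True = strict
    max_concurrent_io_moves: int = 3       # default 3, valid range 1 to 8

    def __post_init__(self):
        if self.enforce_correlation_for_affinity not in (0, 1, 2):
            raise ValueError("EnforceCorrelationForAffinity must be 0, 1 or 2")
        if not 1 <= self.max_concurrent_io_moves <= 8:
            raise ValueError("MaxConcurrentIOMoves must be between 1 and 8")

    def as_key_values(self) -> dict:
        """Render the options under the names used in the session."""
        return {
            "EnforceCorrelationForAffinity": self.enforce_correlation_for_affinity,
            "EnforceStorageProfiles": self.enforce_storage_profiles,
            "MaxConcurrentIOMoves": self.max_concurrent_io_moves,
        }
```

Failing early on an out-of-range MaxConcurrentIOMoves is cheaper than discovering it after a maintenance mode operation has started.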

 

And that’s about it. Solid session. 3.5 stars.

VMware – VMworld 2016 – INF8260 – Automated Deployment and Configuration of the vCenter Server Appliance

Disclaimer: I recently attended VMworld 2016 – US.  My flights were paid for by myself, VMware provided me with a free pass to the conference and various bits of swag, and Tech Field Day picked up my hotel costs. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

These are my rough notes from “INF8260 – Automated Deployment and Configuration of the vCenter Server Appliance” presented by Alan Renouf and William Lam.

Agenda

  • Speaker intro
  • vCenter Server Appliance overview
  • vCSA Deployment
  • vCSA Migration
  • vCSA Configuration
  • vCSA API Tech Preview
  • Takeaway

 

Speaker Intro

@alanrenouf – http://virtu-al.net

@lamw – http://www.virtuallyghetto.com

 

Overview

6.0 – feature parity with Windows vCenter Server

What about update manager? 6.0 U1 has integration with web client, still need windows … for the moment.

Distributed as an ISO

  • vCSA OVA
  • additional tools for deployment

UI supported on Windows and Linux

CLI on windows, linux and Mac OS X
directory – vcsa-cli-installer

sample templates provided

./vcsa-deploy

-h (for help)

--verify-only

 

vCSA Deployment

CLI based

  • JSON configuration file
  • JSON editors (William uses Atom)

demo
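
Since the deployment is driven by a JSON file, a small pre-flight check can catch a malformed template before the installer runs. This sketch only builds the command line and doesn't execute it. The helper name is hypothetical, and passing the template path as the last argument is an assumption on my part, so check `./vcsa-deploy -h` for the exact syntax on your build.

```python
from __future__ import annotations

import json
from pathlib import Path


def build_verify_command(template_path: str,
                         installer: str = "./vcsa-deploy") -> list[str]:
    """Check the template is well-formed JSON, then return argv for a dry run."""
    text = Path(template_path).read_text(encoding="utf-8")
    json.loads(text)  # raises json.JSONDecodeError if the template is malformed
    # Assumed argument order: options first, then the template path.
    return [installer, "--verify-only", template_path]
```

The returned list can be handed to `subprocess.run` from the vcsa-cli-installer directory once you're happy with it.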

 

vCSA Migration (Tech Preview)

  • released as a fling in 2015
  • there will be an official vCSA migration tool
  • migrates from windows vCenter 5.5 to vCSA 6.0u2
    • supports physical or virtual
    • supports sql server (full, express), oracle
    • supports 5.5 deployment topologies
    • available to consume via CLI/UI

JSON file to specify migration path

how it works – demo video

 

vCSA Configuration

appliancesh

Core plugins

Extension plugins

help api list

extends appliancesh with existing shell/utility commands

demo

 

vCSA API Tech Preview

  • appliancesh only available via ssh
  • limited remote automation options
  • limited powercli automation of appliance management
  • limited multi-platform cli automation of appliance management
  • no SDKs for programmatic configuration of appliance management

What they’re working towards:

  • REST based API for app management configuration
  • API explorer for try it now
  • full coverage, easy to use documentation
  • remote access powercli cmdlets for all API functionality
  • easy to use multi-platform for all API functionality
  • multiple SDKs for all API functionality

 

Takeaways

  • vCSA is production-ready
  • entire lifecycle of vCSA can be automated
  • VMware will give you a choice in dev and automation interfaces for appliance mgmt in the future

Top session. Great demos. 4.5 stars.

 

VMware – VMworld 2016 – Monday General Session Notes

Disclaimer: I recently attended VMworld 2016 – US.  My flights were paid for by myself, VMware provided me with a free pass to the conference and various bits of swag, and Tech Field Day picked up my hotel costs. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Here’re the notes from the VMworld 2016 – General Session – Monday. And if you want some blurry photos as well – I’ve got you covered.
“How will tomorrow arrive?”
Pat Gelsinger (CEO VMware) took the stage.

Monday_General_VMware_Gelsinger

23000 attendees? Facing forward – with you to the future

21 ppl have attended every VMworld – lifetime passes for them and spouses

“digital transformation”. Digital vs traditional business. All business is digital business – there’s no longer a distinction. So what’s the difference between leaders? Culture and tech strategy.

2006 – What happened?

Pluto lost its status as a planet, and Eric Schmidt walked on stage and talked about stuff being “in a cloud somewhere”.

By the numbers:

  • 2006 – 29 million workloads – 98% traditional IT, 2% (mainly salesforce.com)
  • 2011 – 80 million workloads – 7% public cloud, 6% private cloud
  • 2016 – 160 m workloads, 15% public cloud, 12% private, 73% traditional
  • 2021 – Predict that 50% in the cloud

“let’s click a little bit more into that” – I’m really not a fan of that phrase.

2030 – public cloud becomes more than 50% of workloads. “we have much work to do”

Hosting in 2016 is a $60 billion market, with an 18% growth rate. There’s a big shift from DIY to service provider DCs. Device proliferation, however, is unevenly distributed.

  • IoT is “about to explode” – 4.5x devices over the next 5 years to 18 million devices
  • 2019 – more machine driven than human driven devices connected to the internet

As cloud takes root IT becomes more cost effective and more accessible

Which vertical has embraced cloud most aggressively? Counting down from 10 – 1.

  • Construction;
  • Professional Services;
  • Securities and investments;
  • Insurance;
  • Transportation;
  • Manufacturing;
  • Banking;
  • Resources;
  • Communications;
  • Technology vendors (1)

Every function in every industry is embracing cloud – now calling it “self-starting IT”.

Gelsinger states that traditional systems of IT are doomed to fail. It’s a period of traumatic change – freedom and control.

VMware spoke about the software-defined DC (announced in 2011), with NSX and VSAN in the tornado phase of adoption. We then saw some Customer Videos from Zebra Technologies, iGov, SAIC.

Gelsinger then spoke about vCloud Air Service and vCloud Air Network, and the introduction of vCloud Air Hybrid Cloud Manager. Clearly there was a need for a Cross Cloud Architecture. This was described as “Like having a teenager you love and like”. This is an analogy VMware needs to stop using.

The idea behind VMware Cloud Foundation and Cross-Cloud Services is to make the private cloud easy.

Monday_General_VMware_CloudFoundation

Gelsinger also announced VMware Cloud Foundation as a service, with IBM being the first partner on board.

Robert Leblanc – SVP Cloud at IBM joins Gelsinger on stage.

  • higher quality, lower cost, less time, more secure, at a global scale
  • 500+ customers on the platform (including Telstra)

Alan Rosa, SVP tech delivery and IT security at Marriott, also joins them on stage.

Some handy Cloud Foundation and Cross-cloud Services articles can be found here:

 

Guido Appenzeller (@appenz) then takes the stage.

What does it mean to be in IT in the age of “Mega clouds”?

Motti Finkelstein from Citibank talks about the need to ramp resources and ramp them down when the need arises.

  • complexity?
  • Your apps need to be architected for bursting, need APIs, need to be able to do this across multiple platforms
  • you have to be able to bifurcate the workloads to the various providers
  • security and compliance?

Monday_General_VMware_Cross-cloudServices

John Spiegel from Columbia Sportswear comes on stage.

NSX for public cloud workloads

Josh Warsop – Johnson and Johnson

  • “Borderless data centre”
  • Leverage your existing skills to get your infra into the public cloud

Jim Fowler, CIO, GE (Video)

Pat Gelsinger back on stage

  • Manage and secure apps across public and private clouds
  • Any cloud, any application, any device

Michael Dell takes the stage

Thoughts on the ecosystem changing moving forward?

“The open ecosystem of VMware is critical to its success”

“A big priority for us is making private clouds easy”

Not a bad session. 3.5 stars.

VMware – VMworld 2016 – See you in Vegas

VMworld-2016

This is a quick post to let my loyal readers know that I’ll be heading to VMware’s annual conference (VMworld) this year in Las Vegas. This will be my second VMworld and first time in Vegas. I’m looking forward to catching up with some old friends and meeting some new ones. If you haven’t registered yet but feel like that’s something you might want to do – the registration page is here. To get a feel for what’s on offer, you can check out the VMworld 2016 Content Catalog here. Yes, no, that’s just how they spell catalogue.

Big thanks to Corey at VMware for organising the blogger pass. I’ll also be publicly thanking some other folks when I have some more logistics locked in. Incidentally, if any companies want to chip in for my flights I’m sure I can arrange some kind of exposure in return – just let me know. Keep an eye out for me at the conference and surrounding events and don’t be afraid to come and say hi (if you need a visual – I look like Wolverine would if he let himself go).