Storage Field Day Exclusive at Pure//Accelerate 2017 – Purity Update

Disclaimer: I recently attended Storage Field Day 13.  My flights, accommodation and other expenses were paid for by Tech Field Day and Pure Storage. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

 

These are my rough notes from a session I attended on “Day 0” of Pure//Accelerate 2017 (aka Storage Field Day Exclusive at Pure//Accelerate 2017). Videos of the session can be found here and you can grab my raw notes from here. I try to avoid dumping a bunch of dot points in Tech Field Day posts, but as this one covered some key announcements, I thought most of the information was useful presented as is.

 

ActiveCluster

Tabriz Holtz and Larry Touchette took us through an overview of ActiveCluster.

 

What do customers really want?

  • Disaster protection
  • Consistent performance
  • Transparent failover
  • Ease of management
  • Subscription to innovation

They don’t want

  • Life to be difficult because they’re running synchronous replication
  • To pay for it

 

Multi-site Active/Active

  • Zero RPO
  • Zero RTO
  • Zero $
  • Zero Additional Hardware

 

Basic Architecture

  • Symmetric Active/Active
  • Pure’s pod management model – a container where you store your volumes
  • Passive Pure1 Cloud Mediator – prevents split brain

 

Pods

  • Simple management model
  • only 1 new command introduced
  • serves as a container and a consistency group
  • keeps metadata with its data
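
To make the pod model a little more concrete, here’s a rough sketch of how I think about it – a named container whose member volumes are stretched, replicated and failed over together as one consistency group. This is purely my own illustration (all names made up), not Pure’s implementation.

# Conceptual sketch only - a pod as a container plus consistency group.
class Pod:
    """A pod groups volumes so they replicate and fail over together."""

    def __init__(self, name):
        self.name = name
        self.arrays = []     # arrays the pod is stretched across
        self.volumes = {}    # volume name -> metadata kept with the data

    def stretch(self, array_name):
        # Stretching a pod to a second array is what enables active/active.
        self.arrays.append(array_name)

    def add_volume(self, vol_name, size):
        # The pod keeps metadata with its data, so it travels with the pod.
        self.volumes[vol_name] = {"size": size, "pod": self.name}


pod = Pod("pod1")
pod.stretch("arrayA")
pod.stretch("arrayB")
pod.add_volume("vol1", "1T")
print(pod.arrays, list(pod.volumes))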

 

4 Steps to Setup

The whole point is that it’s super simple to set up, so much so that you can do it in four steps from the CLI.

1. Connect the arrays

purearray connect --type sync-replication

2. Create a stretched pod

purepod create pod1
purepod add --array arrayB pod1

3. Create a volume

purevol create --size 1T pod1::vol1

4. Connect your hosts

purehost create --preferred-array arrayA host
purehost connect --vol pod1::vol1 host

 

Symmetric Active/Active

I/O is performed symmetrically:

  • 1 round trip for writes
  • reads serviced locally

Host ALUA preferences:

  • Active/Optimised
  • Active/Non-optimised

There’s a 5ms RTT limit between the arrays, and replication runs over TCP/IP (Ethernet). Independent dedupe runs on both sides.
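
To make the “one round trip for writes, reads serviced locally” behaviour concrete, here’s a rough sketch of the I/O path as I understand it. This is my own simplification, not Pure’s code.

# Conceptual sketch of the symmetric active/active I/O path - illustrative only.
PEER_RTT_MS = 5  # ActiveCluster supports up to 5ms round-trip between arrays


def write(local_array, peer_array, volume, data):
    """A write is persisted locally, mirrored to the peer in a single round
    trip, and only acknowledged once both copies are in place."""
    local_array[volume] = data   # persist locally
    peer_array[volume] = data    # one round trip to the peer (up to PEER_RTT_MS)
    return "ack"                 # host sees local latency plus one RTT


def read(local_array, volume):
    """Reads are serviced from the local array - no trip to the peer."""
    return local_array[volume]


array_a, array_b = {}, {}
write(array_a, array_b, "pod1::vol1", b"block-0")
assert read(array_b, "pod1::vol1") == b"block-0"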

 

Passive Mediator

  • No split brain … ever!
  • Intelligence is in the arrays
  • Mediator simply records the failover
  • No third site needed
  • Arrays alert if they can’t access mediator

There is also the option to deploy a VM that can be used on-premises. While the cloud mediator runs multiple instances behind a load balancer, the on-premises mediator would have to be protected with HA or similar.

So what happens if an array loses comms to the outside world (and thus the mediator) as well as to its partner array? The volumes in that pod will be taken offline. The mediator is a per-pod setting, so you could conceivably use both the cloud and on-premises mediators in your environment.
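
Here’s a rough sketch of the arbitration behaviour described above, as I understand it: the first array to reach the mediator keeps the pod online, and an array that can reach neither the mediator nor its partner takes its volumes offline. Purely illustrative, not Pure’s implementation.

# Conceptual sketch of mediator arbitration - illustrative only.
class Mediator:
    """Passive tie-breaker: it simply records which array won the race."""

    def __init__(self):
        self.winner_per_pod = {}

    def request_failover(self, pod, array):
        # First array to ask wins; the decision is merely recorded here -
        # the intelligence stays in the arrays themselves.
        return self.winner_per_pod.setdefault(pod, array) == array


def on_replication_link_failure(array, pod, mediator_reachable, mediator):
    if mediator_reachable:
        if mediator.request_failover(pod, array):
            return "pod stays online on " + array
        return "pod offline on " + array + " (partner won)"
    # Can't reach the partner or the mediator: take the volumes offline
    # rather than risk split brain.
    return "volumes offline on " + array


m = Mediator()
print(on_replication_link_failure("arrayA", "pod1", True, m))   # wins the race
print(on_replication_link_failure("arrayB", "pod1", True, m))   # loses the race
print(on_replication_link_failure("arrayB", "pod2", False, m))  # no mediator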

 

Transparent Recovery

1. Snapshots are sent asynchronously until the arrays are nearly in sync

2. I/Os are then forwarded synchronously while the final snapshot is merged into the target

3. The arrays are fully in sync, with no pause in I/O for the final sync
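
A rough sketch of that resynchronisation sequence, as I understand it (my own illustration, not Pure’s implementation):

# Conceptual sketch of transparent recovery - illustrative only.
class Array:
    def __init__(self, blocks=None):
        self.blocks = dict(blocks or {})

    def delta_to(self, other):
        # Blocks this array holds that the other array doesn't yet have.
        return len(set(self.blocks.items()) - set(other.blocks.items()))


def resync(source, target, catch_up_threshold=2):
    """Bring a recovered array back into sync without pausing host I/O."""
    # 1. Ship snapshots asynchronously until the arrays are nearly in sync.
    while source.delta_to(target) > catch_up_threshold:
        target.blocks.update(source.blocks)   # apply an async snapshot

    # 2. New writes would now be forwarded synchronously while the final
    #    snapshot is merged into the target (modelled here as one last merge).
    target.blocks.update(source.blocks)

    # 3. The arrays are fully in sync - host I/O never paused.
    return source.delta_to(target) == 0


src = Array({i: "block-%d" % i for i in range(10)})
tgt = Array({i: "block-%d" % i for i in range(4)})
print(resync(src, tgt))   # True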

The goal is to allow different arrays to replicate with each other, and the supported combinations will be qualified at GA, with Purity versions able to differ by one minor release (e.g. 5.1 and 5.2). Every customer gets this feature (yay, Evergreen), assuming you have the appropriate supporting infrastructure, and there’s support for multiple connection types.

 

DirectFlash Shelf

Pete Kirkpatrick (Chief Hardware Architect) spoke about the recently announced FlashArray//X.

  • 100% NVMe Enterprise AFA
  • “This is just a flash array”
  • The data is the array; the hardware and software come and go over time

Pure started working on the FlashArray//M chassis about four years ago and went to some lengths to future-proof the design. “DirectFlash Modules” are now replacing the SAS SSDs. SSDs emulate HDDs, and this isn’t ideal.

 

Flash Translation Layer needs Garbage Collection

  • Severely limits sustained throughput
  • Destroys latency distribution
  • Causes excess wear
  • Needs over provisioned capacity
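
As a rough worked example of why that hurts (my numbers, purely illustrative): if garbage collection has to rewrite valid pages in order to reclaim blocks, every host write turns into multiple flash writes.

# Illustrative arithmetic only - these are made-up numbers, not anyone's specs.
host_writes_gb = 100   # what the host actually wrote
gc_rewrites_gb = 150   # valid pages the FTL's GC moved to reclaim blocks

write_amplification = (host_writes_gb + gc_rewrites_gb) / host_writes_gb
print("write amplification: %.1fx" % write_amplification)   # 2.5x

# 2.5x the writes means roughly 2.5x the wear and a corresponding hit to
# sustained throughput and latency - which is why removing the FTL (and
# letting Purity place data against the raw flash geometry) matters.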

 

Legacy protocols and interfaces

  • Assumed high latency
  • Inherently serialised

 

DirectFlash

Purity has always been designed for Flash, so the aim was to get the disk legacy out of the way. The goal of DirectFlash was to:

  • Start with NVMe: efficient, low latency, high throughput, high parallelism
  • Design an API providing knowledge of the Flash geometry
  • Place data intelligently, and schedule operations with high precision
  • No FTL is required, so GC is eliminated
  • High sustained throughput and low, deterministic latency
  • No over provisioning
  • Extended flash endurance

Here’s a happy snap of one of the 18.3TB (I think) modules.

You can now fit 1PB of useable Flash in 3RU. The DirectFlash Shelf is this week’s news; it connects over NVMe over Fabrics (NVMe-oF) running on RoCE (RDMA over Converged Ethernet). You can start with one shelf at this stage, but Pure are looking to extend that capability.

 

VVols Support

Cody Hosterman took us through VMware Virtual Volumes (VVols) support with Purity//FA 5.0. I enjoy watching Cody present and I wasn’t disappointed by this session. So, VVols eh? Why?

  • Virtual disk granularity on array – use array-based technology on a virtual disk basis
  • Automatic volume creation and configuration
  • Storage Policy Based Provisioning

 

Virtual Volumes – The Full Picture

[image via Pure Storage]

 

VVols

Every VM has individual volumes on the array:

  • Config VVol – 4GB, holds the configuration information of the VM. Created automatically when a VM is provisioned
  • Swap VVol – holds the VM swap file and is sized according to the VM’s memory. Created automatically when the VM is powered on and deleted when it’s powered off
  • Data VVol – one for every virtual disk added to the virtual machine, sized to the requested size of the virtual disk
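
In other words, a single VM maps to a small set of volumes on the array. A rough sketch of that layout, using the sizing rules described above and made-up names:

# Conceptual sketch of the VVols one VM ends up with - illustrative only.
def vvols_for_vm(name, memory_gb, disk_sizes_gb, powered_on=True):
    vvols = [{"type": "config", "name": name + "-config", "size_gb": 4}]
    if powered_on:
        # The swap VVol only exists while the VM is powered on.
        vvols.append({"type": "swap", "name": name + "-swap",
                      "size_gb": memory_gb})
    for i, size in enumerate(disk_sizes_gb):
        # One data VVol per virtual disk, sized as requested.
        vvols.append({"type": "data", "name": "%s-data-%d" % (name, i),
                      "size_gb": size})
    return vvols


for v in vvols_for_vm("web01", memory_gb=16, disk_sizes_gb=[40, 200]):
    print(v)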

 

VVol Snapshots

  • A VMware snapshot and an array snapshot are created automatically – no performance penalty

 

The Data Plane

Protocol Endpoints

  • A mount-point for VVols
  • Presented in a traditional fashion via iSCSI or FC as a LUN
  • VVols are sub-LUNs behind a PE – I/O goes to the PE on the array and the array distributes it to the right VVol
  • FlashArray automatically “binds” VVols to the appropriate PE
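
Conceptually, the PE is the only “LUN” the host sees, and the array fans I/O out to whichever VVol is bound behind it. A rough sketch of that idea (mine, not the actual data path):

# Conceptual sketch of a Protocol Endpoint - illustrative only.
class ProtocolEndpoint:
    """The single LUN the host sees; VVols sit behind it as sub-LUNs."""

    def __init__(self, lun_id):
        self.lun_id = lun_id
        self.bindings = {}   # sub-LUN id -> VVol name

    def bind(self, sub_lun, vvol):
        # The array binds a VVol to the PE when the VM needs it.
        self.bindings[sub_lun] = vvol

    def route_io(self, sub_lun, op):
        # Host I/O arrives addressed to the PE; the array distributes it
        # to the right VVol based on the sub-LUN binding.
        return op + " -> " + self.bindings[sub_lun]


pe = ProtocolEndpoint(lun_id=255)
pe.bind(1, "web01-data-0")
pe.bind(2, "web01-swap")
print(pe.route_io(1, "WRITE"))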

 

The Management Plane

  • How does vCenter manage the FlashArray? Through a VASA provider

FlashArray VASA Provider

  • VASA version 3 (includes replication)
  • Redundant service on both controllers
  • Automatically configured during Purity upgrade
  • Active-active configuration
  • Entirely stateless – no configuration is tied to or stored on the controller hardware (there’s no special VASA database on the array; it’s part of the array configuration)

 

Policy Based Management

Create VMs and individual virtual disks (VVols) independently

Use customised capabilities advertised by the VASA provider to configure volumes:

  • Replication
  • Snapshot policy
  • QoS
  • Etc.
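
Conceptually, SPBM is a matching exercise: the policy declares the capabilities it requires, and only storage whose VASA provider advertises those capabilities shows up as compatible. A rough sketch of that idea (mine, illustrative only):

# Conceptual sketch of policy-to-capability matching - illustrative only.
ADVERTISED = {"replication", "snapshot-schedule", "qos"}   # from the VASA provider

def compatible(policy_requirements, advertised=ADVERTISED):
    """Storage is compatible if it advertises every capability the policy needs."""
    return set(policy_requirements) <= advertised


print(compatible({"replication", "qos"}))      # True
print(compatible({"encryption-at-rest"}))      # False - not advertised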

Are VVols special volumes? On the FlashArray, not really – they’re just normal volumes with special metadata tags. So if you want to present a VVol to a physical server, for instance, you can just connect it as a standard LUN.

 

Conclusion

I’m enthusiastic as all get out about ActiveCluster. There have been rumblings in the market about this type of capability for some time, so it’s great to see Pure deliver. I need to dig a bit deeper into it, but it feels a lot like it has the cross-site capability of Dell EMC’s VPLEX without a lot of the palaver traditionally associated with that product (which, to be fair, does more than just cross-site volumes). The great thing is that it’s available to existing customers without additional expense (at least on the Pure side) or messing about. This has always been a big selling point for Pure, and it’s great to see it continue here.

I think the DirectFlash shelf is certainly a step in the right direction and the appetite is there for this kind of solution. I’ll be interested to see how many shelves they end up adding, as the scale and speed possibilities here are potentially pretty tremendous. It will also be interesting to see the uptake of the solution over the next 12 months.

I liked a lot of what I saw in Cody’s presentation on VVols support. It certainly appeared fairly straightforward. I remain underwhelmed by VVols in general, though. I know there needed to be a change in the way we presented storage to VMs, but it feels like we’ve somehow missed the boat with this solution. In another year it might just be that everything sits on VVols by default, but that’s been the expectation for the past five years and adoption hasn’t transformed to the extent we expected. I am more than happy to be proven wrong on this point though, and my surliness regarding VVols shouldn’t be taken as criticism of what Pure have managed to deliver here. Also, it’s worth checking out Cody’s post on Virtual Volumes support here – he covers it way better than I do.