Dell Technologies World 2018 – storage.12 – XtremIO X2: An Architectural Deep Dive Notes

Disclaimer: I recently attended Dell Technologies World 2018.  My flights, accommodation and conference pass were paid for by Dell Technologies via the Press, Analysts and Influencers program. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Here are my rough notes from the storage.12 session. This was presented by Rami Katz and Zvi Schneider, and covered XtremIO X2: An Architectural Deep Dive.

XtremIO X2

Efficiency

  • 4x better rack density
  • 1/3 the effective $/GB price
  • Multi-dimensional scaling
  • 25% DRR enhancements

Protection

  • NVRAM
  • Expanded iCDM capabilities
  • QoS
  • Metadata-aware replication

Performance

  • 80% lower application latency
  • 2x better copy operations
  • Hardware

Simplicity

  • Simple HTML5 GUI
  • Intelligent reporting and troubleshooting
  • Guided workflows

 

Software-driven architecture driving efficiency and performance

A brute-force approach (eg faster chips, more cores) limits enhancements: with it you can average a 20 – 30% performance improvement every 12 to 18 months. You need software innovation as well.

 

XtremIO Content Addressable Storage (CAS) Architecture

CAS provides the ability to move data quickly and efficiently, using metadata indexing to reduce physical data movement within the array, between XtremIO arrays, or between XtremIO and other arrays / the cloud (not in this release).
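To illustrate the idea (purely my own toy sketch, not XtremIO code – the real array uses its own fingerprinting scheme distributed across controller modules), a content-addressable store keys each block by a hash of its contents, so identical blocks land once and copies become metadata updates:

```python
import hashlib

class ContentStore:
    """Toy content-addressable store: blocks are keyed by a fingerprint
    (hash of their contents), so identical blocks are stored only once."""

    def __init__(self):
        self.blocks = {}      # fingerprint -> data (stored once)
        self.volume_map = {}  # (volume, lba) -> fingerprint (metadata)

    def write(self, volume, lba, data):
        fp = hashlib.sha1(data).hexdigest()
        is_duplicate = fp in self.blocks
        if not is_duplicate:
            self.blocks[fp] = data           # unique block: store it
        self.volume_map[(volume, lba)] = fp  # otherwise metadata only
        return is_duplicate

    def read(self, volume, lba):
        return self.blocks[self.volume_map[(volume, lba)]]
```

Writing the same content to a second volume touches only the map, which is why copy and move operations can avoid physical data movement.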

 

XtremIO In-line, all-the-time data services

  • Thin provisioning – all volumes thin; optimised for data saving
  • Deduplication – inline – block written once, no post-process
  • Compression – inline – compressed blocks only, no post-process
  • XtremIO Virtual Copies – super efficient – in-memory metadata copy
  • D@RE – Always on encryption with no performance impact
  • XtremIO Data Protection – Single “RAID Model” double parity w/ 84% usable

 

XtremIO Virtual Copies (XVC)

  • Performance – same read, write and latency characteristics as the source volume
  • Space efficient – no metadata bloat, no space reservation, no moving blocks
  • In-memory – instant creation, immediate deletion, flexible topology, unrestricted refresh
  • Flash optimised – identical data services, always on, always inline

 

Efficiency

  • X2: All the goodness at 1/3 the $/GB price of X1
  • X1: High WPD (writes per day) SSDs
  • X2: Low WPD SSDs, 4x denser DAE = greater economies of scale

In X1 the $/GB stays the same as you grow capacity; in X2 it decreases as capacity increases.

 

Better Controller Cost Amortisation

  • X1 – 40TB in 6RU
  • X2 – 138TB in 4RU
  • X2 (Future) – 230TB in 4RU

 

Multi-dimensional Scaling, Hardware Platform and Improved scaling granularity 

  • X2 – up to 72 drives, starting with 18 and growing in increments of 6 (138TB RAW per Brick). 4 Bricks yield 550TB raw (2.7PB effective, assuming 6:1 data reduction). Scale up within the X2 platform.
  • Resilience – 2 simultaneous SSD failures per XDP group (36 drives)
  • Using an efficient compression algorithm – Intelligent packing algorithm, systems typically experience ~2-3:1 compression ratio

SQL example – even with native compression enabled get some additional array compression
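The packing idea can be sketched like this (my own toy example – the slot sizes and the zlib codec are assumptions, not XtremIO's actual algorithm): compress each block inline, then file it into the smallest fixed-size slot that fits, which keeps the flash layout tidy while capturing most of the compression benefit.

```python
import zlib

SLOT_SIZES = [1024, 2048, 4096, 8192]  # hypothetical fixed slot sizes

def pack(block: bytes):
    """Compress a block and pick the smallest fixed slot that fits it.
    Fixed slots avoid fragmentation on flash at the cost of some
    internal padding; incompressible blocks are stored raw."""
    compressed = zlib.compress(block)
    for slot in SLOT_SIZES:
        if len(compressed) <= slot:
            return slot, compressed
    return len(block), block  # incompressible: store uncompressed
```

A highly repetitive 8KB block lands in the smallest slot, while random data falls through to raw storage.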

 

Performance

  • 80% lower application latency
  • 2x better copy operations

Install base – over 65% of write IOs are small (<= 8KB)

 

Write Flow – X1

Flow overview:

  • Write arrives
  • Find compute module
  • Calculate hash – read old data if needed
  • Send to data module
  • Harden data
  • Acknowledge host

 

Write Flow – X2 with Write Boost

Flow overview

  • Write arrives
  • Find compute module
  • Harden data
  • Acknowledge host

Huge latency improvement, mainly for small IOs

 

Write Flow – X2 De-Stage

  • Write transactions are aggregated, improving efficiency and bandwidth
  • Benefit from IO folding
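A toy model of the boosted write path and de-stage folding (my own sketch – the names and structures are illustrative, not XtremIO internals): harden the write to NVRAM, acknowledge immediately, and fold repeated writes to the same address during de-stage so only the final version needs hashing and placement.

```python
from collections import deque

class WriteBoostController:
    """Sketch of a deferred write path: harden to (simulated) NVRAM and
    ack at once; hashing/dedup is deferred to de-stage time."""

    def __init__(self):
        self.nvram = deque()  # buffered (lba, data) writes
        self.backend = {}     # de-staged state

    def write(self, lba, data):
        self.nvram.append((lba, data))  # harden first...
        return "ack"                    # ...then acknowledge the host

    def destage(self):
        # Later writes to the same LBA fold over earlier ones ("IO
        # folding"), so only the final version reaches the backend.
        folded = dict(self.nvram)
        self.nvram.clear()
        self.backend.update(folded)
        return len(folded)
```

Three buffered writes to two addresses de-stage as two operations, which is where the efficiency and bandwidth win comes from.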

 

Hardware and Models

X-Brick: X2-S 7.2TB – 28.8TB – Cluster Building Block

  • 48 Haswell cores
  • 768GB RAM
  • 8 Host ports (2 16Gbps FC and 2 10Gbps iSCSI per controller)
  • Up to 72 SSDs

 

X-Brick: X2-R 34.5TB – 138.2TB – Cluster Building Block

  • 48 Haswell cores
  • 2TB RAM
  • 8 Host ports (2 16Gbps FC and 2 10Gbps iSCSI per controller)
  • Up to 72 SSDs

 

New X-Brick Configuration

Configuration X-Brick Minimum (RAW) X-Brick Maximum (RAW) Cluster Size in X-Bricks
X2-S 7.2TB 28.8TB Up to 4
X2-T 34.5TB 69.1TB 1*
X2-R 34.5TB 138.2TB 4 (current), 8 (future)

X2-T – if you think you’ll eventually get to X2-R, it would be more cost-effective to go straight to that

 

Protection

NVRAM – People didn’t like the BBU, so they got that sorted for the X2.

 

NVRAM – Component Failure – Method of Operation

 

 

Xpanded iCDM Capabilities

  • 60% of all XtremIO Volumes are Copies – a large number of XVC copies are writeable
  • 10% of clusters > 500 writable copies
  • 5% of clusters > 1000 writable copies

Other iCDM Statistics

  • Average # of copies has doubled in X2 vs X1
  • 90% of XtremIO iCDM clusters use writable copies
  • Max 6700 writable copies per cluster
  • Average writable copies per volume – 5 copies
  • Average copies per volume – 12 copies

2x the number of XVC copies

 

Metadata-aware Replication

A superior method for efficient data replication

  • Unique blocks only
  • Globally unique (not just at the volume / consistency group level)
  • Changes only
  • Compressed

 

Global Dedupe = Global Savings

Replication Use Cases

Unified view for local and remote protection

Easy Operation, Best Protection, Superior Performance

You can read more about XtremIO Native Replication here.

 

Simplicity

Redesigned User Interface

  • Simple and intuitive
  • 1-2-3 provisioning
  • Tagging and search
  • HTML5 (no Java)
    • Nothing to install
    • Popular browser support

 

Flexible provisioning flows – next step suggestions

 

New Reports

  • Weekly peaks
  • Latency heat map
  • Block distribution

“X2 evolves every frontier that X1 redefined”

 

Futures

Better iCDM with QoS

  • Max IOPS or Bandwidth for every volume or consistency group
    • Protect workloads based on importance e.g. critical applications and multi-tenant environments
  • Burst mode gracefully handles applications that temporarily exceed max IOPS
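A max-IOPS limit with graceful burst handling is commonly implemented as a token bucket; here's a minimal sketch (my own illustration – Dell EMC didn't describe their actual implementation):

```python
class TokenBucket:
    """Toy token-bucket limiter: sustained rate = max IOPS, with a burst
    allowance so short spikes above the limit are absorbed."""

    def __init__(self, max_iops, burst):
        self.rate = max_iops      # tokens refilled per second
        self.capacity = burst     # burst allowance
        self.tokens = burst
        self.last = 0.0

    def allow(self, now):
        # Refill proportionally to elapsed time, capped at burst size.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A burst of IOs drains the bucket immediately; after that, IOs are admitted at the sustained rate as tokens refill.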

 

Q&A

Is synchronous replication on the roadmap? Give us a little time. It’s not coming this year. You could use VPLEX in the interim.

How about CloudIQ? CloudIQ support is coming in September

What about X1? It’s going end of sale for new systems. You can still expand clusters. Not sure about any buyback programs. You can keep X1 for years though. We give a 7 year flash guarantee.

X2 is sticking with InfiniBand and SAS, why not NVMe? Right now it’s expensive. We have it running in the labs. We’re getting improvements in software. Remember X2 came out 6 months ago. Can’t really talk too much more.

 

Solid session. 3.5 stars.

Dell Technologies World 2018 – storage.13 – XtremIO X2 Native Replication: Use Cases, Architecture and Best Practices Notes

Disclaimer: I recently attended Dell Technologies World 2018.  My flights, accommodation and conference pass were paid for by Dell Technologies via the Press, Analysts and Influencers program. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Here are my rough notes from the storage.13 session. This was presented by Aharon Blitzer and Marco Abela and covered XtremIO X2 Native Replication.

 

Agenda

  • Benefits
  • Technical Overview
  • Demo
  • iCDM Integration and Demo
  • Ecosystem Integrations
  • Setup

 

Benefits

Replication is copying: it is also a copy data management problem

Brute force leads to a forced tradeoff – RPO/RTO vs Cost

Existing Replication Solutions

  • All changes must be replicated
  • Arrays at both ends must transmit and receive all changed data
  • WAN bandwidth must be sized to account for all changed data
  • Many customers deploy WAN accelerators to reduce bandwidth

 

The XtremIO Difference: elegance, not brute force

  • New block written and deduped – Only finger print is replicated 
  • Only unique changes are replicated
    • Data is deduplicated at source, destination and the WAN
    • Unique blocks are sent compressed
  • Arrays at both ends transmit and receive only unique data, freeing their resources for other uses
  • WAN bandwidth should be sized to account for only globally unique changes
  • No need to deploy any WAN accelerators
  • Free up bandwidth for other applications
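The savings come from shipping fingerprints first and block data only when the target has never seen it. A toy model (my own sketch; the real protocol also compresses the unique blocks and operates at the array level):

```python
def replicate(changes, source_blocks, target_blocks, target_map):
    """Fingerprint-first replication sketch: for each changed address
    send the fingerprint; ship the block itself across the WAN only if
    the target doesn't already hold it. Returns bytes sent."""
    wan_bytes = 0
    for lba, fp in changes:
        wan_bytes += len(fp)                      # fingerprint always sent
        if fp not in target_blocks:               # globally unique at target?
            target_blocks[fp] = source_blocks[fp] # ship the data once
            wan_bytes += len(source_blocks[fp])
        target_map[lba] = fp                      # metadata update only
    return wan_bytes
```

Replicating three changed addresses where one block is already known at the target and another repeats costs three fingerprints plus a single block of data.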

 

Global Dedupe = Global Savings

Fan-in configurations of 4:1 are supported

 

Business benefits: free the WAN

  • Reuse the freed up WAN bandwidth for other workloads e.g. protect / replicate incrementally more workloads with the same WAN costs; or
  • Reduce WAN costs while still replicating at the same level of protection

 

Metadata-aware Replication

  • Easy Operation 
    • Uses XtremIO in-memory snapshots
    • Wizard based
    • Full operational disaster recovery 
  • Best Protection
    • RPO as low as 30 seconds
    • Sub-minute RTO
    • Up to 500 recovery points
    • “Fan-in” configuration
  • Superior Performance
    • Supports XtremIO High Performance
    • Efficient Metadata-aware replication
    • Efficient replication – compression aware

 

Technical Overview

XtremIO CAS (Content Addressable Storage)

XtremIO X2 Inline data reduction in action – Inline deduplication that is fast and scalable

 

XtremIO Virtual Copies (XVC)

Traditional Snapshots

  • non-immediate operation
  • not space efficient
    • best case scenario: copy metadata
    • worst case scenario: copy metadata and data, block for block
  • not the same performance as the source volume
    • negatively impacts performance of the source volume

XtremIO Virtual Copies

  • immediate
  • space and metadata efficient
    • no copied metadata or data blocks
    • redirect-on-unique-write
  • no difference between regular volume and snapshot
    • 100% same performance as source
    • no performance degradation 
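A toy sketch of why XVC copies are free at creation (my own illustration, not XtremIO code): the snapshot duplicates only the in-memory address-to-fingerprint map, and a subsequent write simply redirects the copy's pointer, leaving the source untouched.

```python
class Volume:
    """Toy XVC-style volume: data is addressed via an in-memory
    lba -> fingerprint map; a copy duplicates only that map."""

    def __init__(self, mapping=None):
        self.map = dict(mapping or {})  # lba -> fingerprint

    def snapshot(self):
        # Metadata-only copy: no data blocks are touched or moved.
        return Volume(self.map)

    def write(self, lba, fingerprint):
        # Redirect-on-unique-write: only this copy's pointer changes.
        self.map[lba] = fingerprint
```

Because copy and source are just two maps over the same shared blocks, there is no structural difference between a "regular" volume and a snapshot.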

 

XVC vs Traditional snapshots

  • no impact on copy creation
  • consistent performance on prod and copy

 

X2 XtremIO XVC Snapshots

Unlimited Immediate copy operations

  • create copy from any copy
  • unlimited, immediate delete
  • unrestricted topology

Instant refresh of Virtual copies

  • prod to any copy
  • any copy to any copy
  • any copy to prod

 

Putting it all together 

Metadata Aware: Efficiently transferring Changes Only

 

X2 Native Replication – Xtremly Efficient 

Failover is easy and immediate

 

Architectural Advantages = Replication Efficiencies

  1. Changes only
  2. Write Folding
  3. Global, across sites deduplication
  4. Compression

 

Configuration and Options

Replication is the means to an end

 

Replication Use Cases

Setup / refresh takes hours / days

 

Improve business process and agility 

  1. Unified management 
  2. Quick and easy setup
  3. Comprehensive and simple operations
  4. iCDM integration
  5. Orchestration

Unified view for local and remote protection (under data protection)

See RPO compliance, throughput, replication efficiency, RPO (actual vs required)

 

DR Operations

  • Resume / suspend
  • Failover 
  • Failback
  • Test a copy

 

Protection Settings

  • RPO as low as 30 seconds
  • Retention Policy – defines the protection window and number of PiTs
  • XtremIO automatically manages the retention
  • Granularity per time period (up to 3)

[Demo Video]

 

iCDM Integration

Examples

Copy prod to dev and test

  1. Setup replication
  2. Repurpose copy
  3. Map volumes to host

[Demo Video]

Prod to test using master image

  1. Setup replication
  2. Repurpose copy
  3. Refresh data

[Demo Video]

 

Ecosystem

  • Solves the CDM problem
  • App self service
  • App integration and orchestration
  • XtremIO Virtual Copies (XVC)
  • Consistent multi-dimensional scale performance and data services

 

AppSync 

  • Get AppSync functionality free (existing X2 customers)
  • Manage up to 20 concurrent mounted copies
  • FREE support and maintenance included
  • Upgrade to unlimited copies at normal AppSync pricing
  • XtremIO Native Replication Integration – AppSync 3.8 Q3 2018
    • create and manage copies of data on either the source or the target of X2 replication
    • create and manage copies of data on source and target simultaneously 
    • restore from any one side at any given time

 

VMware SRM Support via SRA plugin

 

Setup

X2 Metadata-aware Replication Details

  • Full GUI management and monitoring
    • configuration via simple XMS or CLI wizards
    • REST API
  • Networking & Performance
    • 10GbE IP (onboard copper or optic)
    • Up to 200ms latency
    • Bi-directional
    • VLANs supported
  • Volume re-sizing supported
  • Fan in/out (up to 4 systems) supported
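For the REST API side, volume provisioning looks roughly like this (a sketch from memory – the `/api/json/v2/types/volumes` endpoint and the `vol-name` / `vol-size` fields are my assumptions and may differ by XIOS version, so check the RESTful API guide for your release):

```python
import base64
import json
import urllib.request

def make_volume_request(xms_host, user, password, name, size):
    """Build (but don't send) an XMS REST call to create a volume.
    Endpoint path and field names are assumptions; verify against the
    RESTful API guide for your XIOS release."""
    url = f"https://{xms_host}/api/json/v2/types/volumes"
    body = json.dumps({"vol-name": name, "vol-size": size}).encode()
    req = urllib.request.Request(url, data=body, method="POST")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    req.add_header("Content-Type", "application/json")
    return req
```

Sending it is then a `urllib.request.urlopen(req)` call (with appropriate certificate handling for the XMS's self-signed cert, if applicable).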

 

Replication Ports

  • Dual personality cards – can be configured as 4 iSCSI or 2 iSCSI and 2 FC
  • Replication
    • Any of the 4 available replication optical ports @ 10Gbps, if not used for host connectivity or FC
    • Dedicated copper replication port @ 10Gbps

Efficient – Consistent – Reliable 

6.1 is out on the 3rd for existing XtremIO X2 customers

 

Questions?

Can I shrink volumes? No shrink; expand is okay. Resizing is supported with replication.

How are the secondary hosts setup? You can put the hosts in no access mode or read-only. There’s a best practices guide. No access generally for some clusters.

How do you fallback? When you failover, you can choose to start replication to the other side. You can start it later if required. Do your failover and check, then start replication.

This is X2 only.

Single management view with XMS? Yes, although recommend one XMS per site.

Is everything scriptable? Yes, Web UI, CLI, and REST API. You can also plug in to vRO.

Will AppSync be re-purpose aware? It will allow you to do remote re-purposing.

What’s the size of a metadata block? 20 bytes, maybe? Not sure.

Plans for the X1? No, no plans to back port. You’ll need to use RecoverPoint.

Can you comment on redirect on unique write vs copy on write tech? Let’s take that off-line.

Per VM recovery? With AppSync or VMware SRM. The SRA is in the certification process – a few weeks away. SRM 6.5, 6.8, etc.

Migrating from VMAX to XtremIO? Customers are using Storage vMotion – most common. Host-based tools. Let’s have a chat after. VPLEX, RP, etc.

 

Solid session. 4 stars.

Dell EMC Announces XtremIO “X2”

Disclaimer: I recently attended Dell EMC World 2017.  My flights, accommodation and conference pass were paid for by Dell EMC via the Dell EMC Elect program. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

XtremIO X2

[image via Dell EMC]

In a nutshell, Dell EMC describe the new XtremIO X2 as “[f]lash optimised AFA with multi-dimensional scalability”. Features include:

  • New multi-dimensional scalable hardware
  • Software-driven performance / efficiency improvements
  • iCDM use case enhancements
  • New, simple HTML5 UI
  • New metadata-aware native replication (not at GA, later this year)

 

Your Feedback Is Important To Us

X1 Challenges Addressed?

Firstly though, Dell EMC have been listening to customers, and have been working on some improvements with the X2. To wit:

  • Cabling – there’s a new cable harness
  • BBU – BBUs replaced with NVRAM
  • Price – as low as a third of the price of the X1 in terms of effective $/GB
  • Density – up to 100TB/RU
  • Scaling – scale up and scale out
  • 16Gb FC – Natively supported on X2

 

Cabling

A picture is indeed worth a thousand words. And the original X-Brick had some fairly ordinary cable management. The X2 is better.

[image via Dell EMC]

 

PCIe NVRAM

Those cumbersome (and rack space consuming) battery backup units (BBUs) are no more.

  • Increased reliability
  • Reduced service calls (battery replacement)
  • Improved density
  • Reduced cabling
  • Reduced complexity
  • Allows support for odd numbers of X-Bricks

 

A Third of the Price?

How can they say that? Well, according to Dell EMC, the X2 offers:

  • Higher density X-Bricks
  • Higher data reduction
  • Lower WPD SSDs
  • iCDM enhancements
  • Better controller cost amortisation
  • Better scaling economics

 

X2 is better than X1

Speeds and Feeds

There are 2 models (X2-S & X2-R) being launched.

Configuration X-Brick Minimum Raw X-Brick Maximum Raw Cluster Size in X-Bricks
X2-S 7.2TB 28.8TB Up to 4
X2-R 34.5TB 138.2TB Up to 8 (Post GA)

*Note that the sizing assumes 4:1 Data Reduction Rate (DRR) and ~80% usable:raw ratio.

There are 2 active controllers (each with 2×12 Haswell cores) and:

  • 1024GB RAM each
  • 2x 16Gbps FC
  • 2x 10Gbps iSCSI
  • Infiniband RDMA for controller communications
  • SAS 3.0
  • 1920GB SSDs (up to 72 per X-Brick)

With this you can scale to over 1.1PB Raw (and over 5PB assuming 4:1 dedupe rates). In terms of drive configurations, the starting point is 18 drives, and you can scale up in increments of 6 drives. When you get past 36 drives, a second XDP group is added (whereby you deploy another 18 + 6 + 6 + 6 disks). The SSDs are hot-swappable field replaceable units (FRUs) as well.
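The quoted capacities fall out of the drive arithmetic; a quick sanity check (assuming the 1,920GB SSDs and decimal TB – my own back-of-envelope, not Dell EMC's sizing tool):

```python
def xbrick_raw_tb(drives, ssd_gb=1920):
    """Raw capacity for a given drive count. Drive counts must follow
    the 18 + 6n growth pattern, up to 72 drives per X-Brick (a second
    XDP group opens past 36 drives)."""
    assert 18 <= drives <= 72 and (drives - 18) % 6 == 0
    return drives * ssd_gb / 1000  # decimal TB
```

72 drives × 1.92TB gives roughly 138TB raw, matching the quoted X2-R maximum, and the 18-drive minimum lands at the quoted 34.5TB.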

 

Expanded iCDM Capabilities

This is all pretty exciting, but what about integrated copy data management (iCDM)? Dell EMC say they’ve been seeing 25% better data reduction on average when comparing X2 to X1 (on a 100GB working set). There have also been compression improvements (via an intelligent packing algorithm) yielding 16:1 ratios. As well as this, they’re providing:

  • Open APIs
  • App integration and orchestration
  • Virtual copies (XVC)
  • Consistent, multi-dimensional performance with inline data services

You can now also do 2x the number of XVC copies, with 16,384 volumes and 1,024 snapshots per volume supported.

 

Management

New HTML5 UI

  • No more Java binaries – aw yisss!
  • Faster and better user experience

Simple and Intuitive UI

  • Easy drill-down & navigation
  • Intelligent reports
  • 1-2-3 provisioning

You can manage X2 clusters (obviously), and will have the ability to manage X1 clusters post GA.

 

Provisioning

Provisioning has been improved, with “Next step suggestions” in the form of:

  • Flexible and guided provisioning flows
  • “Popular” next step suggestions
  • Multi-step workflows

 

Metadata-aware Native Replication

This is coming in the future. Dell EMC tell me it’s going to be great. And I really hope it will be, because I’ve been underwhelmed in the field to date.

Easy operation

  • Uses XtremIO in-memory snapshot
  • Wizard-based
  • Full operational DR

Best Protection

  • RPO as low as 30 seconds
  • Immediate RTO
  • Up to 1000 recovery points
  • “Fan-in” configurations

Superior Performance

  • Supports XtremIO high performance
  • Efficient metadata-aware replication
  • Efficient replication – compression aware

How Will It Work?

  • Only deduplicated changes are replicated – data is deduplicated at source, destination, and WAN
  • Arrays at both ends must transmit and receive only deduplicated data
  • WAN bandwidth must be sized to account for only deduplicated data
  • No need for WAN accelerators
  • Native replication is async only

 

When?

The X2 will be available to order from May 31st, 2017 and shipping from August 30th, 2017. XIOS 6.0 will be made GA on August 30th, 2017.

 

Conclusion

I’ve been a fan of the XtremIO for a while now. It goes really fast and does some really cool stuff in terms of density, deduplication and performance. It has been a little underwhelming in terms of data services support (although we’ve seen X1 go through some significant changes in that respect) and hasn’t always been price competitive. But if you’ve been a VMAX customer pining for rack space or an enterprise running some RDBMS that needed some great performance from your block storage, then XtremIO has been for you.

This iteration of the XtremIO platform sounds (on paper at least) to be a lot better than its predecessor, and demonstrates that Dell EMC have been listening to their customers. In much the same way as X-Men 2 was better than the first one, so too does the X2 have the edge over the X1. I look forward to seeing these things in the field. And I look forward very much to seeing the end of Java-based storage management UIs. If you’d like to read Dell EMC’s take on the announcement, check out this blog post.

Random Short Take #2

I did one of these 7 years ago – so I guess I never really got into the habit – but here’re a few things that I’ve noticed and thought you might be interested in:

And that’s about it, thanks for reading.


EMC – Thinking about buying XtremIO?

Introduction

I’ve been doing some design work for a few customers and thought I’d put together a brief post on some considerations when deploying XtremIO. I don’t want to go into the pros and cons of the product, nor am I really interested in discussing better / worse alternatives. Let’s just assume you’ve made the decision to go down that track. So what do you need to know before it lobs up in your data centre? As always, I recommend checking EMC’s support site as they have some excellent site planning and installation documentation. There’s also a pretty good introductory whitepaper here.

 

Hardware Overview

X-Brick

The core hardware in the XtremIO solution is the X-Brick. I’ve included a glamour shot below from EMC’s website for reference. Each X-Brick is comprised of:

  • One 2U DAE, containing 25 eMLC SSDs (400GB, 800GB or 1600GB SSD options);
  • Two redundant power supply units (PSUs);
  • Two redundant SAS interconnect modules;
  • Two BBUs (per cluster); and
  • Two 1RU Storage Controllers (redundant storage processors).

[image of an X-Brick via EMC]

A one X-Brick configuration has the controllers directly connected via InfiniBand, whilst other configurations require 2 IB switches. X-Brick clusters can be deployed in combinations of 1, 2, 4, 6 or 8 X-Bricks.

 

Release 4.0

The latest release (4.0) has been generally available since 30 June 2015, and now supports:

  • “Generation 3” hardware (the 40TB X-Bricks);
  • Larger clusters (up to 8 20TB X-Bricks in a cluster); and
  • Native RecoverPoint integration.

You can read Chad’s take on things here, as well as the XtremIO team’s announcement here.

 

Virtual XMS

You can optionally deploy the XMS (XtremIO Management Server) on a VM rather than physically. There are a few things you need to be mindful of if you go down this route.

The virtual XMS VM should have the following configuration:

  • 8GB vRAM;
  • 2 vCPUs; and
  • 1 vNIC.

The virtual XMS VM should have a single 900GB disk (thin provisioned); note that 200GB of disk capacity is pre-allocated following cluster initialisation. This should be provisioned on RAID-protected storage, and the shared storage used should not originate from the XtremIO cluster itself.

The virtual XMS should be located in the same LAN as the XtremIO cluster.

The deployed virtual XMS has its memory Shares resource allocation set to High, giving it priority on memory allocation when required. If you’re using a non-standard memory shares allocation, this should be adjusted post-deployment.

 

In The Data Centre

Rack Requirements

The following table shows the required rack space depending on the number of X-Bricks in the cluster.

[table image via EMC: rack space requirements per X-Brick count]

 

Power and Cabling

From a cabling perspective, your friendly EMC installation person will take care of that. There’s very good guidance on the EMC support site, depending on your access level. Keep in mind that you’ll want your PDUs in the rack to come via diverse circuits to ensure a level of resiliency.

In terms of power consumption, the table below provides guidance on maximum power usage depending on the number of X-Bricks you deploy.

[table image via EMC: maximum power consumption per X-Brick count]

 

Connectivity

From a connectivity perspective, you’ll need to account for both FC and IP resources. Each controller has two FC front-end ports and two iSCSI ports that you can present for block storage access. You’ll also need an IP address for each controller (so two per X-Brick), along with at least one for the XMS. For monitoring, the latest version of the platform supports EMC’s Secure Remote Services (ESRS), so you can incorporate it into your existing solution if required.

 

Conclusion

Should you decide to go down the XtremIO track, there are a few things to look out for, primarily around planning your data centre space. It’s a nice change that you don’t have to get too bogged down in details about the actual configuration of the storage itself. But ensuring that you’ve planned for suitable space, power and management will make things even easier.