Random Short Take #15

Here are a few links to some random news items and other content that I recently found interesting. You might find them interesting too. Episode 15 – it could become a regular thing. Maybe every other week? Fortnightly even.

Random Short Take #14

Here are a few links to some random news items and other content that I found interesting. You might find them interesting too. Episode 14 – giddy-up!

Random Short Take #13

Here are a few links to some random news items and other content that I found interesting. You might find them interesting too. Let’s dive in to lucky number 13.

Storage Field Day 18 – Wrap-up and Link-o-rama

Disclaimer: I recently attended Storage Field Day 18.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

This is a quick post to say thanks once again to Stephen and Ben, and the presenters at Storage Field Day 18. I had a super fun and educational time. For easy reference, here’s a list of the posts I did covering the events (they may not match the order of the presentations).

Storage Field Day – I’ll Be At Storage Field Day 18

Storage Field Day 18 – Day 0

Storage Field Day 18 – (Fairly) Full Disclosure

Cohesity Is (Data)Locked In

NetApp And The Space In Between

StorPool And The Death of Hardware-Defined Storage

IBM Spectrum Protect Plus – More Than Meets The Eye

Western Digital Are Keeping Composed

VAST Data – No More Tiers Means No More Tears?

WekaIO Continues To Evolve

Datera and the Rise of Enterprise Software-Defined Storage

 

Also, here are a number of links to posts by my fellow delegates (in no particular order). They’re all very smart people, and you should check out their stuff, particularly if you haven’t before. I’ll attempt to keep this updated as more posts are published. But if it gets stale, the Storage Field Day 18 landing page will have updated links.

 

Becky Elliott (@BeckyLElliott)

California Dreamin’ My Way to Storage Field Day 18

A VAST-ly Different Storage Story

 

Chin-Fah Heoh (@StorageGaga)

A Storage Field 18 I will go – for the fun of it

VAST Data must be something special

Catch up (fast) – IBM Spectrum Protect Plus

Clever Cohesity

Storpool – Block storage managed well

Bridges to the clouds and more – NetApp NDAS

WekaIO controls their performance destiny

The full force of Western Digital

 

Chris M Evans (@ChrisMEvans)

Podcast #3 – Chris & Matt review the SFD18 presenters

Exploiting secondary data with NDAS from NetApp

VAST Data launches with new scale-out storage platform

Can the WekaIO Matrix file system be faster than DAS?

#91 – Storage Field Day 18 in Review

 

Erik Ableson (@EAbleson)

SFD18-Western Digital

Vast Data at Storage Field Day 18

 

Ray Lucchesi (@RayLucchesi)

StorPool, fast storage for fast times

For data that never rests, NetApp NDAS

 

Jon Klaus (@JonKlaus)

My brain will be melting at Storage Field Day 18!

Faster and bigger SSDs enable us to talk about something else than IOps

How To: Clone Windows 10 from SATA SSD to M.2 SSD (& fix inaccessible boot device)

The fast WekaIO file system saves you money!

Put all your data on flash with VAST Data

 

Enrico Signoretti (@ESignoretti)

A Packed Field Day

Democratizing Data Management

How IBM is rethinking its data protection line-up

NetApp, cloudier than ever

Voices in Data Storage – Episode 10: A Conversation with Boyan Ivanov

Voices in Data Storage – Episode 11: A Conversation with Renen Hallak

Voices in Data Storage – Episode 12: A Conversation with Bill Borsari

 

Josh De Jong (@EuroBrew)

 

Matthew Leib (@MBLeib)

I Am So Looking Forward to #SFD18

#SFD18 introduces us to VAST Data

Dual Actuator drives: An interesting trend

Weka.IO and my first official briefing

Cohesity: More on the real value of data

 

Max Mortillaro (@DarkkAvenger)

Storage Field Day 18 – It’s As Intense As Storage Field Day Gets

Storage Field Day 18 – Fifty Shades of Disclosure

Cohesity – The Gold Standard in Data Management

EP17 – Storpool: Being the best in Block Based storage – with Boyan Ivanov

Developing Data Protection Solutions in the Era of Data Management

Western Digital : Innovation in 3D NAND and Low Latency Flash NAND

 

Paul L. Woodward Jr (@ExploreVM)

Storage Field Day 18, Here I Come!

 

[photo courtesy of Stephen Foskett]

Datera and the Rise of Enterprise Software-Defined Storage

Disclaimer: I recently attended Storage Field Day 18.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Datera recently presented at Storage Field Day 18. You can see videos of their presentation here, and download my rough notes from here.

 

Enterprise Software-Defined Storage

Datera position themselves as delivering “Enterprise Software-Defined Storage”. But what does that really mean? Enterprise IT gives you:

  • High Performance
  • Enterprise Features
    • QoS
    • Fault Domains
    • Stretched Cluster
    • L3 Networking
    • Deduplication
    • Replication
  • HA
  • Resiliency

Software-defined storage gives you:

  • Automation
  • DC Awareness Agility
  • Continuous Availability
  • Targeted Data Placement
  • Continuous Optimisation
  • Rapid technology adoption

Combine both of these and you get Datera.

[image courtesy of Datera]

 

Why Datera?

There are some other features built in to the platform that differentiate Datera’s offering, including:

  • L3 Networking – Datera brings standard protocols with modern networking to data centre storage. Resources are designed to float to allow for agility, availability, and scalability.
  • Policy-based Operations – Datera was built from day 1 with policy controls and policy templates to ease operations at scale while maintaining agility and availability.
  • Targeted Data Placement – ensures data is distributed correctly across the physical infrastructure to meet policies around performance, availability, and data protection while controlling cost (a hypothetical policy sketch follows this list).
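To make the policy-driven idea a little more concrete, here’s a minimal sketch of the kind of knobs a storage intent might capture and how placement follows from it. The names and fields are my own illustration, not Datera’s actual template schema.

```python
# Hypothetical storage intent - illustrative only, not Datera's schema.
from dataclasses import dataclass


@dataclass
class StorageIntent:
    name: str
    replicas: int          # copies spread across fault domains
    media: str             # "all-flash" or "hybrid" placement target
    iops_max: int          # QoS ceiling applied to the volume
    stretch_cluster: bool  # whether the volume must survive a site failure


def describe_placement(volume_size_gb: int, intent: StorageIntent) -> str:
    """Targeted data placement boils down to satisfying the intent first,
    then choosing the cheapest nodes that can still meet it."""
    return (f"{volume_size_gb}GB volume -> {intent.replicas} replicas on "
            f"{intent.media} nodes, QoS capped at {intent.iops_max} IOPS, "
            f"stretched={intent.stretch_cluster}")


gold_sql = StorageIntent(name="gold-sql", replicas=3, media="all-flash",
                         iops_max=50_000, stretch_cluster=True)
print(describe_placement(500, gold_sql))
```

The point of modelling it this way is that operations staff describe the outcome they want once, and the platform keeps enforcing it as the cluster grows or hardware changes.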

 

Thoughts and Further Reading

I’ve waxed lyrical about Datera’s intent-based approach previously. I like the idea that they’re positioning themselves as “Enterprise SDS”. While my day job is now at a service provider, I spent a lot of time in enterprise shops getting crusty applications to keep on running, as best as they could, on equally crusty storage arrays. Something like Datera comes along with a cool hybrid storage approach and the enterprise guys get a little nervous. They want replication, they want resiliency, they want to apply QoS policies to it.

The software-defined data centre is the darling architecture of the private cloud world. Everyone wants to work with infrastructure that can be easily automated, highly available, and extremely scalable. Historically, some of these features have flown in the face of what the enterprise wants: stability, performance, resiliency. The enterprise guys aren’t super keen on updating platforms in the middle of the day. They want to buy multiples of infrastructure components. And they want multiple sets of infrastructure protecting applications. They aren’t that far away from those software-defined folks in any case.

The ability to combine continuous optimisation with high availability is a neat part of Datera’s value proposition. Like a number of software-defined storage solutions, the ability to rapidly iterate new features within the platform, while maintaining that “enterprise” feel in terms of stability and resiliency, is a pretty cool thing. Datera are working hard to bring the best of both worlds together, and are managing to deliver the agility that the enterprise wants while maintaining the availability it craves.

I’ve spoken at length before about the brutally slow pace of working in some enterprise storage shops. Operations staff are constantly being handed steamers from under-resourced or inexperienced project delivery staff. Change management people are crippling the pace. And the CIO wants to know why you’ve not moved your SQL 2005 environment to AWS. There are some very good reasons why things work the way they do (and also some very bad ones), and innovation can be painfully hard to make happen in these environments. The private cloud kids, on the other hand, are all in on the fast paced, fail fast, software-defined life. They’ve theoretically got it all humming along without a whole lot of involvement on a daily basis. Sure, they’re living on the edge (do I sound old and curmudgeonly yet?). In my opinion, Datera are doing a pretty decent job of bringing these two worlds together. I’m looking forward to seeing what they do in the next 12 months to progress that endeavour.

WekaIO Continues To Evolve

Disclaimer: I recently attended Storage Field Day 18.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

WekaIO recently presented at Storage Field Day 18. You can see videos of their presentation here, and download my rough notes from here. I’ve written about WekaIO before, and you can read those posts here and here.

 

WekaIO

Barbara Murphy described WekaIO Matrix as “the fastest, most scalable parallel file system for AI and technical compute workloads that ensure applications never wait for data”.

 

What They Do

So what exactly does WekaIO Matrix do?

  • WekaIO Matrix is a software-defined storage solution that runs on bare metal, VMs, or containers, on-premises or in the cloud;
  • Fully-coherent POSIX file system that’s faster than a local file system;
  • Distributed Coding, More Resilient at Scale, Fast Rebuilds, End-to-End Data Protection; and
  • InfiniBand or Ethernet, Converged or Dedicated, on-premises or cloud.

[image courtesy of WekaIO]

 

Lots of Features

WekaIO Matrix now has a bunch of features, including:

  • Support for S3, SMB, and NFS protocols;
  • Cloud backup, Snapshots, Clones, and Snap-2-Obj;
  • Active Directory support and authentication;
  • POSIX;
  • Network High Availability;
  • Encryption;
  • Quotas;
  • HDFS; and
  • Tiering.

Flexible deployment models

  • Appliance model – compute and storage on separate infrastructure; and
  • Converged model – compute and storage on shared infrastructure.

Both models are cloud native because “[e]verybody wants the ability to be able to move to the cloud, or leverage the cloud”.

 

Architectural Considerations

WekaIO are focused on delivering super fast storage via NVMe-oF, and say that NFS and SMB are there for legacy protocol support and convenience.

The Front-End

WekaIO front-ends are cluster-aware:

  • Incoming read requests are optimised for data location and load conditions – incoming writes can go anywhere;
  • Metadata is fully distributed; and
  • No redirects are required.

SR-IOV optimises network access, and WekaIO accesses NVMe Flash directly:

  • Bypassing the kernel leads to better performance.

The Back-End

The WekaIO parallel clustered filesystem features:

  • Optimised flash-native data placement
    • Not designed for HDD
    • No “cylinder groups” or other anachronisms
  • Data protection (similar to EC)
    • 3-16 data drives, +2 or +4 parity drives
    • Optional hot spares – uses a “virtual” hot spare

Global namespace = hot tier + Object storage tier

  • Tiering to S3-API Object storage
    • Additional capacity with lower cost per GB
    • Files sharded to the object storage layer (parallelised access optimises performance, simplifies partial or offset reads)

WekaIO uses the S3-API as its equivalent of “SCSI” for HDD.
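As a rough illustration of the hot tier / object tier split, here’s a minimal, age-based tiering sketch. The in-memory dict stands in for an S3 bucket, files are demoted whole rather than sharded, and none of this reflects how Matrix actually implements its tiering engine.

```python
# Conceptual age-based tiering sketch - not WekaIO's implementation.
import os
import time


def demote_cold_files(hot_dir: str, object_store: dict, max_age_seconds: float) -> None:
    """Move files not accessed within max_age_seconds from the flash tier
    into the object store (here just a dict), leaving a zero-length stub."""
    now = time.time()
    for name in os.listdir(hot_dir):
        path = os.path.join(hot_dir, name)
        if not os.path.isfile(path):
            continue
        if now - os.path.getatime(path) > max_age_seconds:
            with open(path, "rb") as f:
                object_store[path] = f.read()  # the "PUT" to the capacity tier
            os.truncate(path, 0)               # stub stays on flash for namespace lookups


if __name__ == "__main__":
    capacity_tier = {}
    import tempfile
    with tempfile.TemporaryDirectory() as hot_tier:
        with open(os.path.join(hot_tier, "cold.dat"), "wb") as f:
            f.write(b"x" * 1024)
        # negative threshold so the freshly written file counts as "cold" for the demo
        demote_cold_files(hot_tier, capacity_tier, max_age_seconds=-1.0)
    print(f"demoted {len(capacity_tier)} objects")
```

The real system keeps the namespace on flash and reads demoted data back in parallel from the object layer, which is what makes the “S3 as the new SCSI” comparison work.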

 

Conclusion and Further Reading

I like the WekaIO story. They take away a lot of the overheads associated with non-DAS storage through the use of a file system and control of the hardware. You can make DAS run really fast, but it’s invariably limited to the box that it’s in. Scale-out pools of storage still have a place, particularly in the enterprise, and WekaIO are demonstrating that the performance is there for the applications that need it. There’s a good story in terms of scale, performance, and enterprise resilience features.

Perhaps you like what you see with WekaIO Matrix but don’t want to run stuff on-premises? There’s a good story to be had with Matrix on AWS as well. You’ll be able to get some serious performance, and chances are it will fit in nicely with your cloud-native application workflow.

WekaIO continues to evolve, and I like seeing the progress they’ve been making to this point. It’s not always easy to convince the DAS folks that you can deliver a massively parallel file system and storage solution based on commodity hardware, but WekaIO are giving it a real shake. I recommend checking out Chris M. Evans’s take on WekaIO as well.

VAST Data – No More Tiers Means No More Tears?

Disclaimer: I recently attended Storage Field Day 18.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

VAST Data recently presented at Storage Field Day 18. You can see videos of their presentation here, and download my rough notes from here.

 

VAST Enough?

VAST Data have a solution that basically offers massive scale with Tier 1 performance, without the cost traditionally associated with Tier 1 storage.

Foundational Pieces

Some of the key pieces of the solution are technologies that weren’t commonly available until recently, including:

  • NVMe-oF – DC-scale storage protocol that enables remote NVMe devices to be accessed with direct attached performance.
  • QLC Flash – A new Flash architecture that costs less than enterprise Flash while delivering enterprise levels of performance.
  • Storage Class Memory – Persistent, NVMe memory that can be used to reliably buffer perfect writes to QLC and create large, global metadata structures to enable added efficiency.

If you read their blog post, you’ll notice that there are some interesting ideas behind the VAST Data solution, including the ideas that:

  • Flash is the only media that can be used to bring the cost of storage under what people pay today for HDD-based systems.
  • NFS and S3 can be used for applications that up until now required a level of performance that could only come from block storage.
  • Low-endurance QLC flash can be used for even the most transactional of workloads.
  • Storage computing can be disaggregated from storage media to enable greater simplicity than shared-nothing and hyper-converged architectures.
  • Data protection codes can reduce overhead to only 2% while enabling levels of resiliency 10 orders of magnitude more than classic RAID.
  • Compressed files provide evidence that data can be reduced further when viewed on a global scale.
  • Parallel storage architectures can be built without any amount of code parallelism.
  • Customers can build shared storage architectures that can compose and assign dedicated performance and security isolation to tenants on the fly.
  • One well-engineered, scalable storage system can be ‘universal’ and can enable a diverse array of workloads and requirements.

Architecture

[image courtesy of VAST Data]

  • VAST Servers – A cluster can be built with 2 to 10,000 stateless servers. Servers can be collocated with applications as containers and made to auto-scale with application demand.
  • NVMe Fabric – A scalable, shared-everything cluster can be built by connecting every server and device in the cluster over commodity data center networks (Ethernet or InfiniBand).
  • NVMe Enclosures – Highly-Available NVMe Enclosures manage over one usable PB per RU. Enclosures can be scaled independent of Servers and clusters can be built to manage exabytes.

Rapid Rebuild Encoding

VAST codes accelerate rebuild speed by using a new type of algorithm that gets faster with more redundancy data. Everything is fail-in-place.

  • 150+4: 3x faster than HDD erasure rebuilds, 2.7% overhead
  • 500+10: 2x faster than HDD erasure rebuilds, 2% overhead

Additional redundancy enables MTBF of over 100,000 years at scale.
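The quoted overhead figures fall out of simple arithmetic if “overhead” is taken to mean parity drives as a fraction of data drives in the stripe (measuring against the total stripe width instead gives roughly 2.6% and 2.0%):

```python
# Parity overhead for the two quoted stripe geometries.
for data, parity in [(150, 4), (500, 10)]:
    print(f"{data}+{parity}: {parity / data:.1%} overhead")  # 2.7%, 2.0%
```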

Read more about that here.

Global Data Reduction

  • Data is fingerprinted in large blocks after the write is persisted in SCM;
  • Fingerprints are compared to measure relative distance, and similar chunks are clustered; and
  • Clustered data is compressed together, with byte-level deltas extracted and stored (a conceptual sketch follows this list).
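Here’s a toy sketch of that pipeline, using shingle-based fingerprints and Jaccard similarity to cluster chunks, then compressing each cluster together. It’s purely conceptual – VAST’s actual fingerprints, distance measure, and delta encoding are their own.

```python
# Toy similarity-reduction pipeline - conceptual only, not VAST's algorithm.
import zlib


def fingerprint(chunk: bytes, k: int = 8) -> set:
    """Shingle fingerprint: the set of k-byte substrings in the chunk."""
    return {chunk[i:i + k] for i in range(max(len(chunk) - k + 1, 0))}


def similarity(a: set, b: set) -> float:
    """Jaccard similarity between two fingerprints (0.0 to 1.0)."""
    return len(a & b) / len(a | b) if (a or b) else 1.0


def cluster(chunks: list, threshold: float = 0.5) -> list:
    """Greedy clustering: a chunk joins the first cluster whose reference
    fingerprint is similar enough, otherwise it starts a new cluster."""
    clusters = []  # list of (reference_fingerprint, [chunk, ...])
    for chunk in chunks:
        fp = fingerprint(chunk)
        for ref_fp, members in clusters:
            if similarity(fp, ref_fp) >= threshold:
                members.append(chunk)
                break
        else:
            clusters.append((fp, [chunk]))
    return clusters


if __name__ == "__main__":
    base = b"the quick brown fox jumps over the lazy dog " * 20
    chunks = [base, base.replace(b"dog", b"cat"), b"unrelated payload " * 50]
    for i, (_, members) in enumerate(cluster(chunks)):
        blob = b"".join(members)
        print(f"cluster {i}: {len(members)} chunk(s), "
              f"{len(blob)} bytes raw -> {len(zlib.compress(blob))} compressed")
```

Compressing similar chunks together is what lets reduction work “globally” rather than only within a single file or volume.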

Read more about that here.

Deployment Options

  • Full Appliance – VAST-provided turn-key appliance
  • Software-Defined – enclosures and container software
  • Software-only – run VAST SW on certified QLC hardware

 

Specifications

The storage is the VAST DF-5615 Active / Active NVMe Enclosure.

[image courtesy of VAST Data]

 

  • I/O Modules: 2 x Active/Active I/O Modules
  • I/O Connectivity: 4 x 100Gb Ethernet or 4 x 100Gb InfiniBand
  • Management (optional): 4 x 1GbE
  • NVMe Flash Storage: 44 x 15.36TB QLC Flash
  • NVMe Persistent Memory: 12 x 1.5TB U.2 Devices
  • Dimensions (without cable mgmt.): 2U Rackmount – H: 3.2”, W: 17.6”, D: 37.4”
  • Weight: 85 lbs.
  • Power Supplies: 4 x 1500W
  • Power Consumption: 1200W Avg / 1450W Max
  • Maximum Scale: Up to 1,000 Enclosures

 

Compute is housed in the VAST Quad Server Chassis.

[image courtesy of VAST Data]

 

  • Servers: 4 x Stateless VAST Servers
  • I/O Connectivity: 8 x 50Gb Ethernet or 4 x 100Gb InfiniBand
  • Management (optional): 4 x 1GbE
  • Physical CPU Cores: 80 x 2.4 GHz
  • Memory: 32 x 32GB 2400 MHz RDIMM
  • Dimensions: 2U Rackmount – H: 3.42”, W: 17.24”, D: 28.86”
  • Weight: 78 lbs.
  • Power Supplies: 2 x 1600W
  • Power Consumption: 750W Avg / 900W Max
  • Maximum Scale: Up to 10,000 VAST Servers

 

Thoughts And Other Reading

One of my favourite things about the VAST Data story is the fact that they’re all in on a greenfield approach to storage architecture. Their ace in the hole is that they’re leveraging Persistent Memory, QLC and NVMe-oF to make it all work. Coupled with the disaggregated shared everything architecture, this seems to me like a fresh approach to storage. There are also some flexible options available for deployment. I haven’t seen what the commercials look like for this solution, so I can’t put my hand on my heart and tell you that this will be cheaper than a mechanical drive based solution. That said, the folks working at VAST have some good experience with doing smart things with Flash, and if anyone can make this work, they can. I look forward to reading more about VAST Data, particularly when they get some more customers that can publicly talk about what they’re doing. It also helps that my friend Howard has joined the company. In my opinion that says a lot about what they have to offer.

VAST Data have published a reasonably comprehensive overview of their solution that can be found here. There’s also a good overview of VAST Data by Chris Mellor that you can read here. You can also read more from Chris here, and here. Glenn K. Lockwood provides one of the best overviews of VAST Data, which you can read here.

Western Digital Are Keeping Composed

Disclaimer: I recently attended Storage Field Day 18.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Western Digital recently presented at Storage Field Day 18. You can see videos of their presentation here, and download my rough notes from here.

 

Getting Composed

Scott Hamilton (Senior Director, Product Management) spoke to the delegates about Western Digital’s vision for composable infrastructure. I’m the first to admit that I haven’t really paid enough attention to composability in the recent past, although I do know that it messes with my computer’s spell check mechanism – so it must be new and disruptive.

There’s Work To Be Done

Hamilton spoke a little about the increasingly dynamic workloads in the DC, with a recent study showing that:

  • 45% of compute hours and storage capacity are utilised
  • 70% report inefficiencies in the time required to provision compute and storage resources

There are clearly greater demands on:

  • Scalability
  • Efficiency
  • Agility
  • Performance

Path to Composability

I remember a few years ago when I was presenting to customers about hyper-converged solutions. I’d talk about the path to HCI, with build it yourself being the first step, followed by converged, and then hyper-converged. The path to Composable is similar, with converged, and hyper-converged being the precursor architectures in the modern DC.

Converged

  • Preconfigured hardware / software for a specific application and workload (think EMC Vblock or NetApp FlexPod)

Hyper-Converged

  • Software-defined with deeper levels of abstraction and automation (think Nutanix or EMC’s VxRail)

Composable

  • Disaggregated compute and storage resources
  • Shared pool of resources that can be composed and made available on demand

[image courtesy of Western Digital]

The idea is that you have a bunch of disaggregated resources that can be used as a pool for various applications or hosts (a toy sketch follows the list below). In this architecture, there are:

  • No physical systems – only composed systems;
  • No established hierarchy – CPU doesn’t own the GPU or the memory; and
  • All elements are peers on the network and they communicate with each other.
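To make the “composed, not physical, systems” idea concrete, here’s a toy resource pool in which systems are just temporary groupings of peer elements that return to the pool when released. It’s purely illustrative and is not Western Digital’s actual API.

```python
# A toy model of composability - illustrative only, not the OpenFlex API.
from collections import defaultdict


class ResourcePool:
    def __init__(self, resources):
        # resources: list of (kind, ident), e.g. ("gpu", "gpu-03")
        self.free = defaultdict(list)
        for kind, ident in resources:
            self.free[kind].append(ident)

    def compose(self, **wanted):
        """Compose a logical system on demand, e.g. compose(cpu=2, gpu=1, nvme=4)."""
        if any(len(self.free[kind]) < count for kind, count in wanted.items()):
            raise RuntimeError("not enough free resources in the pool")
        return {kind: [self.free[kind].pop() for _ in range(count)]
                for kind, count in wanted.items()}

    def release(self, system):
        """Decompose: the elements simply go back to the shared pool."""
        for kind, idents in system.items():
            self.free[kind].extend(idents)


pool = ResourcePool([("cpu", f"cpu-{i}") for i in range(4)] +
                    [("gpu", f"gpu-{i}") for i in range(2)] +
                    [("nvme", f"nvme-{i}") for i in range(8)])
box = pool.compose(cpu=2, gpu=1, nvme=4)
print(box)
pool.release(box)
```

Nothing in the pool “owns” anything else, which is the key difference from a server chassis where the CPU sits at the top of the hierarchy.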

 

Can You See It?

Western Digital outlined their vision for composable infrastructure thusly:

Composable Infrastructure Vision

  • Open – open in both form factor and API for management and orchestration of composable resources
  • Scalable – independent performance and capacity scaling from rack-level to multi-rack
  • Disaggregated – true disaggregation of storage and compute for independent scaling to maximise efficiency and agility, and to reduce TCO
  • Extensible – flash, disk and future composable entities can be independently scaled, managed and shared over the same fabric

Western Digital’s Open Composability API is also designed for DC Composability, with:

  • Logical composability of resources abstracted from the underlying physical hardware; and
  • Discovery, assembly, and composition of self-virtualised resources via peer-to-peer communication.

The idea is that it enables virtual system composition of existing HCI and next-generation SCI environments. It also:

  • Future-proofs the transition from hyper-converged to disaggregated architectures; and
  • Complements existing Redfish / Swordfish usage.

You can read more about OpenFlex here. There’s also an excellent technical brief from Western Digital that you can access here.

 

OpenFlex Composable Infrastructure

We’re talking about infrastructure to support an architecture though. In this instance, Western Digital offer the:

  • OpenFlex F3000 – Fabric device and enclosure; and
  • OpenFlex D3000 – High capacity for big data

 

F3000 and E3000

The F3000 and E3000 (F is for Flash Fabric and E is for Enclosure) have the following specifications:

  • Dual-port, high-performance, low-latency, fabric-attached SSD
  • 3U enclosure with 10 dual-port slots offering up to 614TB
  • Self-virtualised device with up to 256 namespaces for dynamic provisioning
  • Multiple storage tiers over the same wire – Flash and Disk accessed via NVMe-oF

D3000

The D3000 (D is for Disk / Dense) is as follows:

  • Dual-port fabric-attached high-capacity device to balance cost and capacity
  • 1U network addressable device offering up to 168TB
  • Self-virtualised device with up to 256 namespaces for dynamic provisioning
  • Multiple storage tiers over the same wire – Flash and Disk accessed via NVMe-oF

You can get a better look at them here.

 

Thoughts and Further Reading

Western Digital covered an awful lot of ground in their presentation at Storage Field Day 18. I like the story behind a lot of what they’re selling, particularly the storage part of it. I’m still playing wait and see when it comes to the composability story. I’m a massive fan of the concept. It’s my opinion that virtualisation gave us an inkling of what could be done in terms of DC resource consumption, but there’s still an awful lot of resources wasted in modern deployments. Technologies such as containers help a bit with that resource control issue, but I’m not sure the enterprise can effectively leverage them in their current iteration, primarily because the enterprise is very, well, enterprise-y.

Composability, on the other hand, might just be the kind of thing that can free the average enterprise IT shop from the shackles of resource management ineptitude that they’ve traditionally struggled with. Much like the public cloud has helped (and created consumption problems), so too could composable infrastructure. This is assuming that we don’t try and slap older style thinking on top of the infrastructure. I’ve seen environments where operations staff needed to submit change requests to perform vMotions of VMs from one host to another. So, like anything, some super cool technology isn’t going to magically fix your broken processes. But the idea is so cool, and if companies like Western Digital can continue to push the boundaries of what’s possible with the infrastructure, there’s at least a chance that things will improve.

If you’d like to read more about the storage-y part of Western Digital, check out Chin-Fah’s post here, Erik’s post here, and Jon’s post here. There was also some talk about dual actuator drives as well. Matt Leib wrote some thoughts on that. Look for more in this space, as I think it’s starting to really heat up.

IBM Spectrum Protect Plus – More Than Meets The Eye

Disclaimer: I recently attended Storage Field Day 18.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

IBM recently presented at Storage Field Day 18. You can see videos of their presentation here, and download my rough notes from here.

 

We Want A Lot From Data Protection

Data protection isn’t just about periodic protection of applications or files any more. Or, at the very least, we seem to want more than that from our data protection solutions. We want:

  • Application / data recovery – providing data availability;
  • Disaster Recovery – recovering from a minor to major data loss;
  • BCP – reducing the risk to the business, employees, market perception;
  • Application / data reuse – utilise for new routes to market; and
  • Cyber resiliency – recover the business after a compromise or attack.

There’s a lot to cover there. And it could be argued that you’d need five different solutions to meet those requirements successfully. With IBM Spectrum Protect Plus (SPP) though, you’re able to meet a number of those requirements.

 

There’s Much That Can Be Done

IBM are positioning SPP as a tool that can help you extend your protection options beyond the traditional periodic data protection solution. You can use it for:

  • Data management / operational recovery – modernised and expanded use cases with instant data access and instant recovery leveraging snapshots;
  • Backup – traditional backup / recovery using streaming backups; and
  • Archive – long-term data retention / compliance, corporate governance.

 

Key Design Principles

Easy Setup

  • Deploy Anywhere: virtual appliance, cloud, bare metal;
  • Zero touch application agents;
  • Automated deployment for IBM Cloud for VMware; and
  • IBM SPP Blueprints.

The benefits of this include:

  • Easy to get started;
  • Reduced deployment costs; and
  • Hybrid and multi-cloud configurations.

Protect

  • Protect databases and applications hosted on-premises or in cloud;
  • Incremental forever using native hypervisor, database, and OS APIs; and
  • Efficient data reduction using deduplication and compression.

The benefits of this include:

  • Efficiency through reduced storage and network usage;
  • Stringent RPO compliance with a reduced backup window; and
  • Application backup with multi-cloud portability.

Manage

  • Centralised, SLA-driven management (a minimal sketch of what such a policy captures follows below);
  • Simple, secure RBAC-based user self-service; and
  • Lifecycle management of space-efficient point-in-time snapshots.

The benefits of this include:

  • Lower TCO by reducing operational costs;
  • Consistent management / governance of multi-cloud environments; and
  • Secure by design with RBAC.
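“SLA-driven” is doing a lot of work in that list, so here’s a minimal sketch of the kind of thing an SLA policy captures and the check it drives. The fields and names are my own illustration, not IBM’s object model.

```python
# Hypothetical SLA policy - illustrative only, not IBM SPP's object model.
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class SLAPolicy:
    name: str
    rpo: timedelta        # how often a recovery point must exist
    retention: timedelta  # how long recovery points are kept
    target: str           # where copies land, e.g. a snapshot pool or object bucket


def is_compliant(policy: SLAPolicy, age_of_newest_copy: timedelta) -> bool:
    """SLA-driven management ultimately reduces to checks like this one,
    run per protected object rather than per backup job."""
    return age_of_newest_copy <= policy.rpo


gold = SLAPolicy("Gold", rpo=timedelta(hours=4),
                 retention=timedelta(days=30), target="pool-1")
print(is_compliant(gold, age_of_newest_copy=timedelta(hours=6)))  # False - out of SLA
```

The operator assigns workloads to policies and the platform works out the scheduling, which is the “centralised” part of the story.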

Recover, Reuse

  • Instant access / sandbox for DevOps and test environments;
  • Recover applications in cloud or data centre; and
  • Global file search and recovery.

The benefits of this include:

  • Improved RTO via instant access;
  • Eliminate time spent finding the right copy (file search across all snapshots with a globally indexed namespace);
  • Data reuse (versus backup as just an insurance policy); and
  • Improved agility; efficiently capture and use a copy of production data for testing.

 

One Workflow, Multiple Use Cases

There’s a lot you can do with SPP, and the following diagram shows the breadth of the solution.

[image courtesy of IBM]

 

Thoughts and Further Reading

When I first encountered IBM SPP at Storage Field Day 15, I was impressed with their approach to policy-driven protection. It’s my opinion that we’re asking more and more of modern data protection solutions. We don’t just want to use them as insurance for our data and applications any more. We want to extract value from the data. We want to use the data as part of test and development workflows. And we want to manipulate the data we’re protecting in ways that have proven difficult in years gone by. It’s not just about having a secondary copy of an important file sitting somewhere safe. Nor is it just about using that data to refresh an application so we can test it with current business problems. It’s all of those things and more. This adds complexity to the solution, as many people who’ve administered data protection solutions have found out over the years. To this end, IBM have worked hard with SPP to ensure that it’s a relatively simple process to get up and running, and that you can do what you need out of the box with minimal fuss.

If you’re already operating in the IBM ecosystem, a solution like SPP can make a lot of sense, as there are some excellent integration points available with other parts of the IBM portfolio. That said, there’s no reason you can’t benefit from SPP as a standalone offering. All of the normal features you’d expect in a modern data protection platform are present, and there’s good support for enhanced protection use cases, such as analytics.

Enrico had some interesting thoughts on IBM’s data protection lineup here, and Chin-Fah had a bit to say here.

StorPool And The Death of Hardware-Defined Storage

Disclaimer: I recently attended Storage Field Day 18.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

StorPool recently presented at Storage Field Day 18. You can see their videos from Storage Field Day 18 here, and download a PDF copy of my rough notes from here.

 

StorPool?

StorPool delivers block storage software. Fundamentally, it “pools the attached storage (hard disks or SSDs) of standard servers to create a single pool of shared block storage. The StorPool software is installed on each server in the cluster and combines the performance and capacity of all drives attached to the servers into one global namespace”. There’s a useful technical overview that you can read here.

[image courtesy of StorPool]

StorPool position themselves as a software company delivering scale-out, block storage software. They say they’ve been doing this since before SDS / SDN / SDDC & “marketing-defined storage” were popular terms. The idea is that it is always delivered as a working storage solution on the customer’s hardware. There are a few ways that the solution can be used, including:

  1. Fully-managed software + 24/7/365 support, SLAs, etc.;
  2. On HCL-compatible hardware; or
  3. As a pre-integrated solution.

Data Integrity

The kind of data management features you’d expect from modern storage systems are present here as well, including:

  • Thin provisioning / reclaim;
  • Copy on Write snapshots, clones; and
  • Changed block tracking, incremental recovery, and transfer.

There’s also support for multi-site deployments:

  • Connect 2 or more StorPool clusters over the public Internet; and
  • Send snapshots between clusters for backup and DR (a changed-block sketch follows this list).
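The changed block tracking and snapshot shipping above can be illustrated with a minimal sketch, assuming fixed-size blocks held in memory. It’s conceptual only, not StorPool’s on-disk format or wire protocol.

```python
# Minimal changed-block tracking / incremental transfer sketch.
# Conceptual only - not StorPool's on-disk format or wire protocol.

def changed_blocks(prev: dict, curr: dict) -> dict:
    """Return only the blocks (index -> bytes) that differ between snapshots.
    Deleted blocks are ignored for brevity."""
    return {i: data for i, data in curr.items() if prev.get(i) != data}


def apply_increment(base: dict, delta: dict) -> dict:
    """Rebuild the newer snapshot on the remote cluster from base + delta."""
    rebuilt = dict(base)
    rebuilt.update(delta)
    return rebuilt


# Toy example: only block 2 changed, so only block 2 crosses the wire.
snap1 = {0: b"AAAA", 1: b"BBBB", 2: b"CCCC"}
snap2 = {0: b"AAAA", 1: b"BBBB", 2: b"DDDD"}
delta = changed_blocks(snap1, snap2)
assert apply_increment(snap1, delta) == snap2
print(f"shipping {len(delta)} of {len(snap2)} blocks")
```

Only the delta needs to traverse the (potentially slow) inter-site link, which is what makes snapshot shipping over the public Internet practical.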

Developed from Scratch

One of the cool things about StorPool is that the whole thing has been developed from scratch. They use their own on-disk format, protocol, quorum, client, etc. They’ve had systems running in production for 6+ years, as well as:

  • Numerous 1PB+ flash systems;
  • 17 major releases; and
  • Global customers.

Who Uses It?

So who uses StorPool? Their target customers are companies building private and public clouds, including:

  • Service Providers and folk operating public clouds; and
  • Enterprises and various private cloud implementations.

That’s obviously a fairly broad spectrum of potential customers, but I think that speaks somewhat to the potential versatility of software-defined solutions.

 

Thoughts and Further Reading

“Software-defined” storage solutions have become more and more popular in the last few years. Customers seem to be getting more comfortable with using and supporting their own hardware (up to a point), and vendors seem to be more willing to position these kinds of solutions as viable, production-ready platforms. It helps tremendously, in my opinion, that a lot of the heavy lifting previously done with dedicated silicon on traditional storage systems can now be done by a core on an x86 or ARM-based CPU. And there seem to be a lot more cores going around, giving vendors the option to do a lot more with these software-defined systems too.

There are a number of benefits to adopting software-defined solutions, including the ability to move from one hardware supplier to another without the need to dramatically change the operational environment. There’s a good story to be had in terms of updates too, and it’s no secret that people like that they aren’t tied to the vendor’s professional services arm to get installations done in quite the same way they perhaps were with dedicated storage arrays. It’s important to remember, though, that software isn’t magic. If you throw cruddy hardware at a solution like StorPool, it’s not going to somehow exceed the limitations of that hardware. You still need to give it some grunt to get some good performance in return. That said, there are plenty of examples where software-defined solutions can be improved dramatically through code optimisations, without changing hardware at all.

The point of all this is that, whilst I don’t really think hardware-defined storage solutions are going anywhere for the moment, companies like StorPool are certainly delivering compelling solutions in code that mean you don’t need to be constrained by what the big box storage vendors are selling you. StorPool have put some careful consideration into the features they offer with their platform, and have also focused heavily on the possible performance that could be achieved with the solution. There’s a good resilience story there, and it seems to be very service provider-friendly. Of course, everyone’s situation is different, and not everyone will get what they need from something like StorPool. But if you’re in the market for a distributed block storage system, and have a particular hankering to run it on your own, preferred, flavour of hardware, something like StorPool is certainly worthy of further investigation. If you want to dig in a little more, I recommend checking out the resources section on the StorPool website – it’s packed with useful information. And have a look at Ray’s article as well.