Storage Field Day 12 – Wrap-up and Link-o-rama

Disclaimer: I recently attended Storage Field Day 12.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

This is a quick post to say thanks once again to Stephen, Richard and Kat and the presenters at Storage Field Day 12. I had a super fun and educational time. For easy reference, here’s a list of the posts I did covering the event (they may not match the order of the presentations).

Storage Field Day – I’ll Be At Storage Field Day 12

Storage Field Day 12 – Day 0

Storage Field Day 12 – (Fairly) Full Disclosure

Excelero are doing what? For how much?

Ryussi – Or Why Another SMB Stack Is Handy

Nimble Storage Gets Cloudy

NetApp Aren’t Just a Pretty FAS

Intel Are Putting Technology To Good Use

There’s A Whole Lot More To StarWind Than Free Stuff

Elastifile Are Doing It For All The Right Reasons

Datera – Hybrid Is The New Black

SNIA Know What Time It Is

 

Also, here’s a number of links to posts by my fellow delegates (in no particular order). They’re all very smart people, and you should check out their stuff, particularly if you haven’t before. I’ll attempt to keep this updated as more posts are published. But if it gets stale, the Storage Field Day 12 landing page has updated links.

 

Ray Lucchesi (@RayLucchesi)

4.5M IO/sec@227µsec 4KB Read on 100GBE with 24 NVMe cards #SFD12

There’s a new cluster filesystem on the block, Elastifile

 

Jon Klaus (@JonKlaus)

Storage Field Day 12: storage drop bears reunited!

Intel SPDK and NVMe-oF will accelerate NVMe adoption rates

SNIA: Avoiding tail latency by failing IO operations on purpose

Moving to and between clouds made simple with Elastifile Cloud File System

Excelero NVMesh: lightning fast software-defined storage using commodity servers & NVMe drives

 

Arjan Timmerman (@ArjanTim)

The Datera Company overview

 

Adam Bergh (@AJBergh)

Storage Field Day 12!

Storage Field Day 12 Day 1 Recap and Day 2 Preview

Storage Field Day 12 Day 2 Recap

Storage Field Day 12 Day 3 Recap

 

Chan Ekanayake (@S_Chan_Ek)

Storage Field Day (#SFD12) – A quick intro!

Storage Field Day 12 (#SFD12) – Vendor line up

Excelero – The Latest Software Defined Storage Startup

Intel Storage Futures From #SFD12

Impact from Public Cloud on the storage industry – An insight from SNIA at #SFD12

 

Chin-Fah Heoh (@StorageGaga)

Ryussi MoSMB – High performance SMB

The engineering of Elastifile

Can NetApp do a bit better?

 

Dave Henry (@DaveMHenry)

Confirmed: I’ll be a Delegate at Storage Field Day 12

 

Glenn Dekhayser (@GDekhayser)

Intel Storage – Storage Field Day 12

 

Howard Marks (@DeepStorageNet)

Visiting Intel with SFD 12

 

Matthew Leib (@MBLeib)

Open19 Brings a new build paradigm to HyperScale Buildouts

Excelero achieves amazing stats at #SFD12

Netapp – an #SFD12 Update

Nimble’s InfoSight – An #SFD12 Follow-up

 

Finally, thanks again to Stephen and the team at Gestalt IT. It was an educational and enjoyable few days and I really valued the opportunity I was given to attend.

[image courtesy of Tech Field Day]

SNIA Know What Time It Is

Disclaimer: I recently attended Storage Field Day 12.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

 

Here are some notes from SNIA’s presentation at Storage Field Day 12. You can view the video here and download my rough notes here.

 

SNIA Know What Time It Is

SNIA gave a fantastic presentation towards the end of Storage Field Day 12, covering primarily the world of hyperscalers. Storage at hyperscale is pretty wild stuff. The hyper bit of hyperscaler means they’re doing things that your traditional enterprise probably isn’t, and coming across problems that you or I may not. I won’t go into what was covered in the presentation here though. Instead I urge you to check out the video and my notes for more on that.

I’ve thought a lot over the last few weeks about what I saw and heard during SNIA’s presentation, and about what I knew about them from previous interactions at the odd industry event in Australia. And while I’d love to talk about Hyperscalers in this article, I think it’s more important to use this as an opportunity to fly the flag for SNIA, so to speak. What I really want to draw your attention to, my three weary but loyal readers, is the importance of an association like SNIA to the storage industry. It might be self-evident to some of us in the industry, but for your average storage punter SNIA may seem like a bit of a mystery. It doesn’t have to be that way though. There’s a tonne of extremely useful information available on the SNIA website, from the Dictionary, to tutorials, to information on storage industry standards. That’s right, whilst it may appear at times that the storage industry is the high tech wild west, there are a lot of people from a range of vendors and independents working together to ensure standards are coherent, documented and available to review. They also present at various events (not just the storage ones) and have published a whole heap of extremely interesting white papers that I recommend you check out.

Industry associations sometimes get a bad rap, because some people find themselves in charge of them and start using them for personal gain (I’m not referring to SNIA in this instance), or because members sign up to them and don’t see immediate benefits or increased sales. But not all associations have to be a fiasco. I believe SNIA have proven their value to the industry, and I think we should all be making more of an effort to promote what they’re doing and what they’re trying to achieve. And if, for whatever reason, you’re not happy about something that’s happening or something they’re doing, get in touch with them. The only way the industry can get better is to, well, be better. And SNIA seem to be doing their bit. Or at least they’re trying to.

Datera – Hybrid Is The New Black

Disclaimer: I recently attended Storage Field Day 12.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

 

Here are some notes from Datera’s presentation at Storage Field Day 12. You can view the video here and download my rough notes here.

 

Hybrid is the New Black

Datera’s Mark Fleischmann spent some time talking to us about the direction Datera think the industry is heading. They’re seeing the adoption of public cloud operations and architecture as the “new IT blueprint”. Ultimately, a move to a “Unified Hybrid Cloud” seems to be the desired end-state for most enterprises, where we’re able to leverage a bunch of different solutions depending on requirements, etc. In my mind it’s not that dissimilar to the focus on “best of breed” that was popular when I first got into the technology industry. It’s a concept that looks great on a slide, but it’s a lot harder to effect than people realise.

According to Datera, the goal is to deliver self-tuning invisible data infrastructure. This provides:

  • Policy-based automation;
  • High performance;
  • Low latency;
  • Simple management;
  • Scalability; and
  • Agility.

For Datera, the key attribute is the policy-based one. I wrote a little about the focus on intent after I saw them at Storage Field Day 10. I still think this is a key part of Datera’s value proposition, but they’ve branched out a bit more and are now also focused particularly on high performance and low latency. Datera are indeed keen to “give people better than public cloud”, and are working on hybrid cloud data management to provide a fabric across public and private clouds.

 

What do we have now?

So where are we at right now in the enterprise? According to Datera, we have:

  • Expensive silos – composed of legacy IT and open source building blocks – neither of which were designed to operate as-a-Service (aaS); and
  • Data gravity – where data is restricted in purpose-built silos with the focus on captive data services.

 

What do we want?

That doesn’t sound optimal. Datera suggest that we’d prefer:

  • Automation – with cloud-like data simplicity, scalability and agility, application-defined smart automation, “self-driving” infrastructure; and
  • Choice – hybrid data choices of services across clouds, flexibility and options.

Which sounds like something I would prefer. Of course, Datera point out that “[d]ata is the foundation (and the hard part)”. What we really need is a level of simplicity applied to our infrastructure in much the same way as it’s applied to our applications (except Word, that’s not easy to use).

 

What’s a Hybrid?

So what does this hybrid approach really look like? For Datera, there are a few different pieces to the puzzle.

Multi-cloud Data Fabric

Datera want you to be able to leverage on-premises clouds, but with “better than AWS” data services:

  • True scale out with mixed media
  • Multiple tiers of service
  • 100% operations offload

You’re probably also interested in enterprise performance and capabilities, such as:

  • 10x performance, 1/10 latency
  • Data sovereignty, security and SLOs
  • Data services platform and ecosystem

 

Cloud Operations

You’ll want all of this wrapped up in cloud operations too, including cloud simplicity and agility:

  • Architected to operate as a service;
  • Self-tuning, wide price/performance band; and
  • Role-based multi-tenancy.

Multi-cloud Optionality

  • Multi-customer IaaS operations portal; and
  • Predictive data analysis and insights.

 

So Can Datera Hybrid?

They reckon they can, and I tend to agree. They offer a bunch of features that feel like all kinds of hybrid.

Symmetric Scale-out

  • Heterogeneous node configurations in single cluster (AFN + HFA);
  • Deployed on industry standard x86 servers;
  • Grow-as-you-grow (node add, replacement, decommission, reconfiguration);
  • Single-click cluster-wide upgrade; and
  • Online volume expansion, replica reconfiguration.

 

Policy-based Data Placement

  • Multiple service levels – IOPS, latency, bandwidth, IO durability;
  • Policy-based data and target port placement;
  • All-flash, primary flash replica, or hybrid volumes;
  • Application provisioning decoupled from infrastructure management;
  • Template-based application deployment (an illustrative template follows this list); and
  • Automated to scale.
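
It’s easier to picture this with an example. Here’s a minimal sketch of what an application intent template might look like, expressed as a Python structure. The field names are mine for illustration, not Datera’s actual schema:

```python
# Hypothetical application intent template - field names are illustrative
# only, not Datera's actual schema.
app_template = {
    "name": "prod-mysql",
    "storage_policy": {
        "media": "all-flash",        # all-flash, primary flash replica, or hybrid
        "replicas": 3,               # IO durability via replica count
        "qos": {
            "iops_max": 50000,       # per-volume IOPS ceiling
            "bandwidth_max_mb": 500, # bandwidth ceiling in MB/s
            "latency_target_ms": 1,  # desired service level
        },
        "placement": "rack-aware",   # spread replicas across fault domains
    },
}
```

The point of the model is that an application owner asks for outcomes (IOPS, durability, latency) and the system works out where data and target ports should actually live.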

 

Infrastructure Awareness

Native Layer-3 Support

  • DC as the failure domain (target port (IP) can move anywhere);
  • Scale beyond Layer-2 boundaries; and
  • Scale racks without overlay networking.

Fault Domains

  • Automate around network/power failure domains or programmable availability zones (data/replica distribution, rack awareness); and
  • Data services with compute affinity.

Self-adaptive System

  • Real-time target port and storage load rebalancing;
  • Transparent IP address failover;
  • Transparent node failure handling, network link handling; and
  • Dynamic run-time load balancing based on workload / system / infrastructure changes.

Multi-Tenancy

  • Multi-tenancy for storage resources;
  • Micro-segmentation for users/tenants/applications;
  • Noisy neighbour isolation through QoS;
  • IOPS and bandwidth controls (total, read, write); and
  • IP pools, VLAN tagging for network isolation.

API-driven Programmable

  • API-first DevOps provisioning approach (sketched below);
  • RESTful API with self-describing schema;
  • Interactive API browser; and
  • Integration with wide eco-system.
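
To make the API-first idea concrete, here’s a hedged sketch of provisioning a volume over a RESTful storage API. The endpoint path, payload fields and authentication are hypothetical stand-ins, not Datera’s documented API:

```python
# A minimal sketch of API-first provisioning against a RESTful storage API.
# The endpoint path, payload fields, and hostname are hypothetical stand-ins,
# not Datera's documented API.
import requests

BASE = "https://datera-mgmt.example.com/v2"  # hypothetical management endpoint
session = requests.Session()
session.headers.update({"Authorization": "Bearer <token>"})  # placeholder auth

payload = {
    "name": "app-volume-01",
    "size_gb": 500,
    "policy": "gold",  # references a pre-defined service level
}

resp = session.post(f"{BASE}/volumes", json=payload, timeout=30)
resp.raise_for_status()
print(resp.json())  # a self-describing schema means the reply tells you what you got
```

The appeal of this style is that provisioning becomes something your tooling does, not something a human does in a GUI.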

 

What Do I Do With This Information?

Cloud Operations & Analytics

Datera also get that you need good information to make good decisions around infrastructure, applications and data. To this end, they offer some quite useful features in terms of analytics and monitoring.

From a system telemetry perspective, you get continuous system monitoring and a multi-cluster view. You also get insights into network performance and system / application performance. Coupled with capacity planning, trending and system inventory information, there’s a bunch of useful data available. Basic monitoring, in terms of failure handling and alerting, is also covered.

 

Conclusion and Further Reading

It’s not just Datera that are talking about hybrid solutions. A bunch of companies across a range of technologies are talking about it. Not because it’s necessarily the best approach to infrastructure, but rather because it takes a bunch of the nice things we like about (modern) cloud operations and manages to apply them to the legacy enterprise infrastructure stack that a lot of us struggle with on a daily basis.

People like cloud because it’s arguably a better way of working in a lot of cases. People are getting into the idea of renting services versus buying products outright. I don’t fully understand why things have developed this way in recent times, although I do understand there can be very good fiscal reasons for doing so. [I do remember being at an event last year where rent versus buy was discussed in broad terms. I will look into that further].

Datera understand this too, and they also understand that “legacy” infrastructure management can be a real pain for enterprises, and that the best answer, as it stands, is some kind of hybrid approach. Datera’s logo isn’t the only thing that’s changed in recent times, and they’ve come an awfully long way since I first heard from them at Storage Field Day 10. I’m keen to see how their hybrid approach to infrastructure, data and applications develops in the next 6 – 12 months. At this stage, it seems they have a solid plan and are executing it. Arjan felt the same way, and you can read his article here.

Elastifile Are Doing It For All The Right Reasons

Disclaimer: I recently attended Storage Field Day 12.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

 

Here are some notes from Elastifile’s presentation at Storage Field Day 12. You can view the video here and download my rough notes here.

 

What is it?

Elastifile’s crown jewel is the Elastic Cloud File System (ECFS). It’s a scalable data platform, supporting:

  • 1000s of nodes, 100,000s of clients
  • 100s of thousands of file systems (data containers), unlimited files/directories
  • Exabyte-scale capacity, 10s of millions of IOPS (and above)

And offering “advanced embedded data management and analytics”.

It is software only, and is virtualisation and cloud friendly, running on:

  • Physical servers (tested on 20+ platforms)
  • On-premises virtualisation (VMware, KVM, etc)
  • Cloud VMs (Amazon, Google, etc)

It’s also “Flash Native”, supporting all NAND technologies and interfaces, and offering support for object/S3 cold tiers (dynamic tiering/HSM/ILM).

From an architectural perspective, ECFS also offers:

  • Enterprise Level Features, including non-disruptive upgrades (NDUs), n-way redundancy, self healing, snapshots, sync/async DR
  • Storage Interfaces based on NFSv3/v4, SMB2/3, S3, HDFS

It also has cloud-friendly features, including:

  • Multi-tenancy, QoS, Hot add/remove nodes/capacity
  • Snapshot shipping to S3 (“CloudConnect”)
  • ILM/HSM/Dynamic tiering to other sites/clouds, Object/S3
  • Multi-site (geographically distributed access)

You can also leverage a number of different deployment modes, including:

  • Hyperconverged mode (HCI)
  • Dedicated storage mode (DSM) – with 2 tiers or single tier
  • Cloud (“Marketplace”) – as a service / as an application

 

Design Objectives and Journey

So what did Elastifile consider the foundation for a successful architecture when designing the product? They told us it would be “[a] good trade-off between all relevant dimensions, resources and requirements to produce the best solution for the desired target(s)”. Note, however, that it isn’t a linear path. Their goal is to “[b]e the best data platform for the cloud era enterprise”. It’s a lofty goal, to be sure, but when you’re starting from scratch it’s always good to aim high. Elastifile went on to tell us a little bit about what they’ve seen in the marketplace, what requirements these produced, and how those requirements drove product-direction decisions.

 

Elastifile Architectural Base Observations

Elastifile went into what they saw out in the marketplace:

  • Enterprises increasingly use clouds or cloud-like techniques;
  • In a cloud(-like) environment the focus ISN’T infrastructure (storage) but rather services (data);
  • Data services must be implemented by simple, efficient mechanisms for many concurrent I/O data patterns;
  • Everything should be managed by APIs;
  • Data management should be very fine grained; and
  • Data mobility has to be solved.

 

Elastifile Architectural Base Requirements

The following product requirements were then developed based on those observations:

  • Everything must be automatic;
  • Avoid unnecessary restrictions and limitations. Assume as little as you can about the customer’s I/O patterns;
  • Bring Your Own Hardware (BYOH): Avoid unnecessary/uncommon hardware requirements like NVRAM, RDMA networks, etc (optional optimisations are OK);
  • Support realtime & dynamic reconfiguration of the system;
  • Support heterogeneous hardware (CPU, memory, SSD types, sizes, etc); and
  • Provide good consistent predictable performance for the given resources (even under failures and noisy environments).

 

Elastifile Architectural Base Decisions

So how do these requirements transform into architectural directions?

Scale-out (and Scale-up)

  • Cloud, cloud and cloud! (Cloudy is popular for a reason)

Software only

  • Cloud and virtualisation friendly (smart move)
  • Cost effective (potentially, although my experience of software companies is that they all eventually want to be Oracle)

Flash only

  • Provides flexibility and efficient multiple concurrent IO patterns
  • Capacity efficiency achieved by dedupe, compression and tiering

Application level file system

  • Enables unique data level services and the best performance (!) (a big claim, but Elastifile are keen to stand behind it)
  • Superset of block/object interfaces
  • Enables data sharing (and not only storage sharing) for user self-service. (The focus on data, not just storage, is great).

 

Bumps in the Road

Elastifile started in 2013 and launched v1 in Q4 2016, and they had a few bumps along the way.

  • To start with, the uptake in private clouds didn’t happen as expected;
  • OpenStack didn’t gather enough momentum;
  • It seems that private clouds don’t make sense short of the web-scale guys – there’s a lack of economy of scale; and
  • Many enterprises do not attempt to modernise their legacy systems, but rather attempt to shift (some/all) workloads to the public cloud.

The impact on product development? They had to support public cloud use cases earlier than expected. According to Elastifile, this turned out to be a good thing in the end, and I agree.

 

The hyperconverged infrastructure (HCI) model has proven problematic for many use cases. (And I know this won’t necessarily resonate with a lot of people I know, but horses for courses and all that). Why not?

  • It’s not perceived well by many storage administrators due to problematic responsibility boundaries (“where is my storage?”);
  • It requires coordination with the application/server infrastructure; and
  • It limits the ability to scale resources (e.g. scale capacity, not performance).

HCI is nonetheless a good fit for:

  • Places/use cases without (appropriate/skilled) IT resources (e.g. ROBO, SMB); and
  • Vertical implementations (this means web scale companies in most places).

The impact on Elastifile’s offering? They added a dedicated storage mode to provide separate storage resources and scaling to get around these issues.

 

Conclusion

One of the things I really like about Elastifile is that the focus isn’t on the infrastructure (storage) but rather services (data). I’ve had a million conversations lately with people across a bunch of different business types about the importance of understanding their applications (and the data supporting those applications), and why that should be more important to them than the badge on the front of the platform. That said, plenty of companies are running applications and don’t really understand the value of those applications in terms of business revenue or return. There’s no point putting in a massively scalable storage solution if you’re doing it to support applications that aren’t helping the business do its business. It seems like a reasonable thing to be able to articulate, but as anyone who’s worked in large enterprise knows, it’s often not well understood at all.

Personally, I love it when vendors go into the why of their key product architectures – it’s an invaluable insight into how some of these things get to minimum viable product. Of course, if you’re talking to the wrong people, then your product may be popular with a very small number of shops that aren’t spending a lot of money. Not only should you understand the market you’re trying to develop for, you need to make sure you’re actually talking to people representative of that market, and not some confused neckbeards sitting in the corner who have no idea what they’re doing. Elastifile have provided some nice insight into why they’ve done what they’ve done, and are focusing on areas that I think are really important in terms of delivering scalable and resilient platforms for data storage. I’m looking forward to seeing what they get up to in the future. If you’d like to know more, and don’t mind giving up some details, check out the Elastifile resources page for a nice range of white papers, blog posts and videos.

There’s A Whole Lot More To StarWind Than Free Stuff

Disclaimer: I recently attended Storage Field Day 12.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

 

Here are some notes from StarWind’s presentation at Storage Field Day 12. You can view the video here and download my rough notes here.

 

StarWind Storage Appliance

StarWind’s Storage Appliance is looking to solve a number of problems with traditional storage solutions, including:

  • Unpredictable data and performance requirements growth;
  • Legacy storage incompatibility with existing compute / HCI; and
  • Untenable licensing costs when scaling out traditional HCI.

StarWind tell us that their solution offers the following benefits:

  • Easily scalable, fast, and fault tolerant storage;
  • Seamless integration into any existing CI/HCI; and
  • Ability to scale storage independently from compute.

This all sounds great, but what’s the “sweet spot” for deploying the appliances? StarWind position the solution where “high performance is needed for particular workload at a reasonable cost”. This seems to fall within the 20 – 40TB range. They also offer hybrid and all-flash models (you can check out the datasheet here).

 

What’s in the box?

So what do you get with these things? Nothing too surprising.

  • NL-SAS / SAS drives
  • SATA SSDs
  • RAID Adapter
  • Ethernet adapter (with support for RoCE)
  • DIMM RAM Cards

 

“The magic happens in the software”

For StarWind, the real value is derived from the software platform driving the appliance.

Supported storage protocols include:

  • iSCSI
  • iSER
  • SMB3
  • SMB Direct
  • NFS v4.1
  • NVMe-oF (coming soon)

From a management perspective, you get access to the following tools:

  • Web GUI
  • vCenter Plugin
  • Thick client
  • CLI (PowerShell)
  • VASA / vVols
  • SMI-S

While there’s support for Ethernet and InfiniBand, there is still (disappointingly) no FCoTR support.

 

Stairway to Cloud

StarWind also walked through their partner AcloudA’s product – a hardware cloud storage gateway recognised by the server as an ordinary hard drive. The key components include:

  1. SATA interface for host connectivity
  2. Proprietary RTOS (no Linux!). Motorola PowerPC and ARM
  3. Gigabit Ethernet: iSCSI and SMB3 for cloud uplink
  4. Altera FPGA to accelerate (what software can’t do)

The idea is you can replace spinning disks with cloud storage transparently to any software-defined storage and hypervisor. You can read about the StarWind Storage Appliance and AcloudA Use Case here (registration required).

 

Further Reading and Conclusion

I’m the first to admit that I was fairly ignorant of StarWind’s offering beyond some brief exposure to their free tools during a migration project I did a few years ago. Their approach to hardware seems solid, and they’ve always been a bit different in that they’ve traditionally used Windows as their core platform. I get the impression there’s a move away from this as scalability and throughput requirements increase at a rapid pace.

The HCI market is crowded to say the least, but this doesn’t mean companies like StarWind can’t deliver a reasonable product to customers. They say the sweet spot for this is 20 – 40TB, and there are plenty of smaller shops out there who’d be happy to look at this as an alternative to the bigger storage plays. To their credit, StarWind has focused on broad protocol support and useful management features. I think the genesis of the product in a software platform has certainly given them some experience in delivering features rather than relying on silicon to do the heavy lifting.

I’m looking forward to seeing how this plays out for StarWind, as I’m certainly keen to see them succeed (if for no other reason than Max and Anton are really nice guys). It remains to be seen whether the market is willing to take a bet on a relative newcomer to the HCI game, but StarWind appear to have the appetite to make the competition interesting, at least in the short term. And if you’ve gotten nothing else from this post, have a look at some of the white papers on the site, as they make for some great reading (registration required).

(As an aside, you haven’t lived until you’ve read Stephen’s articles on the I/O Blender here, here and here).

Intel Are Putting Technology To Good Use

Disclaimer: I recently attended Storage Field Day 12.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

 

Here are some notes from Intel’s presentation at Storage Field Day 12. You can view the video here and download my rough notes here.

 

I/O Can Be Hard Work

With the advent of NVM Express, things go pretty fast nowadays. Or, at least, faster than they used to with those old-timey spinning disks we’ve loved for so long. According to Intel, systems with multiple NVMe SSDs are now capable of performing millions of I/Os per second. Which is great, but it results in many cores of software overhead with a kernel-based, interrupt-driven driver model. The answer, according to Intel, is the Storage Performance Development Kit (SPDK). SPDK frees up CPU cycles for storage services while lowering I/O latency. The great news is that there’s now almost no premium on capacity to do IOPS with a system. So how does this help in the real world?
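
Before getting to that, the interrupt-versus-polling distinction is easier to see in code. Here’s a toy Python sketch of the two structures. This isn’t SPDK (which is a C framework driving real NVMe hardware); it just illustrates why dedicating a core to a polled-mode loop avoids the per-I/O wakeup overhead that eats CPU cycles at millions of IOPS:

```python
# Toy illustration of interrupt-driven vs polled-mode completion handling.
# This is not SPDK - it only shows the structural difference SPDK exploits.
import queue

completions = queue.Queue()

def handle(io):
    pass  # process a completed I/O

def interrupt_style_consumer():
    # Block until woken for each completion. Every wakeup costs a
    # context switch, which dominates at millions of I/Os per second.
    while True:
        io = completions.get()
        if io is None:
            break
        handle(io)

def polled_mode_consumer():
    # Dedicate the core to draining the completion queue in a tight
    # loop - no sleeps, no wakeups, no per-I/O context switches.
    while True:
        try:
            io = completions.get_nowait()
        except queue.Empty:
            continue
        if io is None:
            break
        handle(io)
```

SPDK applies this idea (along with user-space drivers) to real NVMe devices, which is where the reclaimed CPU cycles come from.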

 

Real World Applications?

SPDK VM I/O Efficiency

The SPDK offers some excellent performance improvements when dishing up storage to VMs.

  • NVMe ephemeral storage
  • SPDK-based 3rd party storage services

Leverage existing infrastructure for:

  • QEMU vhost-scsi;
  • QEMU/DPDK vhost-net user.

Features and benefits

  • High performance storage virtualisation
  • Reduced VM exits
  • Lower latency
  • Increased VM density
  • Reduced tail latencies
  • Higher throughput

Intel say that Ali Cloud sees a ~300% improvement in IOPS and latency using SPDK.

 

VM Ephemeral Storage

  • Improves storage virtualisation
  • Works with KVM/QEMU
  • 6x efficiency vs kernel host
  • 10x efficiency vs QEMU virtio
  • Increased VM density

 

SPDK and NVMe over Fabrics

SPDK also works a treat with NVMe over Fabrics.

VM Remote Storage

  • Enables disaggregation and migration of VMs using remote storage
  • Improves storage virtualisation and flexibility
  • Works with KVM/QEMU

 

NVMe over Fabrics

  • Utilises the NVM Express (NVMe) polled mode driver – reduced overhead per NVMe I/O;
  • RDMA queue pair polling – no interrupt overhead; and
  • Connections pinned to CPU cores – no synchronisation overhead.

 

NVMe-oF Key Takeaways

  • Preserves the latency-optimised NVMe protocol through network hops
  • Potentially radically efficient, depending on implementation
  • Actually fabric agnostic: InfiniBand, RDMA, TCP/IP, FC … all OK!
  • Underlying protocol for existing and emerging technologies
  • Using SPDK, can integrate NVMe and NVMe-oF directly into applications

 

VM I/O Efficiency Key Takeaways

  • Huge improvement in latency for VM workloads
  • Applications see 3-4X performance gains
  • Application unmodified: it’s all under the covers
  • Virtuous cycle with VM density
  • Fully compatible with NVMe-oF!

 

Further Reading and Conclusion

Intel said during the presentation that “[p]eople find ways of consuming resources you provide to them”. This is true, and one of the reasons I became interested in storage early in my career. What’s been most interesting about the last few years’ worth of storage developments (as we’ve moved beyond spinning disks and simple file systems to super fast flash subsystems and massively scaled-out object storage systems) is that people are still really only interested in having lots of storage that is fast and reliable. The technologies talked about during this presentation obviously aren’t showing up in consumer products just yet, but it’s an interesting insight into the direction the market is heading. I’m mighty excited about NVMe over Fabrics and looking forward to this technology being widely adopted in the data centre.

If you’ve had the opportunity to watch the video from Storage Field Day 12 (and some other appearances by Intel Storage at Tech Field Day events), you’ll quickly understand that I’ve barely skimmed the surface of what Intel are doing in the storage space, and just how much is going on before your precious bits are hitting the file system / object store / block device. NVMe is the new way of doing things fast, and I think Intel are certainly pioneering the advancement of this technology through real-world applications. This is, after all, the key piece of the puzzle – understanding how to take blazingly fast technology and apply a useful programmatic framework that companies can build upon to deliver useful outcomes.

For another perspective, have a look at Chan’s article here. You also won’t go wrong checking out Glenn’s post here.

NetApp Aren’t Just a Pretty FAS

Disclaimer: I recently attended Storage Field Day 12.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

 

Here are some notes from NetApp’s presentation at Storage Field Day 12. You can view the video here and download my rough notes here. I made a joke during the presentation about Dave Hitz being lucky enough to sit next to me, but he’s the smart guy in this equation.

While I’ve not had an awful lot to do with NetApp previously, it’s not often I get to meet guys like Dave in real life. As such I found the NetApp presentation to be a tremendous experience. But enough about stars in my eyes. Arthur Lent spent some time covering off two technologies that I found intriguing: SnapCenter and Cloud Control for Microsoft Office 365.

[image courtesy of Tech Field Day]

 

SnapCenter Overview

SnapCenter is a key part of NetApp’s data protection strategy. You can read about this here. Here’s an overview of what was delivered with version 1.0.

End-to-end Data Protection

  • Simple, scalable, single interfaces to protect enterprise data (physical and virtualised) across the data fabric;
  • Meets SLAs easily by leveraging NTAP technologies;
  • Replaces traditional tape infrastructure with backup to the cloud; and
  • Extensible using user-created custom plug-ins.

 

Efficient In-place Copy Data Management

  • Leverages your existing NTAP storage infrastructure;
  • Provides visibility of copies across the data fabric; and
  • Enables reuse of copies for test/dev, DR, and analytics.

 

Accelerated application development

  • Transforms traditional IT to be more agile
  • Empowers application and database admins to self-serve
  • Enables DevOps and data lifecycle management for faster time to market

Sounds pretty good? There’s more though …

 

New with SnapCenter Version 2.0

  • End-to-end data protection for NAS file services from flash to disk to cloud (public or private);
  • Flexible, cost-effective tape replacement solution;
  • Integrated file catalog for simplified file search and recovery across the hybrid cloud; and
  • Automated protection relationship management and pre-canned backup policies reduce management overhead.

SnapCenter Version 2.0 also adds support for custom plug-ins, enabling you to build and use your own integrations. There are two community plug-ins available at release. Why use plug-ins?

  • Some mission-critical applications or DBs are difficult to back up;
  • Custom plug-ins offer a way to consistently back up almost anything;
  • Write the plug-in once and distribute it to multiple hosts through SnapCenter;
  • Get all the SnapCenter benefits; and
  • A plug-in only has the capabilities written into it (a hypothetical skeleton follows this list).
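
To give a feel for what such a plug-in involves, here’s a hypothetical Python skeleton of the quiesce / snapshot / unquiesce pattern that application-consistent backups rely on. The class, method names and the myappctl CLI are all invented for illustration; NetApp documents the actual plug-in framework and its required interfaces:

```python
# Hypothetical skeleton of a snapshot-consistency plug-in. The class,
# method names, and the "myappctl" CLI are illustrative inventions,
# not NetApp's actual plug-in interface.
import subprocess

class MyAppBackupPlugin:
    """Quiesce/unquiesce hooks wrapped around a storage snapshot."""

    def quiesce(self):
        # Flush and pause writes so the snapshot is application-consistent.
        subprocess.run(["myappctl", "flush", "--pause-writes"], check=True)

    def unquiesce(self):
        # Resume normal operation once the snapshot has been taken.
        subprocess.run(["myappctl", "resume-writes"], check=True)

    def cleanup(self, success: bool):
        # Always release locks/state, even if the backup failed midway.
        subprocess.run(["myappctl", "cleanup"], check=False)
```

The orchestration (when to call quiesce, take the snapshot, unquiesce) is SnapCenter’s job; the plug-in only needs to know how to make its particular application hold still.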

 

Cloud Control for Microsoft Office 365

NetApp advised that this product would be “Available Soon”. I don’t know when that is, but you can read more about it here. NetApp says it offers a “[h]ighly scalable, multi-tenant SaaS offering for data protection, security, and compliance”. In short, it:

  • Is a SaaS offering to provide backup for Office 365 data: Exchange Online, SharePoint Online, OneDrive for Business;
  • Is an automated and simplified way to back up customers’ critical data;
  • Provides flexibility – select your deployment model, archiving length, backup window;
  • Delivers search-and-browse features as well as granular recovery capabilities to find and restore lost data; and
  • Provides off-boarding capability to migrate users (mailboxes, files, folders) and site collections to on-premises.

 

Use Cases

  • Retain control of sensitive data as you move users, folders, mailboxes to O365;
  • Enable business continuity with fault-tolerant data protection;
  • Store data securely on NetApp at non-MS locations; and
  • Meet regulatory compliance with cloud-ready services.

 

Conclusion and Further Reading

In my opinion, the improvements in SnapCenter 2.0 demonstrate NetApp’s focus on improving some key elements of the offering, with the ability to use custom plug-ins being an awesome feature. I’m even more excited by Cloud Control for Office 365, simply because I’ve lost count of the number of enterprises that have shoved their email services up there (“low-hanging fruit” for cloud migration) and haven’t even considered how the hell they’re going to protect or retain the data in a useful way (“Doesn’t Microsoft do that for me?”). The number of times people have simply overlooked some of the regulatory requirements on corporate email services is troubling, to say the least. If you’re an existing or potential NetApp customer this kind of product is something you should be investigating post haste.

Of course, I’ve barely begun to skim the surface of NetApp’s Data Fabric offering. As a relative newcomer, I’m looking forward to diving into this further in the near future. If you’re thinking of doing the same, I recommend you check out this white paper on NetApp Data Fabric Architecture Fundamentals for a great overview of what NetApp are doing in this space.

Nimble Storage Gets Cloudy

Disclaimer: I recently attended Storage Field Day 12.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

 

Here are some notes from Nimble Storage’s presentation at Storage Field Day 12. You can view the video here and download my rough notes here.

 

Nimble Cloud Volumes

[image courtesy of Nimble Storage]

Nimble Storage announced the beta of Nimble Cloud Volumes (NCV) in late February. Essentially, it is block storage as a service that works with public cloud compute from AWS and Microsoft Azure.

 

Storage in the Cloud?

So what’s the problem with cloud block storage at the moment? Nimble spoke about the following issues:

  • Durability and features – Nimble suggest that there is a 0.1 – 0.2% annual failure rate with cloud block storage, and a real lack of data services wrapped around the solutions;
  • Cloud lock-in is a problem – data mobility is hard (the “Hotel California Effect”), and data egress costs real money;
  • “Black box penalty” – limited visibility into the solution as the storage is ostensibly a black box service provided by your cloud operator. Nimble are counting on people not being entirely comfortable with giving it up to the cloud gods.

 

So What’s NCV?

The solution’s built on Nimble’s own cloud and their own technology, with deployments existing in very close proximity to AWS and Azure data centres (DCs). The idea is you use NCV for storage and AWS or Azure for your compute. Note that this is currently only operating in American regions, but I imagine that this capability will be expanded based on positive results with the beta. You can read more in the very handy NCV FAQ.

 

Why Bother with NCV?

According to Nimble, this stuff is enterprise-grade, offering:

  • Millions of times more durability;
  • Data protection & copy data management; and
  • Multi-host access.

Which all sounds great. Although I’d probably take pause when people claim “millions” of times more durability with their solution. In any case, like all things Nimble, you get great visibility into both the cloud and data centre, along with the ability to predict, recommend and optimise the environment to suit your key applications, while leveraging Nimble’s tools to better predict and track usage.
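
As a back-of-envelope sanity check on that sort of claim, consider a deliberately naive model: a 0.2% annual failure rate per device, three independent replicas, and data only lost if all three fail. It ignores rebuild windows and correlated failures, so treat it as illustrative only:

```python
# Naive, illustrative durability arithmetic - assumes independent failures
# and ignores rebuild races, so it's an upper bound on the gain.
p_single = 0.002                      # 0.2% annual failure rate per device
p_triple = p_single ** 3              # all three replicas lost in the year
print(f"Triple-replica annual loss probability: {p_triple:.1e}")   # 8.0e-09
print(f"Improvement factor: {p_single / p_triple:,.0f}x")          # 250,000x
```

That lands in hundreds-of-thousands-times-better territory, so “millions” isn’t as outlandish as it first sounds, though the assumptions do all the heavy lifting.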

 

Thoughts and Further Reading

A large part of Nimble’s success, in my opinion, has been their relentless focus on analytics and visibility. They’re betting that people like this (and why wouldn’t they?) and are looking for this kind of capability from their cloud block storage solutions. They’re also betting that everyone’s had some kind of bad experience with block storage in the cloud in the past and will want to take advantage of Nimble’s focus on performance and reliability. It’s a big bet, but sometimes you have to go big. I think the solution ties in nicely with the market’s acceptance of cloud as a viable compute platform, while leveraging Nimble’s strength in monitoring and analytics.

People want to consume their infrastructure as a service. Whether it’s the right thing to do or not. Nimble are simply stepping in and doing their own version of consumptive infrastructure. I’m keen to see how this gets adopted by existing Nimble customers and whether it draws in other customers who may have been on the fence regarding either public cloud adoption or migration from on-premises block storage solutions. You can read more at El Reg, while Dimitris does a much better job of writing about this than I do (as he should, given he works for them). The datasheet for NCV can also be downloaded from here.

 

The Elephant in the Room

For those of you playing along at home, you may have noticed that HPE announced their intent to acquire Nimble just a few days before their Storage Field Day presentation. You can read some interesting articles on the proposed acquisition here and here. For what it’s worth I think it’s going to be fascinating to see how they get brought into the fold, and what that means for the broader HPE portfolio of products. In any case, I wish them well, as everyone I’ve dealt with at Nimble has always been super friendly and very helpful.

 

Ryussi – Or Why Another SMB Stack Is Handy

Disclaimer: I recently attended Storage Field Day 12.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

 

 

Here are some notes from Ryussi’s presentation at Storage Field Day 12. You can view the video here and download my rough notes here.

 

Musambi? Yeah, MoSMB

Ryussi Technologies are based in Bangalore, India. Ryussi’s MoSMB (“SMB with Mojo”) is an SMB3 stack. MoSMB rhymes with musambi. According to Ryussi, it offers a flexible, advanced, enterprise class feature set, including:

  • Lightweight, ANSI-C SMB 2/3 stack on Linux variants, built from the ground up;
  • Highly pluggable architecture with custom interfaces to integrate into diverse storage stacks quickly and efficiently (there’s a sketch of this idea below);
  • Architected to support high performance, high scalability, and continuous availability;
  • Complete ecosystem support including multi-channel, SMB Direct, ODX, RVSS, active / passive and scale-out clustering and witness protocol;
  • Support for SMB clients on Windows, macOS and Linux; and
  • An enterprise class feature set to support common varied SMB use cases such as Microsoft Hyper-V and enterprise file serving.

Note that there is no support for SMB1 (this is a good thing).

Sunu Engineer (CTO) took us through some of the key features and architecture.

It provides:

  • Different types of storage layers;
  • Heterogeneous support under the same SMB server (they’re working on plugging into the Storage Spaces architecture as well);
  • Cluster Infrastructure Service with lightweight objects connecting to cluster service;
  • Ryussi’s own implementation of RPC server;
  • SMI-S providers, OpenStack drivers, etc; and
  • Full unicode support.
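
That “highly pluggable” claim is easiest to appreciate as an interface contract: the SMB front end talks to whatever storage stack sits behind it through a narrow set of operations. Here’s a hedged Python sketch of the idea; MoSMB itself is ANSI-C, and these names are mine, not Ryussi’s:

```python
# Illustrative backend contract for a pluggable file server. MoSMB is
# written in ANSI-C; this Python sketch only conveys the idea of a narrow
# interface letting one SMB front end sit on diverse storage stacks.
from abc import ABC, abstractmethod

class StorageBackend(ABC):
    @abstractmethod
    def open(self, path: str, flags: int) -> int: ...
    @abstractmethod
    def read(self, handle: int, offset: int, length: int) -> bytes: ...
    @abstractmethod
    def write(self, handle: int, offset: int, data: bytes) -> int: ...
    @abstractmethod
    def close(self, handle: int) -> None: ...

class ObjectStoreBackend(StorageBackend):
    """Example plug-in: maps file operations onto an object store,
    which is how an SMB stack can front an S3-style NAS gateway."""
    def open(self, path, flags): return hash(path) & 0xFFFF
    def read(self, handle, offset, length): return b""   # fetch object range
    def write(self, handle, offset, data): return len(data)
    def close(self, handle): pass
```

Swap the backend class and the same SMB front end can serve a local file system, an object store, or something stranger, which is the integration story Ryussi are selling.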

 

What Could I Use It For?

So what could I use this for? A bunch of stuff, really.

  • Hyper-V over SMB
  • Hyper-V based VDI
  • Continuously available SMB file server cluster
  • Application consistent Hyper-V backup solution using RVSS
  • Enterprise file server for SQL Server, SharePoint and Exchange data
  • High speed secure printing
  • HDFS data storage for Hadoop workloads
  • NAS gateway to object storage

 

So Why Another SMB Stack?

The big point of this is that it “can be integrated into other storage stacks quickly and efficiently”. There are about 8000 storage startups in Silicon Valley at the moment, and all of their focus groups are telling them they want massively scalable, cloud-friendly storage systems that run Linux and deliver some advanced version of SMB (3.1.1). MoSMB claim to have developed a product that will get you some of the way towards achieving that goal. So why not just use Samba? Some folks don’t dig the licensing terms when putting open source in their commercial products. But aren’t there other implementations already available? Sure, but the terms of these agreements may not be exciting for some folk.

Of course, some might argue that Ryussi are doing this in the hopes of getting acquired by some OEM somewhere. But I don’t think that’s really the play here. My impression from the presentation is that they’re building something because they want a really useful product, and they’re keen to solve some problems that exist around SMB and performance and scalability. In any case, I think what they’re doing is kind of cool, and certainly worth checking out, if for no other reason than it’s not something you’d be looking at every day. Unless you code storage protocols, in which case you’re already way ahead of me. They also did a nice series on the architecture that you can read here, here and here. My fellow delegate Chin-Fah also did a nice write-up that you can find here.

Excelero are doing what? For how much?

Disclaimer: I recently attended Storage Field Day 12.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Here are some notes from Excelero’s presentation at Storage Field Day 12. You can view the video here and download my rough notes here.

 

So what do they do?

Something called NVMesh Server SAN. And it’s pretty wild from what I saw during the demo. But first, what is NVMesh Server SAN? It’s basically magical software-defined block storage designed for NVMe.

[image courtesy of Excelero]

 

Benefits

So what’s so special about this? Well, it provides:

  • A virtual SAN solution that is optimised for NVMe, and 100% software-based (using the proper hardware, of course);
  • “Unified NVMe” by pooling NVMe across the network at what appear to the host as local speeds and latencies;
  • Really, really low CPU usage on the storage target;
  • Flexibility – it can be deployed as a virtual, distributed non-volatile array, supporting both converged and disaggregated architectures; and
  • Efficiency, with performance scaling linearly at close to 100% efficiency.

According to Excelero, some of the key NVMesh benefits include:

  • maximum utilisation of NVMe flash devices by creating a single pool of high performance block storage (they really flog these devices);
  • no data localisation for scale-out applications;
  • predictable application performance – no noisy neighbours; and
  • making storage as efficient as the optimised hardware platform (such as Open19).

What does this mean for your enterprise applications? You get access to:

  • Higher performance for random IO-intensive enterprise applications;
  • A flexible architecture to support multiple workloads;
  • Lower operating costs through deployment efficiency and easy serviceability; and
  • All your data is “local”, with no application changes. This is a mighty fine trick.

Excelero’s solution also helps with high-performance computing (HPC) environments, offering:

  • massive performance: high IOPS and bandwidth, low latency;
  • unlimited scalability, supports analytics for massive data sets; and
  • lowest cost/IOP.

 

Excelero Software Components

[image courtesy of Excelero]

 

The Centralized Management component:

  • runs as a Node.js application on top of MongoDB;
  • pools drives, provisions volumes and monitors stuff; and
  • transforms drives from raw storage into a pool.

It’s built as a scale-out service to support huge deployments and offers some standard integration, including RESTful API access for seamless provisioning.  There’s also a client block driver with the kernel module presenting logical volumes via the block driver API.

From a performance perspective, it interacts directly with drives via RDDA or NVMf, offering single-hop access to the data, minimising latency overhead and maximising throughput and IOPS. As a result you get consistent access to shared volumes spread across remote drives anywhere in the DC. The solution offers “RAIN” data protection (cross-node / rack) on standard servers, and from a scalability perspective there’s point-to-point communication with management and targets, simple discovery, and no broadcasts.

 

Topology Manager:

  • Performs cluster management to ensure high availability;
  • Manages volume life cycle and failure recovery operations; and
  • Uses the Raft protocol to ensure data consistency, avoiding “split brain” scenarios (a minimal sketch follows).
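
Raft deserves the mention, because it’s what keeps cluster metadata consistent without split brain: a node can only lead if a majority voted for it in the current term. Here’s a minimal sketch of the vote-granting rule – a fragment of generic Raft leader election, not Excelero’s code:

```python
# Minimal sketch of Raft's RequestVote rule - the piece that prevents
# split brain. A candidate needs votes from a majority, and each node
# grants at most one vote per term, so two leaders can't coexist in a term.
class RaftNode:
    def __init__(self):
        self.current_term = 0
        self.voted_for = None          # candidate id voted for in current_term

    def handle_request_vote(self, term, candidate_id, candidate_log_ok):
        if term > self.current_term:   # newer term seen: reset our vote
            self.current_term = term
            self.voted_for = None
        grant = (
            term == self.current_term
            and self.voted_for in (None, candidate_id)
            and candidate_log_ok       # candidate's log at least as up to date
        )
        if grant:
            self.voted_for = candidate_id
        return self.current_term, grant
```

Because any two majorities overlap, at most one candidate can collect enough votes in a given term, which is the property a topology manager leans on during failure recovery.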

 

Key Takeaways

  • Excelero is a “Software-defined block storage solution” – using standard servers with state of the art flash components and leveraging an intuitive management portal;
  • Excelero offers virtual SAN for NVMe – pooling NVMe over the network at local speeds to maximise utilisation. By making all data local you move the compute, not the data;
  • Scale-Out Server SAN – scales performance & capacity linearly, across DCs without limits – enabling just-in-time storage orchestration; and
  • Converged and disaggregated architectures – no noisy neighbours through full logical disaggregation of storage and compute – grow storage or compute independently

 

Feelings and Further Reading

Excelero came out of stealth mode during Storage Field Day 12. Ray was mighty impressed with what he saw, as was I. I was also mightily impressed with the relatively inexpensive nature of the hardware that they used to demonstrate the solution. Every SDS solution has a reasonably strict hardware compatibility list. In this case, it makes a lot of sense, as Excelero’s patented RDDA technology contributes a lot to the performance and success of the solution. It’s also NVMe over Fabrics ready too, so as this gains traction the requirement for RDDA will potentially fade away.

Super-fast storage solutions based on NVMe are a lot like big data and bad reality TV shows. They’re front and centre in a lot of conversations around the water cooler, but a lot of people aren’t exactly sure what they are or what they should make of them. While Dell recently put a bullet through DSSD, it doesn’t mean that the technology or the requirement for this kind of solution doesn’t exist. What it does demonstrate is that these kinds of solutions can be had in the data centre for a reasonably inexpensive investment in hardware coupled with some really smart software. Version 1.1 is still raw in places, and it will be a while before we see widespread adoption of these types of solutions in the enterprise data centre (people like to wrap data services around them). That said, if you have the need for speed right now, it might be a good idea to reach out to the Excelero folks and have a conversation.

You’ll notice my title was a bit misleading, as I don’t have pricing information in this post. Ray did some rough calculations in his article, and you should talk to Excelero to find out more. As an aside, it’s also worth checking out Yuval Bachar’s Storage Field Day presentation on Open19 – it was riveting stuff.