Intel Optane And The DAOS Storage Engine

Disclaimer: I recently attended Storage Field Day 20.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Intel recently presented at Storage Field Day 20. You can see videos of the presentation here, and download my rough notes from here.

 

Intel Optane Persistent Memory

If you’re a diskslinger, you’ve very likely heard of Intel Optane. You may have even heard of Intel Optane Persistent Memory. It’s a little different to Optane SSD, and Intel describes it as “memory technology that delivers a unique combination of affordable large capacity and support for data persistence”. It looks a lot like DRAM, but the capacity is greater, and there’s data persistence across power losses. This all sounds pretty cool, but isn’t it just another form factor for fast storage? Sort of, but the application of the engineering behind the product is where I think it starts to get really interesting.

 

Enter DAOS

Distributed Asynchronous Object Storage (DAOS) is described by Intel as “an open source software-defined scale-out object store that provides high bandwidth, low latency, and high I/O operations per second (IOPS) storage containers to HPC applications”. It’s ostensibly a software stack built from the ground up to take advantage of the crazy speeds you can achieve with Optane, and at scale. There’s a handy overview of the architecture available on Intel’s website. Traditional object (and other storage systems) haven’t really been built to take advantage of Optane in quite the same way DAOS has.

[image courtesy of Intel]

There are some cool features built into DAOS, including:

  • Ultra-fine grained, low-latency, and true zero-copy I/O
  • Advanced data placement to account for fault domains
  • Software-managed redundancy supporting both replication and erasure code with online rebuild
  • End-to-end (E2E) data integrity
  • Scalable distributed transactions with guaranteed data consistency and automated recovery
  • Dataset snapshot capability
  • Security framework to manage access control to storage pools
  • Software-defined storage management to provision, configure, modify, and monitor storage pools

Exciting? Sure is. There’s also integration with Lustre. The best thing about this is that you can grab it from Github under the Apache 2.0 license.

 

Thoughts And Further Reading

Object storage is in its relative infancy when compared to some of the storage architectures out there. It was designed to be highly scalable and generally does a good job of cheap and deep storage at “web scale”. It’s my opinion that object storage becomes even more interesting as a storage solution when you put a whole bunch of really fast storage media behind it. I’ve seen some media companies do this with great success, and there are a few of the bigger vendors out there starting to push the All-Flash object story. Even then, though, many of the more popular object storage systems aren’t necessarily optimised for products like Intel Optane PMEM. This is what makes DAOS so interesting – the ability for the storage to fundamentally do what it needs to do at massive scale, and have it go as fast as the media will let it go. You don’t need to worry as much about the storage architecture being optimised for the storage it will sit on, because the folks developing it have access to the team that developed the hardware.

The other thing I really like about this project is that it’s open source. This tells me that Intel are both focused on Optane being successful, and also focused on the industry making the most of the hardware it’s putting out there. It’s a smart move – come up with some super fast media, and then give the market as much help as possible to squeeze the most out of it.

You can grab the admin guide from here, and check out the roadmap here. Intel has plans to release a new version every 6 months, and I’m really looking forward to seeing this thing gain traction. For another perspective on DAOS and Intel Optane, check out David Chapa’s article here.

 

 

StorONE Announces AFA.next

StorONE recently announced the All-Flash Array.next (AFAn). I had the opportunity to speak to George Crump (StorONE Chief Marketing Officer) about the news, and thought I’d share some brief thoughts here.

 

What Is It? 

It’s a box! (Sorry I’ve been re-watching Silicon Valley with my daughter recently).

[image courtesy of StorONE]

More accurately, it’s an Intel Server with Intel Optane and Intel QLC storage, powered by StorONE’s software.

S1:Tier

S1:Tier is StorONE’s tiering solution. It operates within the parameters of a high and low watermark. Once the Optane tier fills up, the data is written out, sequentially, to QLC. The neat thing is that when you need to recall the data on QLC, you don’t necessarily need to move it all back to the Optane tier. Rather, read requests can be served directly from QLC. StorONE call this a multi-tier capability, because you can then move data to cloud storage for long-term retention if required.

[image courtesy of StorONE]

S1:HA

Crump noted that the Optane drives are single ported, leading some customers to look highly available configurations. These are catered for with a variation of S1:HA, where the HA solution is now a synchronous mirror between 2 stacks.

 

Thoughts and Further Reading

I’m not just a fan of StorONE because the company occasionally throws me a few dollarydoos to keep the site running. I’m a fan because the folks over there do an awful lot of storage type stuff on what is essentially commodity hardware, and they’re getting results that are worth writing home about, with a minimum of fuss. The AFAn uses Optane as a storage tier, not just read cache, so you get all of the benefit of Optane write performance (many, many IOPS). It has the resilience and data protection features you see in many midrange and enterprise arrays today (namely vRAID, replication, and snapshots). Finally, it has varying support for all three use cases (block, file, and object), so there’s a good chance your workload will fit on the box.

More and more vendors are coming to market with Optane-based storage solutions. It still seems that only a small number of them are taking full advantage of Optane as a write medium, instead focusing on its benefit as a read tier. As I mentioned before, Crump and the team at StorONE have positioned some pretty decent numbers coming out of the AFAn. I think the best thing is that it’s now available as a configuration item on the StorONE TRUprice site as well, so you can see for yourself how much the solution costs. If you’re after a whole lot of performance in a small box, this might be just the thing. You can read more about the solution and check out the lab report here. My friend Max wrote a great article on the solution that you can read here.

Random Short Take #7

Here are a few links to some random things that I think might be useful, to someone. Maybe.