Intel – It’s About Getting The Right Kind Of Fast At The Edge

Disclaimer: I recently attended Storage Field Day 22.  Some expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Intel recently presented at Storage Field Day 22. You can see videos of the presentation here, and download my rough notes from here.

 

The Problem

A lot of countries have used lockdowns as a way to combat the community transmission of COVID-19. Apparently, this has led to an uptick in the consumption of streaming media services. If you’re somewhat familiar with streaming media services, you’ll understand that your favourite episode of Hogan’s Heroes isn’t being delivered from a giant storage device sitting in the bowels of your streaming media provider’s data centre. Instead, it’s invariably being delivered to your device from a content delivery network (CDN) device.

 

Content Delivery What?

CDNs are not a new concept. The idea is that you have a bunch of web servers geographically distributed delivering content to users who are also geographically distributed. Think of it as a way to cache things closer to your end users. There are many reasons why this can be a good idea. Your content will load faster for users if it resides on servers in roughly the same area as them. Your bandwidth costs are generally a bit cheaper, as you’re not transmitting as much data from your core all the way out to the end user. Instead, those end users are getting the content from something close to them. You can potentially also deliver more versions of content (in terms of resolution) easily. It can also be beneficial in terms of resiliency and availability – an outage on one part of your network, say in Palo Alto, doesn’t need to necessarily impact end users living in Sydney. Cloudflare does a fair bit with CDNs, and there’s a great overview of the technology here.
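To make the "cache things closer to your end users" idea a bit more concrete, here is a minimal sketch of an edge node that serves content locally when it can and only goes back to the origin on a miss. The class name, the `origin` function, and the least-recently-used eviction policy are all my own assumptions for illustration; real CDNs use far more sophisticated logic.

```python
from collections import OrderedDict


class EdgeCache:
    """Tiny LRU cache standing in for a CDN edge node (illustrative only)."""

    def __init__(self, capacity, fetch_from_origin):
        self.capacity = capacity
        self.fetch_from_origin = fetch_from_origin  # hypothetical origin lookup
        self.store = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.store:
            # Cache hit: serve locally and mark the object as recently used.
            self.store.move_to_end(key)
            self.hits += 1
            return self.store[key]
        # Cache miss: go back to the origin, then keep a local copy.
        self.misses += 1
        value = self.fetch_from_origin(key)
        self.store[key] = value
        if len(self.store) > self.capacity:
            # Evict the least recently used object.
            self.store.popitem(last=False)
        return value


origin_requests = []


def origin(key):
    # Record every trip back to the core so the saving is visible.
    origin_requests.append(key)
    return f"content-for-{key}"


sydney_edge = EdgeCache(capacity=2, fetch_from_origin=origin)
for title in ["ep1", "ep1", "ep2", "ep1", "ep3", "ep2"]:
    sydney_edge.get(title)
```

Every hit is a request that never had to travel back to the core, which is where the latency and bandwidth savings come from.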

 

Isn’t All Content Delivery The Same?

Not really. As Intel covered in its Storage Field Day presentation, the performance requirements of video on demand and live-linear streaming CDN solutions differ in some important ways.

Live-Linear Edge Cache

Live-linear video streaming is similar to the broadcast model used in television. It’s basically programming content streamed 24/7, rather than stuff that the user has to search for. Several minutes of content are typically cached to accommodate out-of-sync users and pause / rewind activities. You can read a good explanation of live-linear streaming here.
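The "several minutes of content" cache behaves like a rolling window: new segments arrive constantly and the oldest ones are overwritten. Here is a rough sketch of that behaviour. The five-minute window, the six-second segment length, and the `LiveBuffer` class are assumptions of mine, not details from Intel’s presentation.

```python
from collections import deque


class LiveBuffer:
    """Keep only the most recent window of segments for one live channel."""

    def __init__(self, window_seconds, segment_seconds):
        self.segment_seconds = segment_seconds
        # A deque with maxlen drops the oldest segment automatically.
        self.segments = deque(maxlen=window_seconds // segment_seconds)

    def ingest(self, sequence_number, payload):
        self.segments.append((sequence_number, payload))

    def read_from(self, seconds_behind_live):
        """Return segments for a viewer who is this far behind the live edge."""
        behind = seconds_behind_live // self.segment_seconds
        if behind >= len(self.segments):
            raise LookupError("requested position has already been overwritten")
        start = len(self.segments) - 1 - behind
        return list(self.segments)[start:]


buffer = LiveBuffer(window_seconds=300, segment_seconds=6)  # assumed values
for seq in range(100):
    buffer.ingest(seq, b"segment-bytes")

live_viewer = buffer.read_from(0)      # watching at the live edge
rewound_viewer = buffer.read_from(60)  # paused or rewound by a minute
```

The constant overwriting is what makes media endurance matter for this workload.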

[image courtesy of Intel]

In the example above, Intel Optane PMem was used to address the needs of live-linear streaming.

  • Live-linear workloads consume a lot of memory capacity to maintain a short-lived video buffer.
  • Intel Optane PMem is less expensive than DRAM.
  • Intel Optane PMem has extremely high endurance, to handle frequent overwrite.
  • Flexible deployment options – Memory Mode or App-Direct, consuming zero drive slots.

With this solution they were able to achieve better channel and stream density per server than with DRAM-based solutions.
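To get a feel for why memory capacity becomes the constraint here, a back-of-the-envelope calculation helps. Every number below (channel count, bitrate ladder, buffer length) is a made-up assumption for illustration rather than a figure from Intel.

```python
def buffer_bytes(channels, bitrates_mbps, buffer_seconds):
    """Memory needed to hold the rolling buffer for every channel and rendition."""
    total_mbps = channels * sum(bitrates_mbps)
    # Mbps -> bytes per second, then multiply by the buffered duration.
    return total_mbps * 1_000_000 / 8 * buffer_seconds


# All inputs below are assumptions for illustration, not figures from Intel.
channels = 500
renditions_mbps = [1.5, 3, 6, 16]   # a hypothetical bitrate ladder per channel
buffer_seconds = 5 * 60             # "several minutes" taken as five

needed_gb = buffer_bytes(channels, renditions_mbps, buffer_seconds) / 1e9
print(f"{needed_gb:.1f} GB of buffer")
```

With these assumed inputs the buffer alone runs to hundreds of gigabytes per server, which is the sort of capacity where a cheaper-than-DRAM memory tier starts to look attractive.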

Video on Demand (VoD)

VoD providers typically offer a large library of content allowing users to view it at any time (e.g. Netflix and Disney+). VoD servers are a little different to live-linear streaming CDNs. They:

  • Typically require large capacity and drive fanout for performance / failure domains; and
  • Have a read-intensive workload, with typically large IOs.

[image courtesy of Intel]

 

Thoughts and Further Reading

I first encountered the magic of CDNs years ago when working in a data centre that hosted some Akamai infrastructure. Windows Server updates were super zippy, and it actually saved me from having to spend a lot of time standing in the cold aisle. Fast forward about 15 years, and CDNs are being used for all kinds of content delivery on the web. With whatever the heck this is in terms of the new normal, folks are putting more and more strain on those CDNs by streaming high-quality, high-bandwidth TV and movie titles into their homes (except in backwards places like Australia). As a result, content providers are constantly searching for ways to tweak the throughput of these CDNs to serve more and more customers, and deliver more bandwidth to those users.

I’ve barely skimmed the surface of how CDNs help providers deliver content more effectively to end users. What I did find interesting about this presentation was that it reinforced the idea that different workloads require different infrastructure solutions to deliver the right outcomes. It sounds simple when I say it like this, but I guess I’ve thought about streaming video CDNs as being roughly the same all over the place. Clearly they aren’t, and it’s not just a matter of jamming some SSDs in 1RU servers and hoping that your content will be delivered faster to punters. It’s important to understand that Intel Optane PMem and Intel 3D NAND SSDs can give you different results depending on what you’re trying to do, with PMem arguably giving you better value for money (per GB) than DRAM. There are some great papers on this topic available on the Intel website. You can read more here and here.

Random Short Take #55

Welcome to Random Short Take #55. A few players have worn 55 in the NBA. I wore some Mutombo sneakers in high school, and I enjoy watching Duncan Robinson light it up for the Heat. My favourite ever to wear 55 was “White Chocolate” Jason Williams. Let’s get random.

  • This article from my friend Max around Intel Optane and VMware Cloud Foundation provided some excellent insights.
  • Speaking of friends writing about VMware Cloud Foundation, this first part of a 4-part series from Vaughn makes a compelling case for VCF on FlashStack. Sure, he gets paid to say nice things about the company he works for, but there is plenty of info in here that makes a lot of sense if you’re evaluating which hardware platform pairs well with VCF.
  • Speaking of VMware, if you’re a VCD shop using NSX-V, it’s time to move on to NSX-T. This article from VMware has the skinny.
  • You want an open source version of BMC? Fine, you got it. Who would have thought securing BMC would be a thing? (Yes, I know it should be)
  • Stuff happens, hard drives fail. Backblaze recently published its drive stats report for Q1. You can read about that here.
  • Speaking of drives, check out this article from Netflix on its Netflix Drive product. I find it amusing that I get more value from Netflix’s tech blog than I do its streaming service, particularly when one is free.
  • The people in my office laugh nervously when I say I hate being in meetings where people feel the need to whiteboard. It’s not that I think whiteboard sessions can’t be valuable, but oftentimes the information on those whiteboards should be documented somewhere and easy to bring up on a screen. But if you find yourself in a lot of meetings and need to start drawing pictures about new concepts or whatever, this article might be of some use.
  • Speaking of office things not directly related to tech, this article from Preston de Guise on interruptions was typically insightful. I loved the “Got a minute?” reference too.

 

Intel Optane – Challenges and Triumphs

Disclaimer: I recently attended Storage Field Day 21.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Intel recently presented at Storage Field Day 21. You can see videos of the presentation here, and download my rough notes from here.

 

Alive and Kicking

Kristie Mann, Sr. Director Products, Intel Optane Group, kicked off the session by telling us that “Intel Optane is alive and well”. I don’t think anyone thought it was really going away, particularly given the effort that folks inside Intel have put into getting this product to market. But from a consumer perspective, it’s potentially been a little confusing.

[image courtesy of Intel]

In terms of data centre penetration, it’s been a different story, and taking Optane from idea to reality has been quite a journey. It was also noted that the “strong uptake of PMem in HPC was unexpected”, but no doubt welcome.

 

Learnings

Some of the other learnings that were covered as part of the session were as follows.

Software Really Matters

It’s one thing to come out with some super cool hardware that is absolutely amazing, but it’s quite another to get software support for that hardware. Unfortunately, the hardware doesn’t give you much without the software, no matter how well it performs. While this has been something of a challenge for Optane until recent times, there’s definitely been more noise from the big ISVs about enhanced Optane support.

IaaS Adoption

Adoption in IaaS has not been great, mainly due to some uneven performance. This will only improve as the software support matures. But the IaaS market can be tough for a bunch of reasons. IaaS vendors are trying to do things at a certain price point. That doesn’t mean that they’re still making people run VMs on spinning disk (hopefully), but rolling out All-Flash support for platforms is something that’s only going to be done when the $/GB makes sense for the providers. You also might have seen in the field that IaaS providers are pretty particular about performance and quality of service, which makes sense when you’re trying to host a whole bunch of different workloads at large scale. It follows that they’d be somewhat cautious about launching new media types on their platforms without running through a whole bunch of performance and integration testing. I’m not saying they’re not going to get there, they just may not be the first cabs off the rank.

Can you spell OEM?

OEM qualifications have been slow to date with Optane. This is key to getting the product out there. Enterprise folks don’t like to buy things until their favourite Tier 1 vendors are offering it as a default option in their server / storage array / fabric switch. If Dell has the Optane Inside sticker (not a real thing, but you know what I mean), the infrastructure architects inside large government entities are more likely to get on board.

Battling The Status Quo

Status quo thinking makes it hard to understand this isn’t just memory or storage. This has been something of a problem for Intel since Optane became a thing. I’m still having conversations with people and running up against significant confusion about the difference between PMem and Optane SSD. I think that’s going to improve as time goes on, but it can make things difficult when it comes to broad market penetration.

Thoughts and Further Reading

I don’t want people reading this to think that I’m down on Intel and what it’s done with Optane. If anything, I’m really into it. I enjoyed the presentation at Storage Field Day 21 tremendously, and not just because my friend Howard was on the panel repping VAST Data. It’s unusual that a vendor as big as Intel would be so frank about some of the challenges that it’s faced with getting new media to market. But I think it’s the willingness to share some of this information that demonstrates how committed Intel is to Optane moving forward. I was lucky enough to speak to Intel Senior Fellow Al Fazio about the Optane journey, and it was clear that there’s a whole lot of innovation and sweat that goes into making a product like this work.

Some folks think that these panel presentations are marketing disguised as a presentation. Invariably, the reference customers are friendly with the company, and you’ll only ever hear good stories. But I think those stories from those customers are still extremely powerful. After all, having a customer jump on a session to tell the world about how good your product has been means you’ve done something right. As a consumer of these products, I find these kinds of testimonials invaluable. Ultimately, products are successful in the market when they serve the market’s needs. From what I can see, Intel Optane is on its way to meeting those needs, and it has a bright future.

Random Short Take #49

Happy new year and welcome to Random Short Take #49. Not a great many players have worn 49 in the NBA (2 as it happens). It gets better soon, I assure you. Let’s get random.

  • Frederic has written a bunch of useful articles about Rubrik. This one on setting up authentication to use Active Directory came in handy recently. I’ll be digging into some of Rubrik’s multi-tenancy capabilities in the near future, so keep an eye out for that.
  • In more things Rubrik-related, this article by Joshua Stenhouse on fully automating Rubrik EDGE / AIR deployments was great.
  • Speaking of data protection, Chris Colotti wrote this useful article on changing the Cloud Director database IP address. You can check it out here.
  • You want more data protection news? How about this press release from BackupAssist talking about its partnership with Wasabi?
  • Fine, one more data protection article. Six backup and cloud storage tips from Backblaze.
  • Speaking of press releases, WekaIO has enjoyed some serious growth in the last year. Read more about that here.
  • I loved this article from Andrew Dauncey about things that go wrong and learning from mistakes. We’ve all likely got a story about something that went so spectacularly wrong that you only made that mistake once. Or twice at most. It also reminds me of those early days of automated ESX 2.5 builds and building magical installation CDs that would happily zap LUN 0 on FC arrays connected to new hosts. Fun times.
  • Finally, I was lucky enough to talk to Intel Senior Fellow Al Fazio about what’s happening with Optane, how it got to this point, and where it’s heading. You can read the article and check out the video here.

Intel Optane And The DAOS Storage Engine

Disclaimer: I recently attended Storage Field Day 20.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Intel recently presented at Storage Field Day 20. You can see videos of the presentation here, and download my rough notes from here.

 

Intel Optane Persistent Memory

If you’re a diskslinger, you’ve very likely heard of Intel Optane. You may have even heard of Intel Optane Persistent Memory. It’s a little different to Optane SSD, and Intel describes it as “memory technology that delivers a unique combination of affordable large capacity and support for data persistence”. It looks a lot like DRAM, but the capacity is greater, and there’s data persistence across power losses. This all sounds pretty cool, but isn’t it just another form factor for fast storage? Sort of, but the application of the engineering behind the product is where I think it starts to get really interesting.
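One way to picture the “looks like memory, but persists” idea is a memory-mapped file: the application updates bytes in place through a mapping rather than issuing block writes. The sketch below uses Python’s standard `mmap` module against an ordinary temporary file, which is only a stand-in. On a real system the file would live on a PMem-backed, DAX-mounted filesystem, and production code would more likely use something like PMDK; the path and sizes here are assumptions.

```python
import mmap
import os
import tempfile

# Assumption: on a real system this path would sit on a DAX-mounted
# filesystem backed by PMem; here a temporary file stands in for it.
path = os.path.join(tempfile.mkdtemp(), "pmem-standin.bin")
SIZE = 4096

with open(path, "wb") as f:
    f.truncate(SIZE)  # create a zero-filled region of the right size

# Update the data through a mapping rather than write() calls.
with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), SIZE) as region:
        region[0:5] = b"hello"
        region.flush()  # ask the OS to make the update durable

# Simulate a restart: map the file again and read the bytes back.
with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), SIZE) as region:
        recovered = bytes(region[0:5])
```

The point is the programming model rather than the performance: byte-addressable updates that are still there when you come back.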

 

Enter DAOS

Distributed Asynchronous Object Storage (DAOS) is described by Intel as “an open source software-defined scale-out object store that provides high bandwidth, low latency, and high I/O operations per second (IOPS) storage containers to HPC applications”. It’s ostensibly a software stack built from the ground up to take advantage of the crazy speeds you can achieve with Optane, and at scale. There’s a handy overview of the architecture available on Intel’s website. Traditional object (and other) storage systems haven’t really been built to take advantage of Optane in quite the same way DAOS has.

[image courtesy of Intel]

There are some cool features built into DAOS, including:

  • Ultra-fine grained, low-latency, and true zero-copy I/O
  • Advanced data placement to account for fault domains
  • Software-managed redundancy supporting both replication and erasure code with online rebuild
  • End-to-end (E2E) data integrity
  • Scalable distributed transactions with guaranteed data consistency and automated recovery
  • Dataset snapshot capability
  • Security framework to manage access control to storage pools
  • Software-defined storage management to provision, configure, modify, and monitor storage pools

Exciting? Sure is. There’s also integration with Lustre. The best thing about this is that you can grab it from Github under the Apache 2.0 license.
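The feature list above mentions redundancy via both replication and erasure code with rebuild. As a rough illustration of the trade-off between the two, here is a toy single-parity example using XOR, which is about the simplest erasure code there is. This is not how DAOS implements it; the 3+1 layout and three-way replication figures are assumptions picked to keep the arithmetic obvious.

```python
def xor_blocks(blocks):
    """XOR equally sized byte blocks together."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


# Three data blocks plus one parity block (a toy 3+1 layout).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Lose the middle block, then rebuild it from the survivors and parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

# Raw capacity consumed per byte of user data.
replication_overhead = 3 / 1        # three full copies
erasure_overhead = (3 + 1) / 3      # three data blocks plus one parity
```

Replication is simpler and quicker to rebuild from, while erasure coding gets you protection for a lot less raw capacity, which is why it’s handy to have both on offer.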

 

Thoughts And Further Reading

Object storage is in its relative infancy when compared to some of the storage architectures out there. It was designed to be highly scalable and generally does a good job of cheap and deep storage at “web scale”. It’s my opinion that object storage becomes even more interesting as a storage solution when you put a whole bunch of really fast storage media behind it. I’ve seen some media companies do this with great success, and there are a few of the bigger vendors out there starting to push the All-Flash object story. Even then, though, many of the more popular object storage systems aren’t necessarily optimised for products like Intel Optane PMem. This is what makes DAOS so interesting – the ability for the storage to fundamentally do what it needs to do at massive scale, and have it go as fast as the media will let it go. You don’t need to worry as much about the storage architecture being optimised for the storage it will sit on, because the folks developing it have access to the team that developed the hardware.

The other thing I really like about this project is that it’s open source. This tells me that Intel is both focused on Optane being successful, and also focused on the industry making the most of the hardware it’s putting out there. It’s a smart move – come up with some super fast media, and then give the market as much help as possible to squeeze the most out of it.

You can grab the admin guide from here, and check out the roadmap here. Intel has plans to release a new version every 6 months, and I’m really looking forward to seeing this thing gain traction. For another perspective on DAOS and Intel Optane, check out David Chapa’s article here.

 

 

StorONE Announces AFA.next

StorONE recently announced the All-Flash Array.next (AFAn). I had the opportunity to speak to George Crump (StorONE Chief Marketing Officer) about the news, and thought I’d share some brief thoughts here.

 

What Is It? 

It’s a box! (Sorry, I’ve been re-watching Silicon Valley with my daughter recently.)

[image courtesy of StorONE]

More accurately, it’s an Intel Server with Intel Optane and Intel QLC storage, powered by StorONE’s software.

S1:Tier

S1:Tier is StorONE’s tiering solution. It operates within the parameters of a high and low watermark. Once the Optane tier fills up, the data is written out, sequentially, to QLC. The neat thing is that when you need to recall the data on QLC, you don’t necessarily need to move it all back to the Optane tier. Rather, read requests can be served directly from QLC. StorONE calls this a multi-tier capability, because you can then move data to cloud storage for long-term retention if required.
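The watermark behaviour is easier to picture with a toy model. The sketch below is my own simplification and not StorONE’s code: the class name, the 80% and 40% watermark values, and the choice to demote the oldest-written data first are all assumptions.

```python
class TwoTierStore:
    """Toy model of a fast tier that spills to a capacity tier by watermark."""

    def __init__(self, fast_capacity, high_watermark, low_watermark):
        self.fast_capacity = fast_capacity
        self.high = high_watermark   # start demoting at this fill fraction
        self.low = low_watermark     # stop demoting at this fill fraction
        self.fast = {}               # stands in for the Optane tier
        self.capacity = {}           # stands in for the QLC tier

    def write(self, key, value):
        # New writes always land on the fast tier.
        self.capacity.pop(key, None)
        self.fast[key] = value
        if len(self.fast) >= self.high * self.fast_capacity:
            self._demote()

    def _demote(self):
        # Move the oldest entries out in one batch until the low watermark.
        while len(self.fast) > self.low * self.fast_capacity:
            oldest = next(iter(self.fast))
            self.capacity[oldest] = self.fast.pop(oldest)

    def read(self, key):
        # Reads are served from whichever tier holds the data; nothing is
        # promoted back to the fast tier.
        if key in self.fast:
            return self.fast[key], "fast"
        return self.capacity[key], "capacity"


store = TwoTierStore(fast_capacity=10, high_watermark=0.8, low_watermark=0.4)
for n in range(8):
    store.write(f"block-{n}", n)

cold_value, cold_tier = store.read("block-0")
hot_value, hot_tier = store.read("block-7")
```

Draining down to a low watermark in one batch, rather than evicting one item at a time, is what lets the writes to the capacity tier be large and sequential.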

[image courtesy of StorONE]

S1:HA

Crump noted that the Optane drives are single ported, leading some customers to look for highly available configurations. These are catered for with a variation of S1:HA, where the HA solution is now a synchronous mirror between 2 stacks.

 

Thoughts and Further Reading

I’m not just a fan of StorONE because the company occasionally throws me a few dollarydoos to keep the site running. I’m a fan because the folks over there do an awful lot of storage type stuff on what is essentially commodity hardware, and they’re getting results that are worth writing home about, with a minimum of fuss. The AFAn uses Optane as a storage tier, not just read cache, so you get all of the benefit of Optane write performance (many, many IOPS). It has the resilience and data protection features you see in many midrange and enterprise arrays today (namely vRAID, replication, and snapshots). Finally, it has varying support for all three use cases (block, file, and object), so there’s a good chance your workload will fit on the box.

More and more vendors are coming to market with Optane-based storage solutions. It still seems that only a small number of them are taking full advantage of Optane as a write medium, instead focusing on its benefit as a read tier. As I mentioned before, Crump and the team at StorONE have positioned some pretty decent numbers coming out of the AFAn. I think the best thing is that it’s now available as a configuration item on the StorONE TRUprice site as well, so you can see for yourself how much the solution costs. If you’re after a whole lot of performance in a small box, this might be just the thing. You can read more about the solution and check out the lab report here. My friend Max wrote a great article on the solution that you can read here.

Random Short Take #7

Here are a few links to some random things that I think might be useful, to someone. Maybe.