Intel – It’s About Getting The Right Kind Of Fast At The Edge

Disclaimer: I recently attended Storage Field Day 22.  Some expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Intel recently presented at Storage Field Day 22. You can see videos of the presentation here, and download my rough notes from here.

 

The Problem

A lot of countries have used lockdowns as a way to combat the community transmission of COVID-19. Apparently, this has led to an uptick in the consumption of streaming media services. If you’re somewhat familiar with streaming media services, you’ll understand that your favourite episode of Hogan’s Heroes isn’t being delivered from a giant storage device sitting in the bowels of your streaming media provider’s data centre. Instead, it’s invariably being delivered to your device from a content delivery network (CDN) device.

 

Content Delivery What?

CDNs are not a new concept. The idea is that you have a bunch of web servers geographically distributed delivering content to users who are also geographically distributed. Think of it as a way to cache things closer to your end users. There are many reasons why this can be a good idea. Your content will load faster for users if it resides on servers in roughly the same area as them. Your bandwidth costs are generally a bit cheaper, as you’re not transmitting as much data from your core all the way out to the end user. Instead, those end users are getting the content from something close to them. You can potentially also deliver more versions of content (in terms of resolution) easily. It can also be beneficial in terms of resiliency and availability – an outage on one part of your network, say in Palo Alto, doesn’t need to necessarily impact end users living in Sydney. Cloudflare does a fair bit with CDNs, and there’s a great overview of the technology here.
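
To make the caching idea a bit more concrete, here's a rough sketch of the request path an edge node might follow: serve an object from the local cache if it's there and still fresh, otherwise fetch it from the origin and keep a copy for next time. The class and names here are mine, purely for illustration, and not any particular CDN's API.

```python
import time

class EdgeCache:
    """Toy edge cache: keep fetched objects locally with a TTL (illustrative only)."""

    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self.store = {}  # url -> (expiry_time, content)

    def get(self, url, fetch_from_origin):
        entry = self.store.get(url)
        if entry and entry[0] > time.time():
            return entry[1]                        # cache hit: served from the edge
        content = fetch_from_origin(url)           # cache miss: go all the way back to the core
        self.store[url] = (time.time() + self.ttl, content)
        return content

# Usage: the origin only gets hit once per TTL window per object.
cache = EdgeCache(ttl_seconds=300)
chunk = cache.get("/hogans-heroes/s01e01/chunk42.ts",
                  fetch_from_origin=lambda url: b"...bytes from the origin...")
```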

 

Isn’t All Content Delivery The Same?

Not really. As Intel covered in its Storage Field Day presentation, there are some differences between the performance requirements of video on demand and live-linear streaming CDN solutions.

Live-Linear Edge Cache

Live-linear video streaming is similar to the broadcast model used in television. It’s basically programming content streamed 24/7, rather than stuff that the user has to search for. Several minutes of content are typically cached to accommodate out-of-sync users and pause / rewind activities. You can read a good explanation of live-linear streaming here.

[image courtesy of Intel]

In the example above, Intel Optane PMem was used to address the needs of live-linear streaming.

  • Live-linear workloads consume a lot of memory capacity to maintain a short-lived video buffer.
  • Intel Optane PMem is less expensive than DRAM.
  • Intel Optane PMem has extremely high endurance, to handle frequent overwrite.
  • Flexible deployment options – Memory Mode or App-Direct, consuming zero drive slots.

With this solution they were able to achieve better channel and stream density per server than with DRAM-based solutions.
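
To get a feel for why live-linear chews through memory capacity, here's a toy per-channel buffer that only keeps the last few minutes of segments for paused or out-of-sync viewers, plus a back-of-envelope capacity sum. The numbers and names are my own assumptions for illustration, not Intel's implementation; the takeaway is that the buffer footprint scales with channels x window x bitrate, and it's constantly overwritten, which is exactly where a cheaper-than-DRAM, high-endurance tier like Optane PMem is pitched.

```python
from collections import deque

class LiveChannelBuffer:
    """Keep only the most recent few minutes of a live channel (illustrative only)."""

    def __init__(self, window_seconds=300, segment_seconds=4):
        self.max_segments = window_seconds // segment_seconds
        self.segments = deque()              # (sequence_number, payload)

    def push(self, seq, payload):
        self.segments.append((seq, payload))
        while len(self.segments) > self.max_segments:
            self.segments.popleft()          # constant overwrite, hence the endurance requirement

    def get(self, seq):
        # A paused or rewinding viewer can only be served while the segment
        # is still inside the buffer window.
        for s, payload in self.segments:
            if s == seq:
                return payload
        return None                          # fallen out of the live window

# Back-of-envelope: 500 channels, a 5-minute window, ~5Mbit/s per stream.
channels, window_s, mbit_per_s = 500, 300, 5
buffer_gb = channels * window_s * mbit_per_s / 8 / 1000
print(f"~{buffer_gb:.0f} GB of buffer just to hold the live window")   # roughly 94 GB
```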

Video on Demand (VoD)

VoD providers typically offer a large library of content allowing users to view it at any time (e.g. Netflix and Disney+). VoD servers are a little different to live-linear streaming CDNs. They:

  • Typically require large capacity and drive fanout for performance / failure domains; and
  • Have a read-intensive workload, with typically large IOs.
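
On the fanout point, the simplest possible illustration is hashing each object across a set of drives, so large reads are spread for performance and a single failed drive only takes out a predictable slice of the library. This is just a sketch with made-up drive names; real VoD caches do considerably more than a modulo.

```python
import hashlib

DRIVES = [f"nvme{i}" for i in range(8)]      # an assumed 8-drive fanout, purely illustrative

def drive_for(object_key: str) -> str:
    """Spread the content library across drives: reads are parallelised, and one
    failed drive only affects roughly 1/len(DRIVES) of the titles."""
    digest = hashlib.md5(object_key.encode()).hexdigest()
    return DRIVES[int(digest, 16) % len(DRIVES)]

print(drive_for("some-title/s02e01/1080p/chunk-0007"))
```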

[image courtesy of Intel]

 

Thoughts and Further Reading

I first encountered the magic of CDNs years ago when working in a data centre that hosted some Akamai infrastructure. Windows Server updates were super zippy, and it actually saved me from having to spend a lot of time standing in the cold aisle. Fast forward about 15 years, and CDNs are being used for all kinds of content delivery on the web. With whatever the heck this is in terms of the new normal, folks are putting more and more strain on those CDNs by streaming high-quality, high-bandwidth TV and movie titles into their homes (except in backwards places like Australia). As a result, content providers are constantly searching for ways to tweak the throughput of these CDNs to serve more and more customers, and deliver more bandwidth to those users.

I’ve barely skimmed the surface of how CDNs help providers deliver content more effectively to end users. What I did find interesting about this presentation was that it reinforced the idea that different workloads require different infrastructure solutions to deliver the right outcomes. It sounds simple when I say it like this, but I guess I’ve thought about streaming video CDNs as being roughly the same all over the place. Clearly they aren’t, and it’s not just a matter of jamming some SSDs into 1RU servers and hoping that your content will be delivered faster to punters. It’s important to understand that Intel Optane PMem and Intel Optane 3D NAND can give you different results depending on what you’re trying to do, with PMem arguably giving you better value for money (per GB) than DRAM. There are some great papers on this topic available on the Intel website. You can read more here and here.

StorONE Announces AFA.next

StorONE recently announced the All-Flash Array.next (AFAn). I had the opportunity to speak to George Crump (StorONE Chief Marketing Officer) about the news, and thought I’d share some brief thoughts here.

 

What Is It? 

It’s a box! (Sorry I’ve been re-watching Silicon Valley with my daughter recently).

[image courtesy of StorONE]

More accurately, it’s an Intel Server with Intel Optane and Intel QLC storage, powered by StorONE’s software.

S1:Tier

S1:Tier is StorONE’s tiering solution. It operates within the parameters of a high and low watermark. Once the Optane tier fills up, the data is written out, sequentially, to QLC. The neat thing is that when you need to recall the data on QLC, you don’t necessarily need to move it all back to the Optane tier. Rather, read requests can be served directly from QLC. StorONE call this a multi-tier capability, because you can then move data to cloud storage for long-term retention if required.
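
Based purely on that description, here's how I picture a high/low watermark tier behaving: writes land on the fast tier, a destage kicks in once the high watermark is crossed and drains sequentially down to the low watermark, and reads of destaged data come straight from the slow tier rather than being promoted back. This is my own toy interpretation for illustration, not StorONE's actual code.

```python
class WatermarkTier:
    """Toy two-tier store with high/low watermark destaging (illustrative only)."""

    def __init__(self, fast_capacity, high=0.8, low=0.5):
        self.fast, self.slow = {}, {}
        self.capacity, self.high, self.low = fast_capacity, high, low

    def write(self, key, block):
        self.fast[key] = block                       # writes always land on the fast (Optane-like) tier
        if len(self.fast) >= self.capacity * self.high:
            self._destage()

    def _destage(self):
        # Drain the oldest blocks, in order, to the slow (QLC-like) tier
        # until we're back under the low watermark.
        while len(self.fast) > self.capacity * self.low:
            key = next(iter(self.fast))
            self.slow[key] = self.fast.pop(key)

    def read(self, key):
        if key in self.fast:
            return self.fast[key]
        # No promotion on read: destaged data is served directly from the slow tier.
        return self.slow.get(key)
```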

[image courtesy of StorONE]

S1:HA

Crump noted that the Optane drives are single-ported, leading some customers to look for highly available configurations. These are catered for with a variation of S1:HA, where the HA solution is now a synchronous mirror between two stacks.
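
A synchronous mirror in this context just means the host doesn't get its write acknowledged until both stacks have committed it, so losing a single-ported Optane device (or a whole stack) doesn't lose data. A minimal sketch of that acknowledgement rule, with hypothetical interfaces and nothing StorONE-specific:

```python
class SyncMirror:
    """Acknowledge a write only after both stacks have committed it (illustrative only)."""

    def __init__(self, stack_a, stack_b):
        self.stacks = (stack_a, stack_b)

    def write(self, lba, data):
        for stack in self.stacks:
            stack.write(lba, data)    # both writes must complete...
        return "ACK"                  # ...before the host sees an acknowledgement

    def read(self, lba):
        return self.stacks[0].read(lba)   # reads can be served from either side

# Usage with trivial in-memory stand-ins for the two stacks.
class DictStack(dict):
    def write(self, lba, data): self[lba] = data
    def read(self, lba): return self.get(lba)

mirror = SyncMirror(DictStack(), DictStack())
mirror.write(42, b"block")
```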

 

Thoughts and Further Reading

I’m not just a fan of StorONE because the company occasionally throws me a few dollarydoos to keep the site running. I’m a fan because the folks over there do an awful lot of storage type stuff on what is essentially commodity hardware, and they’re getting results that are worth writing home about, with a minimum of fuss. The AFAn uses Optane as a storage tier, not just read cache, so you get all of the benefit of Optane write performance (many, many IOPS). It has the resilience and data protection features you see in many midrange and enterprise arrays today (namely vRAID, replication, and snapshots). Finally, it has varying support for all three use cases (block, file, and object), so there’s a good chance your workload will fit on the box.

More and more vendors are coming to market with Optane-based storage solutions. It still seems that only a small number of them are taking full advantage of Optane as a write medium, instead focusing on its benefit as a read tier. As I mentioned before, Crump and the team at StorONE have shared some pretty decent numbers coming out of the AFAn. I think the best thing is that it’s now available as a configuration item on the StorONE TRUprice site as well, so you can see for yourself how much the solution costs. If you’re after a whole lot of performance in a small box, this might be just the thing. You can read more about the solution and check out the lab report here. My friend Max wrote a great article on the solution that you can read here.

Burlywood Tech Announces TrueFlash Insight

Burlywood Tech came out of stealth a few years ago, and I wrote about their TrueFlash announcement here. I had another opportunity to speak to Mike Tomky recently about Burlywood’s TrueFlash Insight announcement and thought I’d share some thoughts here.

 

The Announcement

Burlywood’s “TrueFlash” product delivers what they describe as a “software-defined SSD” drive. Since they’ve been active in the market, they’ve gained traction in what they call the Tier 2 service provider segments (not necessarily the “Big 7” hyperscalers).

They’ve announced TrueFlash Insight because, in a number of cases, customers don’t know what their workloads really look like. The idea behind TrueFlash Insight is that it can be run in a production environment for a period of time to collect metadata and drive telemetry. Engineers can also be sent on site if required to do the analysis. The data collected with TrueFlash Insight helps Burlywood with the process of designing and tuning the TrueFlash product for the desired workload.

How It Works

  • Insight is available only on Burlywood TrueFlash drives
  • Enabled upon execution of a SOW for Insight analysis services
  • Run your application as normal in a system with one or more Insight-enabled TrueFlash drives
  • Follow the instructions to download the telemetry files
  • Send telemetry data to Burlywood for analysis
  • Burlywood parses the telemetry, analyses data patterns, shares performance information, and identifies potential bottlenecks and trouble spots
  • This information can then be used to tune the TrueFlash SSDs for optimal performance
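
I don't know what format the Insight telemetry actually takes, so treat this as a generic sketch of the analysis step described above: parse a per-IO trace and summarise the read/write mix and IO size distribution, which is the kind of profile you'd want in hand before tuning a drive for a workload. The CSV layout and file name here are entirely made up.

```python
import csv
from collections import Counter

def summarise(trace_path):
    """Summarise a hypothetical per-IO telemetry CSV with columns: op,size_bytes,lba."""
    ops, sizes = Counter(), Counter()
    with open(trace_path, newline="") as f:
        for row in csv.DictReader(f):
            ops[row["op"]] += 1
            sizes[int(row["size_bytes"]) // 4096 * 4] += 1   # bucket into 4KiB multiples
    total = sum(ops.values())
    print(f"reads: {ops['read'] / total:.0%}, writes: {ops['write'] / total:.0%}")
    for kib, count in sizes.most_common(5):
        print(f"  {kib:>5} KiB IOs: {count}")

# summarise("insight_telemetry.csv")    # file name is illustrative only
```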

 

Thoughts and Further Reading

When I wrote about Burlywood previously I was fascinated by the scale that would be required for a company to consider deploying SSDs with workload-specific code sitting on them. And then I stopped and thought about my comrades in the enterprise space struggling to get the kind of visibility into their gear that’s required to make these kinds of decisions. But when your business relies so heavily on good performance, there’s a chance you have some idea of how to get information on the performance of your systems. The fact that Burlywood are making this offering available to customers indicates that even those customers that are on board with the idea of “Software-defined SSDs (SDSSDs?)” don’t always have the capabilities required to make an accurate assessment of their workloads.

But this solution isn’t just for existing Burlywood customers. The good news is it’s also available for customers considering using Burlywood’s product in their DC. It’s a reasonably simple process to get up and running, and my impression is that it will save a bit of angst down the track. Tomky made the comment that, with this kind of solution, you don’t need to “worry about masking problems at the drive level – [you can] work on your core value”. There’s a lot to be said for companies, even the ones with very complex technical requirements, not having to worry about the technical part of the business as much as the business part of the business. If Burlywood can make that process easier for current and future customers, I’m all for it.

Burlywood Tech Announces TrueFlash

Burlywood Tech came out of stealth late last year and recently announced their TrueFlash product. I had the opportunity to speak with Mike Tomky about what they’ve been up to since emerging from stealth and thought I’d cover the announcement here.

 

Burlywood TrueFlash

So what is TrueFlash? It’s a “modular controller architecture that accelerates time-to-market of new flash adoption”. The idea is that Burlywood can deliver a software-defined solution that will sit on top of commodity Flash. They say that one size doesn’t fit all, particularly with Flash, and this solution gives customers the opportunity to tailor the hardware to better meet their requirements.

It offers the following features:

  • Multiple interfaces (SATA, SAS, NVMe)
  • FTL Translation (Full SSD to None)
  • Capacity – up to 100TB
  • Traffic optimisation
  • Multiple Protocols (Block (NVMe, NVMe/F), File, Object, Direct Memory)

[image courtesy of Burlywood Tech]

 

Who’s Buying?

This isn’t really an enterprise play – those aren’t the types of companies that would buy Flash at the scale where this would make sense. This is really aimed at the hyperscalers, cloud providers, and AFA / HCI vendors. They sell the software, controller and SSD Reference Design to the hyperscalers, but treat the cloud providers and AFA vendors a little differently, generally delivering a completed SSD for them. All of their customers benefit from:

  • A dedicated support team (in-house drive team);
  • Manufacturing assembly & test;
  • Technical & strategic support in all phases; and
  • Collaborative roadmap planning.

The key selling point for Burlywood is that they claim to be able to reduce costs by 10 – 20% through better capacity utilisation, improved supply chain and faster product qualification times.

 

Thoughts

You know you’re doing things at a pretty big scale if you’re thinking it’s a good idea to be building your own SSDs to match particular workloads in your environment. But there are reasons to do this, and from what I can see, it makes sense for a few companies. It’s obviously not for everyone, and I don’t think you’ll be seeing this in the enterprise anytime soon. Which is the funny thing, when you think about it. I remember when Google first started becoming a serious search engine and they talked about some of their earliest efforts with DIY servers and the battles of doing things at the scale they needed. Everyone else was talking about using appliances or pre-built solutions “optimised” by the vendors to provide the best value for money or best performance or whatever. As the likes of Dropbox, Facebook and LinkedIn have shown, there is value in going the DIY route, assuming the right amount of scale is there.

I’ve said it before: very few companies really qualify for the “hyper” in hyperscalers. So a company like Burlywood Tech isn’t necessarily going to benefit the average enterprise directly. That said, these kinds of companies, if they’re successful in helping the hyperscalers drive the cost of Flash downwards, will indirectly help enterprises by forcing the major Flash vendors to look at how they can do things more economically. And sometimes it’s just nice to peek behind the curtain to see how this stuff comes about. I’m oftentimes more interested in how networks put together their streaming media services than in a lot of the content they actually deliver on those platforms. I think Burlywood Tech falls into that category as well. I don’t care for some of the services that the hyperscalers deliver, but I’m interested in how they do it nonetheless.

Storbyte Come Out Of Stealth Swinging

I had the opportunity to speak to Storbyte‘s Chief Evangelist and Design Architect Diamond Lauffin recently and thought I’d share some information on their recent announcement.

 

Architecture

ECO-FLASH

Storbyte have announced ECO-FLASH, positioning it as “a new architecture and flash management system for non-volatile memory”. Its ASIC-based architecture abstracts the independent SSD memory modules within the flash drive and presents them as a single flash storage device.

 

Hydra

Each ECO-FLASH drive is made up of 16 mSATA modules running in RAID 0. Four modules are managed by each “sub-master” Hydra, and the four sub-master Hydras are in turn managed by a master Hydra. This makes up one drive that supports RAID 0, 5, 6 and N, so if you’re only running a single-drive solution (think out at the edge), you can configure the modules to run in RAID 5 or 6.
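
To put rough numbers on what the internal RAID choice costs you, here's some standard RAID arithmetic across the 16 modules. I'm assuming the 32TB drive is built from 16 x 2TB modules purely for the sake of the example; these aren't Storbyte-published figures.

```python
def usable_modules(n_modules, raid_level):
    """Usable module count under standard RAID layouts (generic arithmetic, not Storbyte specs)."""
    return {"raid0": n_modules, "raid5": n_modules - 1, "raid6": n_modules - 2}[raid_level]

modules, module_tb = 16, 2.0     # assumption: a 32TB ECO-FLASH drive as 16 x 2TB mSATA modules
for level in ("raid0", "raid5", "raid6"):
    print(f"{level}: {usable_modules(modules, level) * module_tb:.0f} TB usable")
# raid0: 32 TB, raid5: 30 TB, raid6: 28 TB
```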

 

[image courtesy of Storbyte]

 

Show Me The Product

[image courtesy of Storbyte]

 

The ECO-FLASH drives come in 4, 8, 16 and 32TB configurations, and these fit into a variety of arrays. Storbyte is offering three ECO-FLASH array models:

  • 131TB raw capacity in 1U (using 4 drives);
  • 262TB raw capacity in 2U (using 16 drives); and
  • 786TB raw capacity in 4U (using 48 drives).

Storbyte’s ECO-FLASH supports a blend of Ethernet, iSCSI, NAS and InfiniBand primary connectivity simultaneously. You can also add Storbyte’s 4U 1.18PB spinning disk JBOD expansion units to deliver a hybrid solution.

 

Thoughts

The idea behind Storbyte came about because some people were working in forensic security environments that had a very heavy write workload, and they needed to find a better way to add resilience to the high performance storage solutions they were using. Storbyte are offering a 10 year warranty on their product, so they’re clearly convinced that they’ve worked through a lot of the problems previously associated with the SSD Write Cliff (read more about that here, here, and here). They tell me that Hydra is the primary reason that they’re able to mitigate a number of the effects of the write cliff and can provide performance for a longer period of time.

Storbyte’s is not a standard approach by any stretch. They’re talking some big numbers out of the gate, and they have a pretty reasonable story to tell around capacity, performance, and resilience as well. I’ve scheduled another session with Storbyte to talk some more about how it all works, and I’ll be watching these folks with some interest as they enter the market and start to get some units running workloads on the floor. There’s certainly interesting heritage there, and the write cliff has been an annoying problem to solve. Couple that with some aggressive economics and support for a number of connectivity options, and I can see this solution going into a lot of DCs and being used for some cool stuff. If you’d like to read another perspective, check out what Rich over at Gestalt IT wrote about them, and you can read the full press release here.

Kingston’s NVMe Line-up Is The Life Of The Party

Disclaimer: I recently attended VMworld 2017 – US.  My flights were paid for by ActualTech Media, VMware provided me with a free pass to the conference and various bits of swag, and Tech Field Day picked up my hotel costs. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

You can view the video of Kingston‘s presentation at Tech Field Day Extra VMworld US 2017 here, and download a PDF copy of my rough notes from here.

 

It’s A Protocol, Not Media

NVMe has been around for a few years now, and some people confuse it with a new kind of media that they plug into their servers. It isn’t, though. It’s just a standard specification for accessing Flash media via the PCI Express bus. There are a bunch of reasons why you might choose to use NVMe instead of SAS, including lower latency and less CPU overhead. My favourite thing about it, though, is the plethora of form factors available to use. Kingston touched on these in their presentation at Tech Field Day Extra recently. You can get them in half-height, half-length (HHHL) add-in cards (AIC), U.2 (2.5″) and M.2 sizes. To give you an idea of the use cases for each of these, Kingston suggested the following applications:

  • HHHL (AIC) card
    • Server / DC applications
    • High-end workstations
  • U.2 (2.5″)
    • Direct-attached, server backplane, just a bunch of flash (JBOF)
    • White box and OEM-branded
  • M.2
    • Client applications
    • Notebooks, desktops, workstations
    • Specialised systems
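
If you're curious what NVMe devices a Linux host can actually see, regardless of which of those form factors they arrive in, the kernel exposes the controllers under /sys/class/nvme. A small sketch, assuming a Linux box with at least one NVMe drive:

```python
from pathlib import Path

def list_nvme_controllers():
    """Print model, serial and firmware for each NVMe controller the kernel knows about."""
    for ctrl in sorted(Path("/sys/class/nvme").glob("nvme*")):
        model = (ctrl / "model").read_text().strip()
        serial = (ctrl / "serial").read_text().strip()
        firmware = (ctrl / "firmware_rev").read_text().strip()
        print(f"{ctrl.name}: {model} (serial {serial}, firmware {firmware})")

if __name__ == "__main__":
    list_nvme_controllers()
```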

 

It’s Pretty Fast

NVMe has proven to be pretty fast, and a number of companies are starting to develop products that leverage the protocol in an extremely efficient manner. Couple that with the rise of NVMe/F solutions and you’ve got some pretty cool stuff coming to market. The price is also becoming a lot more reasonable, with Kingston telling us that their DCP1000 NVMe HHHL card comes in at around “$0.85 – $0.90 per GB at the moment”. It’s obviously not as cheap as things that spin at 7200RPM, but the speed is mighty fine. Kingston also noted that the 2.5″ form factor would be hanging around for some time yet, as customers appreciate its serviceability.

 

[Kingston DCP1000 – Image courtesy of Kingston]

 

This Stuff’s Everywhere

Flash media has been slowly but surely taking over the world for a little while now. The cost per GB is reducing (slowly, but surely), and the range of form factors means there’s something for everyone’s needs. Protocol advancements such as NVMe make things even easier, particularly at the high end of town. It’s also been interesting to see these “high end” solutions trickle down to affordable form factors such as PCIe add-in cards. With the relative ubiquity of operating system driver support, NVMe has become super accessible. The interesting thing to watch now is how we effectively leverage these advancements in protocol technologies. Will we use them to make interesting advances in platforms and data access? Or will we keep using the same software architectures we fell in love with 15 years ago (albeit with dramatically improved performance specifications)?

 

Conclusion and Further Reading

I’ll admit it took me a little while to come up with something to write about after the Kingston presentation. Not because I don’t like them or didn’t find their content interesting. Rather, I felt like I was heading down the path of delivering another corporate backgrounder coupled with speeds and feeds and I know they have better qualified people to deliver that messaging to you (if that’s what you’re into). Kingston do a whole range of memory-related products across a variety of focus areas. That’s all well and good but you probably already knew that. Instead, I thought I could focus a little on the magic behind the magic. The Flash era of storage has been absolutely fascinating to witness, and I think it’s only going to get more interesting over the next few years. If you’re into this kind of thing but need a more comprehensive primer on NVMe, I recommend you check out J Metz’s article on the Cisco blog. It’s a cracking yarn and enlightening to boot. Data Centre Journal also provide a thorough overview here.

EMC – I heart EFD performance

We got some EFDs on our CX4-960s recently, and we had the chance to do some basic PassMark testing on them before we loaded them up with production configurations. We’re running 30 x 200GB EFDs in a Storage Pool on the CX4-960. FAST and FAST Cache haven’t been turned on. The VM was sitting on an HP BL460c G6 blade with 96GB RAM, 12 Nehalem cores and vSphere 4.1. The blades connect to the arrays via Cisco 9124e FC switches with 8Gbps port-channels to the Cisco MDS 9513. We’re only using two front-end ports per SP on the CX4-960 at the moment. We used PassMark on a Windows 2008 R2 VM sitting on a 100GB vmdk. There weren’t many other LUNs bound in the Storage Pool, so the results are skewed. Nonetheless, they look pretty, and that’s what I’m going with.

100% Sequential Write:

Results: [PassMark results image]

100% Sequential Read:

Results: [PassMark results image]

File Server Simulation (80% Read, 20% Write):

Results: [PassMark results image]

EMC – FAST and FAST Cache on the CX4-960

Apologies for the lack of posts over the last few months – I have been nuts deep in work and holidays. I’m working on some literature around Storage Pools and FAST in general, but in the meantime I thought I’d share this nugget with you. We finally got approval to install the FAST and FAST Cache enablers on our production CX4-960s a few nights ago. We couldn’t install them on one of the arrays because we had a dead disk that prevented the NDU from going ahead. Fair enough. Two “awesome” things happened when we installed it on the other array, both of which could have been avoided if I’d had my shit together.

Firstly, when I got into the office the next morning at 8am, we noticed that the Read Cache on the array was disabled. For those of you playing at home, we had the cache on the 960 set at 1000MB for read and 9760MB for write. I think I read that in a whitepaper somewhere. But after FAST went on, we still had 9760MB allocated to Write, and 0MB available for Read. Awesome? Not so much. It seems we lost 1000MB of available cache, presumably because we added another layered application. Funnily enough, we didn’t observe this behaviour on our lab CX4-120s, although you could argue that they have sweet FA of cache in the first place. So now we have 8760MB for Write and 1000MB for Read, and I’m about to configure a few hundred GB of FAST Cache on the EFDs in any case. We’ll see how that goes.

The other slightly boneheaded thing we did was forget to trespass the LUNs owned by SP A back from SP B. In other words, an NDU applies code to SP B first, reboots the SP, checks it, and then loads code on the other SP. As part of this, LUN ownership is temporarily trespassed to the surviving SP (this is the whole non-disruptive thing). Once the NDU is complete, you should go and check for trespassed LUNs and move them back to their default owners. Or not, and have everything run on one SP for a while, and wait for about 9000 Exchange users to complain when one of the Exchange clusters goes offline. Happy days.
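
The post-NDU check is simple enough to script if you can pull LUN ownership out of your array tooling. A purely conceptual sketch follows; the data structure is made up, and I'm deliberately not reproducing the actual Navisphere/Unisphere commands from memory.

```python
def trespassed_luns(luns):
    """Return LUNs whose current SP owner differs from their default owner.

    `luns` is whatever inventory you can export from your array tooling, e.g.
    [{"lun": 23, "default_owner": "SPA", "current_owner": "SPB"}, ...]
    (an illustrative structure only, not real CLI output).
    """
    return [l for l in luns if l["current_owner"] != l["default_owner"]]

post_ndu = [
    {"lun": 23, "default_owner": "SPA", "current_owner": "SPB"},
    {"lun": 24, "default_owner": "SPB", "current_owner": "SPB"},
]
for lun in trespassed_luns(post_ndu):
    print(f"LUN {lun['lun']} should be trespassed back to {lun['default_owner']}")
```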

EMC FAST and FAST Cache

I don’t have a lot to say on FAST or FAST Cache just yet, because the PO hasn’t quite made its way to EMC. But when I was researching the topic, I found Matt Taylor’s post to be an excellent summary – particularly the point about configuration options, especially if you’re rolling with multiple CX4-960s, as we are. I think it’s going to scream :)