Random Short Take #22

Oh look, another semi-regular listicle of random news items that might be of some interest.

  • I was at Pure Storage’s //Accelerate conference last week, and heard a lot of interesting news. This piece from Chris M. Evans on FlashArray//C was particularly insightful.
  • Storage Field Day 18 was a little while ago, but that doesn’t mean that the things that were presented there are no longer of interest. Stephen Foskett wrote a great piece on IBM’s approach to data protection with Spectrum Protect Plus that’s worth a read.
  • Speaking of data protection, it’s not just for big computers. Preston wrote a great article on the iOS recovery process that you can read here. As someone who had to recently recover my phone, I agree entirely with the idea that re-downloading apps from the app store is not a recovery process.
  • NetApp were recently named a leader in the Gartner Magic Quadrant for Primary Storage. Say what you will about the MQ, a lot of folks are still reading this report and using it to help drive their decision-making activities. You can grab a copy of the report from NetApp here. Speaking of NetApp, I’m happy to announce that I’m now a member of the NetApp A-Team. I’m looking forward to doing a lot more with NetApp in terms of both my day job and the blog.
  • Tom has been on a roll lately, and this article on IT hero culture, and this one on celebrity keynote speakers, both made for great reading.
  • VMworld US was a little while ago, but Anthony’s wrap-up post had some great content, particularly if you’re working a lot with Veeam.
  • WekaIO have just announced some work they’re doing with Aiden Lab at the Baylor College of Medicine that looks pretty cool.
  • Speaking of analyst firms, this article from Justin over at Forbes brought up some good points about these reports and how some of them are delivered.

Random Short Take #14

Here are a few links to some random news items and other content that I found interesting. You might find them interesting too. Episode 14 – giddy-up!

WekaIO Continues To Evolve

Disclaimer: I recently attended Storage Field Day 18.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

WekaIO recently presented at Storage Field Day 18. You can see videos of their presentation here, and download my rough notes from here. I’ve written about WekaIO before, and you can read those posts here and here.

 

WekaIO

Barbara Murphy described WekaIO Matrix as “the fastest, most scalable parallel file system for AI and technical compute workloads that ensure applications never wait for data”.

 

What They Do

So what exactly does WekaIO Matrix do?

  • WekaIO Matrix is a software-defined storage solution that runs on anything from bare metal to VMs and containers, on-premises or in the cloud;
  • Fully-coherent POSIX file system that’s faster than a local file system;
  • Distributed Coding, More Resilient at Scale, Fast Rebuilds, End-to-End Data Protection; and
  • InfiniBand or Ethernet, Converged or Dedicated, on-premises or cloud.

[image courtesy of WekaIO]

 

Lots of Features

WekaIO Matrix now has a bunch of features, including:

  • Support for S3, SMB, and NFS protocols;
  • Cloud backup, Snapshots, Clones, and Snap-2-Obj;
  • Active Directory support and authentication;
  • POSIX;
  • Network High Availability;
  • Encryption;
  • Quotas;
  • HDFS; and
  • Tiering.

Flexible deployment models

  • Appliance model – compute and storage on separate infrastructure; and
  • Converged model – compute and storage on shared infrastructure.

Both models are cloud native because “[e]verybody wants the ability to be able to move to the cloud, or leverage the cloud”.

 

Architectural Considerations

WekaIO are focused on delivering super fast storage via NVMe-oF, and say that NFS and SMB deliver legacy protocol support for convenience.

The Front-End

WekaIO front-ends are cluster-aware

  • Incoming read requests are optimised for location and load conditions – incoming writes can go anywhere
  • Metadata fully distributed
  • No redirects required

SR-IOV optimises network access, and WekaIO directly accesses NVMe flash

  • Bypassing the kernel leads to better performance.

The Back-End

The WekaIO parallel clustered filesystem offers:

  • Optimised flash-native data placement
    • Not designed for HDD
    • No “cylinder groups” or other anachronisms
  • Data protection (similar to EC)
    • 3-16 data drives, +2 or +4 parity drives
    • Optional hot spares – uses a “virtual” hot spare
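
To put rough numbers on the stripe widths above, here is a minimal sketch that works out how much of the raw capacity in a stripe is left for data. It is plain arithmetic based on the 3-16 data and +2 or +4 parity figures in my notes. The function name is my own, and it ignores hot-spare space and any metadata overhead, so treat it as illustrative rather than as a WekaIO sizing tool.

```python
def usable_fraction(data_drives: int, parity_drives: int) -> float:
    """Fraction of raw stripe capacity left for data in a data+parity layout."""
    # Ranges taken from the presentation notes: 3-16 data, +2 or +4 parity.
    if not 3 <= data_drives <= 16:
        raise ValueError("data_drives must be between 3 and 16")
    if parity_drives not in (2, 4):
        raise ValueError("parity_drives must be 2 or 4")
    return data_drives / (data_drives + parity_drives)


if __name__ == "__main__":
    # A few example layouts, from the narrowest to the widest stripe.
    for data, parity in [(3, 2), (8, 2), (16, 2), (16, 4)]:
        pct = usable_fraction(data, parity) * 100
        print(f"{data}+{parity}: {pct:.1f}% usable")
```

The narrowest layout (3+2) leaves 60% of raw capacity for data, while the widest +2 layout (16+2) leaves roughly 89%.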

Global namespace = hot tier + Object storage tier

  • Tiering to S3-API Object storage
    • Additional capacity with lower cost per GB
    • Files shared to object storage layer (parallelised access optimises performance and simplifies partial or offset reads)

WekaIO uses the S3-API as its equivalent of “SCSI” for HDD.
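
As a rough illustration of why spreading a file across objects makes partial or offset reads simple, the sketch below maps a byte range in a file onto per-object ranged GETs. The fixed 4 MiB object size, the key naming scheme, and the function name are all assumptions of mine for illustration; WekaIO didn’t describe their actual object layout. The only real-world piece is the HTTP Range header syntax (bytes=first-last, inclusive at both ends), which S3-compatible stores accept on a GET.

```python
from typing import List, Tuple

OBJECT_SIZE = 4 * 1024 * 1024  # hypothetical fixed object size (4 MiB)


def plan_ranged_reads(file_id: str, offset: int, length: int,
                      object_size: int = OBJECT_SIZE) -> List[Tuple[str, str]]:
    """Return (object_key, Range header value) pairs covering the request."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    reads = []
    end = offset + length  # exclusive end of the requested byte range
    pos = offset
    while pos < end:
        index = pos // object_size        # which object holds this byte
        start_in_obj = pos % object_size  # where the read starts inside it
        # Read to the end of this object or the end of the request.
        take = min(object_size - start_in_obj, end - pos)
        key = f"{file_id}/{index:08d}"    # hypothetical key naming scheme
        # HTTP byte ranges are inclusive at both ends.
        reads.append((key, f"bytes={start_in_obj}-{start_in_obj + take - 1}"))
        pos += take
    return reads


if __name__ == "__main__":
    # A 1 MiB read starting 512 KiB before an object boundary spans two objects.
    start = OBJECT_SIZE - 512 * 1024
    for key, byte_range in plan_ranged_reads("file42", start, 1024 * 1024):
        print(key, byte_range)
```

Each pair is independent of the others, so the ranged GETs could be issued in parallel and only the bytes actually requested need to come back from the object tier.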

 

Conclusion and Further Reading

I like the WekaIO story. They take away a lot of the overheads associated with non-DAS storage through the use of a file system and control of the hardware. You can make DAS run really fast, but it’s invariably limited to the box that it’s in. Scale-out pools of storage still have a place, particularly in the enterprise, and WekaIO are demonstrating that the performance is there for the applications that need it. There’s a good story in terms of scale, performance, and enterprise resilience features.

Perhaps you like what you see with WekaIO Matrix but don’t want to run stuff on-premises? There’s a good story to be had with Matrix on AWS as well. You’ll be able to get some serious performance, and chances are it will fit in nicely with your cloud-native application workflow.

WekaIO continues to evolve, and I like seeing the progress they’ve been making to this point. It’s not always easy to convince the DAS folks that you can deliver a massively parallel file system and storage solution based on commodity hardware, but WekaIO are giving it a real shake. I recommend checking out Chris M. Evans’s take on WekaIO as well.

Random Short Take #11

Here are a few links to some random news items and other content that I found interesting. You might find them interesting too. Maybe. Happy New Year too. I hope everyone’s feeling fresh and ready to tackle 2019.

  • I’m catching up with the good folks from Scale Computing in the next little while, but in the meantime, here’s what they got up to last year.
  • I’m a fan of the fruit company nowadays, but if I had to build a PC, this would be it (hat tip to Stephen Foskett for the link).
  • QNAP announced the TR-004 over the weekend and I had one delivered on Tuesday. It’s unusual that I have cutting edge consumer hardware in my house, so I’ll be interested to see how it goes.
  • It’s not too late to register for Cohesity’s upcoming Helios webinar. I’m looking forward to running through some demos with Jon Hildebrand and talking about how Helios helps me manage my Cohesity environment on a daily basis.
  • Chris Evans has published NVMe in the Data Centre 2.0 and I recommend checking it out.
  • I went through a basketball card phase in my teens. This article sums up my somewhat confused feelings about the card market (or lack thereof).
  • Elastifile Cloud File System is now available on the AWS Marketplace – you can read more about that here.
  • WekaIO have posted some impressive numbers over at spec.org if you’re into that kind of thing.
  • Applications are still open for vExpert 2019. If you haven’t already applied, I recommend it. The program is invaluable in terms of vendor and community engagement.

 

 

Random Short Take #6

Welcome to the sixth edition of the Random Short Take. Here are a few links to a few things that I think might be useful, to someone.

WekaIO – Not The Matrix You’re Thinking Of

Disclaimer: I recently attended Storage Field Day 15.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

WekaIO recently presented at Storage Field Day 15. It’s not the first time I’ve heard from them, and you can read my initial thoughts on them here. You can see their videos from Storage Field Day 15 here, and download a PDF copy of my rough notes from here.

 

Enter The Matrix

Fine, I just rewatched The Matrix on the plane home. But any company with Matrix in the product name is going to get a few extra points from me. So what is it, Neo?

  • Fully coherent POSIX file system that delivers local file system performance;
  • Distributed Coding, more resilient at scale, fast rebuilds, end-to-end data protection;
  • Instantaneous snapshots, clones, tiering to S3, partial file rehydration;
  • InfiniBand or Ethernet, Hyper-converged or Dedicated Storage Server; and
  • Bare-metal, containerised, or running in a VM.

There’s an on-premises version and one built for public cloud use.

Liran Zvibel (Co-founder and CEO) took us through some of the key features of the architecture.

Software based for dynamic scalability

  • Software scales to thousands of nodes and trillions of records;
  • Significantly more scalable than any appliance offering; and
  • Metadata scales to thousands of servers.

Patented erasure coding technology

  • Allows us to use 66% less NVMe compared to triple replication;
  • Fully distributed data and metadata for best parallelism / performance; and
  • Snapshots for “free” with no performance impact.
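
The “66% less NVMe” figure is easier to reason about with some quick arithmetic. The sketch below compares the raw capacity needed per unit of usable data under N-way replication with a data + parity erasure-coded stripe. The stripe widths in the example are my own picks for illustration, not figures from this presentation, and the function name is hypothetical.

```python
def saving_vs_replication(data: int, parity: int, copies: int = 3) -> float:
    """Fraction of raw capacity saved by a data+parity stripe vs N-way copies."""
    if data <= 0 or parity < 0 or copies <= 0:
        raise ValueError("data and copies must be positive, parity non-negative")
    raw_per_usable_ec = (data + parity) / data  # e.g. 16+2 -> 1.125x raw
    raw_per_usable_replica = float(copies)      # e.g. triple replication -> 3x raw
    return 1 - raw_per_usable_ec / raw_per_usable_replica


if __name__ == "__main__":
    # Illustrative stripe widths only; not WekaIO's published configurations.
    for data, parity in [(4, 2), (8, 2), (16, 2)]:
        pct = saving_vs_replication(data, parity) * 100
        print(f"{data}+{parity} vs 3 copies: {pct:.1f}% less raw capacity")
```

With these example widths the saving works out to 50%, about 58%, and 62.5% respectively. Against triple replication the saving can approach, but never reach, two thirds (about 66.7%) as the stripe gets wider, so I read the quoted 66% as a rounded, wide-stripe figure.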

Integrated tiering in a single namespace

  • Allows for an unlimited namespace, which is critical for deep learning; and
  • Enables backup and cloud bursting to public cloud.

 

I Know Kung Fu

[Look, I’m just going to torture the Matrix analogy for a little longer, so bear with me]. So what do I do with all of this performance in a storage subsystem? Well, the key focus areas for WekaIO include:

  • Machine learning / AI;
  • Digital Radiology / Pathology;
  • Algorithmic Trading; and
  • Genomic Sequencing and Analytics.

Most of these workloads deal with millions of files, very large capacities, and are very sensitive to poor latency. There’s also a cool use case for media and entertainment environments that’s worth checking out if you’re into that sort of thing.

 

Thoughts

WekaIO are aiming to do about 30% of their sales directly, meaning they lean heavily on the channel. Both HPE and Penguin Computing are OEM providers, and obviously there’s also a software-only play with the AWS version. They’re talking about delivering some very big numbers when it comes to performance, but my favourite thing about them is the focus on being able to access the same data through all interfaces, and quickly.

WekaIO make some strong claims about their ability to deliver a fast and scalable file system solution, but they certainly have the pedigree to deliver a solution that meets a number of those claims. There are some nice features, such as the ability to add servers with different profiles to the cluster, and running nodes in hyper-converged mode. When it comes down to it, performance is defined by the number of cores available. If you add more compute, you get more performance.

In my mind, the solution isn’t for everyone right now, but if you have a requirement for a performance-focused, massively parallel, scale-out storage solution with the ability to combine NVMe and S3, you could do worse than to check out what WekaIO can do.

WekaIO Have Been Busy – Really Busy

WekaIO recently announced Version 3.1 of their Matrix software, and I had the good fortune to catch up with David Hiatt. We’d spoken a little while ago when WekaIO came out of stealth and they’ve certainly been busy in the interim. In fact, they’ve been busy to the point that I thought it was worth putting together a brief overview of what’s new.

 

What Is WekaIO?

WekaIO have been around since 2013, gaining their first customers in 2016. They’ve had 17 patents filed, 45 identified, and 8 issued. Their focus has primarily been on delivering, in their words, the “highest performance file system targeted at compute intensive applications”. They deliver a fully POSIX-compliant file system that can run on bare metal, hypervisors, Docker, or in the public or private cloud.

[image courtesy of WekaIO]

Some of the key features of the architecture include the fact that it is distributed, resilient at scale, can perform fast rebuilds, and provides end-to-end protection. Right now, their key use cases include genomics, machine learning, media rendering, semiconductors, financial trading and analytics. The company has staff coming from XIV, NetApp, IBM, EMC, and Intel, amongst others.

 

So What’s News?

Well, there’s been a bit going on:

 

Matrix Version 3.1 – Much Better Than Matrix Revolutions

Not that that’s too hard to do. But there have been a bunch of new features added to WekaIO’s Matrix software. Here’s a table that summarises the new features.

Feature | Explanation
Network Redundancy | Binding network links and load balancing
InfiniBand | Native support for InfiniBand
Multiple File Systems | Logical partitioning allows more granular allocation of performance and capacity
Cluster Scaling | Dynamically shrink and grow clusters
NVMe | Native support for NVMe devices
Snapshots and Clones | High performance 4K granularity
Snap to Object Store | Saving metadata of snap to OBS
Deployment in AWS | Install and run Matrix on EC2 clusters

David also took me through what look like some very, very good SPECsfs2014 Software Build results, particularly when compared with some competitive solutions. He also walked me through the Marketplace configurator. This is really cool stuff – flexible and easy to use. You can check out a demo of it here.

 

Conclusion

All the cool kids are doing stuff with AWS. And that’s fine. But I really like that WekaIO make stuff easy to run on-premises as well. And they make it really fast. Because sometimes you just need to run stuff near you, and sometimes there needs to be an awful lot of it. WekaIO’s model is flexible, with the annual subscription approach and lack of maintenance contracts bound to appeal to a lot of people. The great thing is it’s easy to manage, easy to scale and supports all the file protocols you’d be interested in. There’s a bunch of (configurable) resiliency built in and support for hybrid workloads if required.

With a Formula One slide including customer testimonials from the likes of DreamWorks and SDSC, I get the impression that WekaIO are up to something pretty cool. Plus, I really enjoy chatting to David about what’s going on in the world of highly scalable file systems, and am looking forward to our next call in a few months’ time to see what they’ve been up to. I get the impression there’s little chance they’ll be sitting still.