Random Short Take #6

Welcome to the sixth edition of the Random Short Take. Here are a few links to a few things that I think might be useful, to someone.

WekaIO – Not The Matrix You’re Thinking Of

Disclaimer: I recently attended Storage Field Day 15.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

WekaIO recently presented at Storage Field Day 15. It’s not the first time I’ve heard from them, and you can read my initial thoughts on them here. You can see their videos from Storage Field Day 15 here, and download a PDF copy of my rough notes from here.

 

Enter The Matrix

Fine, I just rewatched The Matrix on the plane home. But any company with Matrix in the product name is going to get a few extra points from me. So what is it, Neo?

  • Fully coherent POSIX file system that delivers local file system performance;
  • Distributed coding, more resilient at scale, fast rebuilds, end-to-end data protection;
  • Instantaneous snapshots, clones, tiering to S3, partial file rehydration;
  • InfiniBand or Ethernet, Hyper-converged or Dedicated Storage Server; and
  • Bare-metal, containerised, or running in a VM.
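The POSIX bullet matters more than it might look. In practice it means an application can use ordinary file system calls rather than a vendor SDK. Here's a minimal sketch of what that looks like, assuming nothing WekaIO-specific at all: the helper name write_atomically is my own, and a temporary directory stands in for wherever the file system would actually be mounted.

```python
import os
import tempfile


def write_atomically(directory: str, name: str, data: bytes) -> str:
    """Write data to directory/name using only standard POSIX-style calls.

    Hypothetical helper for illustration: write to a temporary file, flush
    it to stable storage, then rename it over the final path.
    """
    final_path = os.path.join(directory, name)
    tmp_path = final_path + ".tmp"

    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        # os.write may write fewer bytes than requested, so loop until done.
        view = memoryview(data)
        while view:
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)  # ask the file system to persist the data
    finally:
        os.close(fd)

    # Renaming within one file system is atomic under POSIX semantics.
    os.replace(tmp_path, final_path)
    return final_path


# A temporary directory stands in for the real mount point (an assumption).
with tempfile.TemporaryDirectory() as mount_point:
    path = write_atomically(mount_point, "sample.bin", b"hello")
    with open(path, "rb") as handle:
        print(handle.read())
```

Nothing in there knows or cares what sits underneath the mount point, which is rather the point of a POSIX interface.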

There’s an on-premises version and one built for public cloud use.

Liran Zvibel (Co-founder and CEO) took us through some of the key features of the architecture.

Software based for dynamic scalability

  • Software scales to thousands of nodes and trillions of records;
  • Significantly more scalable than any appliance offering; and
  • Metadata scales to thousands of servers.

Patented erasure coding technology

  • Allows us to use 66% less NVMe compared to triple replication;
  • Fully distributed data and metadata for best parallelism / performance; and
  • Snapshots for “free” with no performance impact.
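The capacity claim is easier to reason about with some quick arithmetic. Triple replication needs 3 units of raw capacity for every usable unit, while an erasure-coded stripe of d data and p parity pieces needs (d + p) / d. The sketch below compares the two. The stripe geometries are my own illustrative assumptions, not WekaIO's published layouts, and the function names are hypothetical.

```python
def raw_per_usable_replication(copies: int) -> float:
    """Raw capacity consumed per unit of usable capacity with N full copies."""
    return float(copies)


def raw_per_usable_erasure(data: int, parity: int) -> float:
    """Raw capacity consumed per usable unit for a data+parity stripe."""
    return (data + parity) / data


def saving_vs_replication(data: int, parity: int, copies: int = 3) -> float:
    """Fraction of raw capacity saved by erasure coding versus replication."""
    ec = raw_per_usable_erasure(data, parity)
    return 1 - ec / raw_per_usable_replication(copies)


# Illustrative stripe widths only (assumed, not taken from the presentation).
for data, parity in [(4, 2), (8, 2), (16, 2)]:
    saving = saving_vs_replication(data, parity)
    print(f"{data}+{parity}: {saving:.1%} less raw capacity than 3x replication")
```

The saving grows as the stripe gets wider, and it can never exceed two thirds (about 66.7%), because even a stripe with negligible parity still needs one unit of raw capacity per usable unit. The quoted 66% therefore sits very close to the ceiling for this comparison.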

Integrated tiering in a single namespace

  • Allows for unlimited namespace critical for deep learning; and
  • Enables backup and cloud bursting to public cloud.

 

I Know Kung Fu

[Look, I’m just going to torture the Matrix analogy for a little longer, so bear with me]. So what do I do with all of this performance in a storage subsystem? Well, the key focus areas for WekaIO include:

  • Machine learning / AI;
  • Digital Radiology / Pathology;
  • Algorithmic Trading; and
  • Genomic Sequencing and Analytics.

Most of these workloads deal with millions of files and very large capacities, and are very sensitive to latency. There’s also a cool use case for media and entertainment environments that’s worth checking out if you’re into that sort of thing.

 

Thoughts

WekaIO are aiming to do about 30% of their sales directly, meaning they lean heavily on the channel. Both HPE and Penguin Computing are OEM providers, and obviously there’s also a software-only play with the AWS version. They’re talking about delivering some very big numbers when it comes to performance, but my favourite thing about them is the focus on being able to access the same data through all interfaces, and quickly.

WekaIO make some strong claims about their ability to deliver a fast and scalable file system solution, but they certainly have the pedigree to deliver a solution that meets a number of those claims. There are some nice features, such as the ability to add servers with different profiles to the cluster, and running nodes in hyper-converged mode. When it comes down to it, performance is defined by the number of cores available. If you add more compute, you get more performance.

In my mind, the solution isn’t for everyone right now, but if you have a requirement for a performance-focused, massively parallel, scale-out storage solution with the ability to combine NVMe and S3, you could do worse than to check out what WekaIO can do.

WekaIO Have Been Busy – Really Busy

WekaIO recently announced Version 3.1 of their Matrix software, and I had the good fortune to catch up with David Hiatt. We’d spoken a little while ago when WekaIO came out of stealth and they’ve certainly been busy in the interim. In fact, they’ve been busy to the point that I thought it was worth putting together a brief overview of what’s new.

 

What Is WekaIO?

WekaIO have been around since 2013, gaining their first customers in 2016. They’ve had 17 patents filed, 45 identified, and 8 issued. Their focus has primarily been on delivering, in their words, the “highest performance file system targeted at compute intensive applications”. They deliver a fully POSIX-compliant file system that can run on bare metal, hypervisors, Docker, or in the public or private cloud.

[image courtesy of WekaIO]

Some of the key features of the architecture include the fact that it is distributed, resilient at scale, can perform fast rebuilds, and provides end-to-end protection. Right now, their key use cases include genomics, machine learning, media rendering, semiconductors, financial trading and analytics. The company has staff coming from XIV, NetApp, IBM, EMC, and Intel, amongst others.

 

So What’s News?

Well, there’s been a bit going on:

 

Matrix Version 3.1 – Much Better Than Matrix Revolutions

Not that that’s too hard to do. But there have been a bunch of new features added to WekaIO’s Matrix software. Here’s a table that summarises the new features.

Feature               | Explanation
Network Redundancy    | Binding network links and load balancing
InfiniBand            | Native support for InfiniBand
Multiple File Systems | Logical partitioning allows more granular allocation of performance and capacity
Cluster Scaling       | Dynamically shrink and grow clusters
NVMe                  | Native support for NVMe devices
Snapshots and Clones  | High performance 4K granularity
Snap to Object Store  | Saving metadata of snap to OBS
Deployment in AWS     | Install and run Matrix on EC2 clusters

David also took me through what look like some very, very good SPECsfs2014 Software Build results, particularly when compared with some competitive solutions. He also walked me through the Marketplace configurator. This is really cool stuff – flexible and easy to use. You can check out a demo of it here.

 

Conclusion

All the cool kids are doing stuff with AWS. And that’s fine. But I really like that WekaIO also make stuff easy to run on-premises. And they make it really fast. Because sometimes you just need to run stuff near you, and sometimes there needs to be an awful lot of it. WekaIO’s model is flexible, with the annual subscription approach and lack of maintenance contracts bound to appeal to a lot of people. The great thing is it’s easy to manage, easy to scale and supports all the file protocols you’d be interested in. There’s a bunch of (configurable) resiliency built in and support for hybrid workloads if required.

With a Formula One slide including customer testimonials from the likes of DreamWorks and SDSC, I get the impression that WekaIO are up to something pretty cool. Plus, I really enjoy chatting to David about what’s going on in the world of highly scalable file systems, and am looking forward to our next call in a few months’ time to see what they’ve been up to. I get the impression there’s little chance they’ll be sitting still.