Disclaimer: I recently attended Storage Field Day 18. My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event. Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.
WekaIO recently presented at Storage Field Day 18. You can see videos of their presentation here, and download my rough notes from here. I’ve written about WekaIO before, and you can read those posts here and here.
Barbara Murphy described WekaIO Matrix as “the fastest, most scalable parallel file system for AI and technical compute workloads that ensure applications never wait for data”.
What They Do
So what exactly does WekaIO Matrix do?
- WekaIO Matrix is a software-defined storage solution that runs on anything from bare metal to VMs and containers, on-premises or in the cloud;
- Fully-coherent POSIX file system that’s faster than a local file system;
- Distributed coding makes it more resilient at scale, with fast rebuilds and end-to-end data protection; and
- InfiniBand or Ethernet, Converged or Dedicated, on-premises or cloud.
[image courtesy of WekaIO]
Lots of Features
WekaIO Matrix now has a bunch of features, including:
- Support for S3, SMB, and NFS protocols;
- Cloud backup, Snapshots, Clones, and Snap-2-Obj;
- Active Directory support and authentication;
- Network High Availability;
- HDFS support.

Flexible deployment models:
- Appliance model – compute and storage on separate infrastructure; and
- Converged model – compute and storage on shared infrastructure.
Both models are cloud native, because “[e]verybody wants the ability to be able to move to the cloud, or leverage the cloud”.
WekaIO is focused on delivering super fast storage via NVMe-oF, and says that NFS and SMB are there as legacy protocol support for convenience.
WekaIO front-ends are cluster-aware:
- Incoming read requests are optimised for data location and load conditions, while incoming writes can go anywhere;
- Metadata is fully distributed; and
- No redirects are required.
SR-IOV optimises network access, and WekaIO accesses NVMe Flash directly:
- Bypassing the kernel leads to better performance.
The WekaIO parallel clustered filesystem features:
- Optimised, flash-native data placement;
- A design that makes no concessions to HDD – no “cylinder groups” or other anachronisms;
- Data protection similar to erasure coding (EC):
  - 3-16 data drives, plus 2 or 4 parity drives; and
  - Optional hot spares, implemented as a “virtual” hot spare.
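As a rough illustration of what those stripe widths mean for capacity (my own arithmetic, not WekaIO's figures), the usable fraction of raw capacity in an EC-like scheme is simply data drives divided by total drives:

```python
# Hypothetical illustration of data-protection overhead for the stripe
# widths described above (3-16 data drives, +2 or +4 parity drives).
# These calculations are mine, not WekaIO's published numbers.

def usable_fraction(data_drives: int, parity_drives: int) -> float:
    """Fraction of raw capacity available for data in an EC-like stripe."""
    return data_drives / (data_drives + parity_drives)

# A narrow 3+2 stripe versus wide 16+2 and 16+4 stripes
print(f"3+2:  {usable_fraction(3, 2):.0%} usable")   # 60% usable
print(f"16+2: {usable_fraction(16, 2):.0%} usable")  # 89% usable
print(f"16+4: {usable_fraction(16, 4):.0%} usable")  # 80% usable
```

Wider stripes amortise the parity cost over more drives, which is part of the appeal of scale here.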
The global namespace comprises the hot tier plus an object storage tier:
- Tiering to S3-API object storage;
- Additional capacity at a lower cost per GB; and
- Files are sharded to the object storage layer (parallelised access optimises performance and simplifies partial or offset reads).
WekaIO uses the S3-API as its equivalent of “SCSI” for HDD.
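To see why sharding files across an object tier simplifies partial reads, here's a minimal sketch (my own, not WekaIO's code) of splitting an offset read into independent byte-range fetches that can run in parallel, the way an S3 `Range` GET would:

```python
# Hypothetical sketch of splitting a partial (offset) read into independent
# byte-range fetches against an object tier. Not actual WekaIO code.
from concurrent.futures import ThreadPoolExecutor

CHUNK = 4  # tiny chunk size for illustration; real systems use megabytes

def byte_ranges(offset: int, length: int, chunk: int = CHUNK):
    """Yield (start, end) ranges covering [offset, offset + length)."""
    start, end = offset, offset + length
    while start < end:
        yield (start, min(start + chunk, end))

def fetch_range(obj: bytes, rng: tuple) -> bytes:
    # Stand-in for an S3 GET with a "Range: bytes=start-(end-1)" header
    start, end = rng
    return obj[start:end]

obj = b"abcdefghijklmnopqrstuvwxyz"
with ThreadPoolExecutor() as pool:
    parts = pool.map(lambda r: fetch_range(obj, r), byte_ranges(10, 9))
print(b"".join(parts))  # b'klmnopqrs'
```

The point is that each range fetch is self-contained, so a client can read the middle of a large file without pulling the whole object back from the cheap tier.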
Conclusion and Further Reading
I like the WekaIO story. They take away a lot of the overheads associated with non-DAS storage through the use of a file system and control of the hardware. You can make DAS run really fast, but it’s invariably limited to the box that it’s in. Scale-out pools of storage still have a place, particularly in the enterprise, and WekaIO are demonstrating that the performance is there for the applications that need it. There’s a good story in terms of scale, performance, and enterprise resilience features.
Perhaps you like what you see with WekaIO Matrix but don’t want to run stuff on-premises? There’s a good story to be had with Matrix on AWS as well. You’ll be able to get some serious performance, and chances are it will fit in nicely with your cloud-native application workflow.
WekaIO continues to evolve, and I like seeing the progress they’ve been making to this point. It’s not always easy to convince the DAS folks that you can deliver a massively parallel file system and storage solution based on commodity hardware, but WekaIO are giving it a real shake. I recommend checking out Chris M. Evans’s take on WekaIO as well.