Welcome to Random Short Take #63. It’s Friday morning, and the weekend is in sight.
I really enjoyed this article from Glenn K. Lockwood about how just looking for an IOPS figure can be a silly thing to do, particularly with HPC workloads. “If there’s one constant in HPC, it’s that everyone hates I/O. And there’s a good reason: it’s a waste of time because every second you wait for I/O to complete is a second you aren’t doing the math that led you to use a supercomputer in the first place.”
Intrigued by Portworx and want to know more? Check out these two blog posts on configuring multi-cloud application portability (here and here) – they are excellent. Hat tip to my friend Mike at Pure Storage for the links.
I loved this article on project heroics from Chris Wahl. I’ve got a lot more to say about this and the impact this behaviour can have on staff but some of it is best not committed to print at this stage.
Happy new year and welcome to Random Short Take #49. Not a great many players have worn 49 in the NBA (2 as it happens). It gets better soon, I assure you. Let’s get random.
Frederic has written a bunch of useful articles around useful Rubrik things. This one on setting up authentication to use Active Directory came in handy recently. I’ll be digging in to some of Rubrik’s multi-tenancy capabilities in the near future, so keep an eye out for that.
Speaking of press releases, WekaIO has enjoyed some serious growth in the last year. Read more about that here.
I loved this article from Andrew Dauncey about things that go wrong and learning from mistakes. We’ve all likely got a story about something that went so spectacularly wrong that you only made that mistake once. Or twice at most. It also reminds me of those early days of automated ESX 2.5 builds and building magical installation CDs that would happily zap LUN 0 on FC arrays connected to new hosts. Fun times.
Finally, I was lucky enough to talk to Intel Senior Fellow Al Fazio about what’s happening with Optane, how it got to this point, and where it’s heading. You can read the article and check out the video here.
Disclaimer: I recently attended Storage Field Day 19. My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event. Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.
Liran Zvibel (Co-founder and CEO) spent some time talking about the explosion in data storage requirements in the next 4 – 5 years. It was suggested that most of this growth will come in the form of unstructured data. The problem with today’s storage systems, he suggested, was that storage is broken into “Islands of Compromise” categories – each category carries a leader. What does that mean exactly? DAS and SAN cannot share data easily, and the performance of a number of NAS and Object architectures isn’t great.
A New Storage Category
WekaIO is positioning itself in a new storage category. One that delivers:
The highest performance for any workload
Complete data shareability
Cloud native, hybrid cloud support
Full enterprise features
Unique Product Differentiation
So what is it that sets WekaIO apart from the rest of the storage industry? Zvibel listed a number of differentiators, including:
Only POSIX namespace that scales to exabytes of capacity and trillions of files
Only networked file system that is faster than local storage
Snap to object
Unique blend of All-Flash and Object storage for instant backup to cloud storage (no backup software required)
Cloud burst from on-premises to public cloud
Fully hybrid cloud enabled with highest performance
End-to-end data encryption with no performance degradation
Critical for modern workloads and compliance
[image courtesy of Barbara Murphy]
This all sounds great, but where is WekaIO really being used effectively? Barbara Murphy spent some time talking with the delegates about a number of customer examples across the following market verticals.
Genomics sequencing and analytics
Machine Learning / Artificial Intelligence
Thoughts and Further Reading
I’ve written enthusiastically about WekaIO before. It’s easy to get caught up in some of the hype that seems to go hand in hand with WekaIO presentations. But WekaIO has a lot of data to back up its claims, and it’s taken an interesting approach to solving traditional storage problems in a non-traditional fashion. I like that there’s a strong cloud story there, as well as the potential to leverage the latest hardware advancements to deliver the performance companies need.
The analysts and storage vendors drone on and on about the explosion in data growth over the coming years, but it’s a real problem. Our workload challenges are changing as well, and it seems like a new approach is needed to deal with some of them. The scale of the data that needs to be crunched doesn’t always mean that DAS is a good option. You’re more likely to see these kinds of challenges show up in the science and technology industries. And WekaIO seems to be well-positioned to meet these challenges, whether it’s in public cloud or on-premises. It strikes me that WekaIO’s focus on performance and resilience, along with a robust software-defined architecture, has it in a good position to tackle the types of workload problems we’re seeing at the edge and in AI / ML focused environments. I’m really looking forward to seeing what comes next for WekaIO.
Storage Field Day 18 was a little while ago, but that doesn’t mean that the things that were presented there are no longer of interest. Stephen Foskett wrote a great piece on IBM’s approach to data protection with Spectrum Protect Plus that’s worth a read.
Speaking of data protection, it’s not just for big computers. Preston wrote a great article on the iOS recovery process that you can read here. As someone who had to recently recover my phone, I agree entirely with the idea that re-downloading apps from the app store is not a recovery process.
NetApp were recently named a leader in the Gartner Magic Quadrant for Primary Storage. Say what you will about the MQ, a lot of folks are still reading this report and using it to help drive their decision-making activities. You can grab a copy of the report from NetApp here. Speaking of NetApp, I’m happy to announce that I’m now a member of the NetApp A-Team. I’m looking forward to doing a lot more with NetApp in terms of both my day job and the blog.
If you work with VMware Cloud Director and haven’t visited Stellios’s blog before you’re missing out. Stellios has a wealth of experience from his days as a customer and now works as a VMware TAM in BrisVegas. This article on LDAPS in vCD 9.5 was particularly useful.
Disclaimer: I recently attended Storage Field Day 18. My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event. Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.
WekaIO recently presented at Storage Field Day 18. You can see videos of their presentation here, and download my rough notes from here. I’ve written about WekaIO before, and you can read those posts here and here.
Barbara Murphy described WekaIO Matrix as “the fastest, most scalable parallel file system for AI and technical compute workloads that ensure applications never wait for data”.
What They Do
So what exactly does WekaIO Matrix do?
WekaIO Matrix is a software-defined storage solution that runs on bare metal, VMs, or containers, on-premises or in the cloud;
Fully-coherent POSIX file system that’s faster than a local file system;
Distributed Coding, More Resilient at Scale, Fast Rebuilds, End-to-End Data Protection; and
InfiniBand or Ethernet, Converged or Dedicated, on-premises or cloud.
[image courtesy of WekaIO]
Lots of Features
WekaIO Matrix now has a bunch of features, including:
Support for S3, SMB, and NFS protocols;
Cloud backup, Snapshots, Clones, and Snap-2-Obj;
Active Directory support and authentication;
Network High Availability;
Flexible deployment models
Appliance model – compute and storage on separate infrastructure; and
Converged model – compute and storage on shared infrastructure.
Both models are cloud native because “[e]verybody wants the ability to be able to move to the cloud, or leverage the cloud”
WekaIO is focused on delivering super fast storage via NVMe-oF, and says that NFS and SMB deliver legacy protocol support for convenience.
WekaIO front-ends are cluster-aware
Incoming read requests are optimised based on location and loading conditions – incoming writes can go anywhere
No “cylinder groups” or other anachronisms – data protection (similar to EC)
3-16 data drives, +2 or +4 parity drives
Optional hot spares – uses a “virtual” hot spare
Global namespace = hot tier + Object storage tier
Tiering to S3-API Object storage
Additional capacity with lower cost per GB
Files shared to object storage layer (parallelised access optimises performance and simplifies partial or offset reads)
WekaIO uses the S3-API as its equivalent of “SCSI” for HDD.
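To make the "partial or offset reads" point a bit more concrete, here's a minimal sketch of how a byte-range read could be mapped onto objects in the tier. It assumes a file is split into fixed-size objects, which is my assumption for illustration; the notes above don't describe the actual layout. The `CHUNK_SIZE` value and the `chunks_for_read` helper are hypothetical and not part of any WekaIO API.

```python
from typing import List, Tuple

# Hypothetical object size for illustration only; not a WekaIO figure.
CHUNK_SIZE = 4 * 1024 * 1024


def chunks_for_read(offset: int, length: int,
                    chunk_size: int = CHUNK_SIZE) -> List[Tuple[int, int, int]]:
    """Map a byte-range read onto fixed-size objects.

    Returns a list of (object_index, start_within_object, bytes_to_read).
    """
    if offset < 0 or length < 0 or chunk_size <= 0:
        raise ValueError("offset and length must be >= 0, chunk_size > 0")

    plan = []
    pos = offset
    end = offset + length
    while pos < end:
        index = pos // chunk_size          # which object holds this byte
        start = pos % chunk_size           # where the read starts inside it
        take = min(chunk_size - start, end - pos)
        plan.append((index, start, take))
        pos += take
    return plan


if __name__ == "__main__":
    # A 4 MiB read starting 6 MiB into the file touches two objects.
    for index, start, take in chunks_for_read(6 * 1024 * 1024, 4 * 1024 * 1024):
        print(f"object {index}: bytes {start}..{start + take - 1}")
```

Each entry in the plan is independent, so the pieces could be fetched in parallel (for example, with ranged GET requests against an S3-style API), and only the objects that overlap the requested range need to be touched.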
Conclusion and Further Reading
I like the WekaIO story. They take away a lot of the overheads associated with non-DAS storage through the use of a file system and control of the hardware. You can make DAS run really fast, but it’s invariably limited to the box that it’s in. Scale-out pools of storage still have a place, particularly in the enterprise, and WekaIO are demonstrating that the performance is there for the applications that need it. There’s a good story in terms of scale, performance, and enterprise resilience features.
Perhaps you like what you see with WekaIO Matrix but don’t want to run stuff on-premises? There’s a good story to be had with Matrix on AWS as well. You’ll be able to get some serious performance, and chances are it will fit in nicely with your cloud-native application workflow.
WekaIO continues to evolve, and I like seeing the progress they’ve been making to this point. It’s not always easy to convince the DAS folks that you can deliver a massively parallel file system and storage solution based on commodity hardware, but WekaIO are giving it a real shake. I recommend checking out Chris M. Evans’s take on WekaIO as well.
Here are a few links to some random news items and other content that I found interesting. You might find it interesting too. Maybe. Happy New Year too. I hope everyone’s feeling fresh and ready to tackle 2019.
QNAP announced the TR-004 over the weekend and I had one delivered on Tuesday. It’s unusual that I have cutting edge consumer hardware in my house, so I’ll be interested to see how it goes.
It’s not too late to register for Cohesity’s upcoming Helios webinar. I’m looking forward to running through some demos with Jon Hildebrand and talking about how Helios helps me manage my Cohesity environment on a daily basis.
Welcome to the sixth edition of the Random Short Take. Here are a few links to a few things that I think might be useful, to someone.
I’m a big fan of Plex, and recently moved it from my iMac onto a Debian-based NAS. There’s a comprehensive Linux Permissions Guide that you can get here. It came in handy because I have a number of NAS devices serving up media. And you don’t want to see what I did to get multiple volumes mounted via SMB. (It gets ugly when I want the DVR component to be able to record to any share.)
Disclaimer: I recently attended Storage Field Day 15. My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event. Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.
WekaIO recently presented at Storage Field Day 15. It’s not the first time I’ve heard from them, and you can read my initial thoughts on them here. You can see their videos from Storage Field Day 15 here, and download a PDF copy of my rough notes from here.
Enter The Matrix
Fine, I just rewatched The Matrix on the plane home. But any company with Matrix in the product name is going to get a few extra points from me. So what is it, Neo?
Fully coherent POSIX file system that delivers local file system performance;
Distributed Coding, more resilient at scale, fast rebuilds, end to end data protection
Instantaneous snapshots, clones, tiering to S3, partial file rehydration;
InfiniBand or Ethernet, Hyper-converged or Dedicated Storage Server; and
Software scales to thousands of nodes and trillions of records;
Significantly more scalable than any appliance offering; and
Metadata scales to thousands of servers.
Patented erasure coding technology
Allows us to use 66% less NVMe compared to triple replication;
Fully distributed data and metadata for best parallelism / performance; and
Snapshots for “free” with no performance impact.
Integrated tiering in a single namespace
Allows for unlimited namespace critical for deep learning; and
Enables backup and cloud bursting to public cloud.
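The "66% less NVMe" claim is easier to reason about with a little arithmetic. The sketch below compares the raw capacity needed per unit of usable capacity for a data+parity stripe against keeping three full copies. The stripe widths in the loop are assumptions I've picked for illustration, and the model ignores hot spares, metadata, and any other overhead.

```python
def raw_per_usable(data: int, parity: int) -> float:
    """Raw capacity consumed per unit of usable capacity for a data+parity stripe."""
    return (data + parity) / data


def saving_vs_replication(data: int, parity: int, copies: int = 3) -> float:
    """Fractional reduction in raw capacity relative to `copies` full replicas."""
    return 1 - raw_per_usable(data, parity) / copies


if __name__ == "__main__":
    # Illustrative stripe widths (assumed, not vendor-confirmed).
    for data, parity in [(3, 2), (8, 2), (16, 2), (16, 4)]:
        print(f"{data}+{parity}: {raw_per_usable(data, parity):.3f}x raw, "
              f"{saving_vs_replication(data, parity):.1%} less than 3x replication")
```

Under these assumptions a 16+2 stripe needs 1.125x raw capacity, which is 62.5% less than triple replication. The upper bound of this simple model is 1 − 1/3, or roughly 66.7%, and wider stripes get closer to it, so the quoted figure looks like it sits near that limit.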
I Know Kung Fu
[Look, I’m just going to torture the Matrix analogy for a little longer, so bear with me]. So what do I do with all of this performance in a storage subsystem? Well, the key focus areas for WekaIO include:
Machine learning / AI;
Digital Radiology / Pathology;
Algorithmic Trading; and
Genomic Sequencing and Analytics.
Most of these workloads deal with millions of files, very large capacities, and are very sensitive to poor latency. There’s also a cool use case for media and entertainment environments that’s worth checking out if you’re into that sort of thing.
WekaIO are aiming to do about 30% of their sales directly, meaning they lean heavily on the channel. Both HPE and Penguin Computing are OEM providers, and obviously there’s also a software-only play with the AWS version. They’re talking about delivering some very big numbers when it comes to performance, but my favourite thing about them is the focus on being able to access the same data through all interfaces, and quickly.
WekaIO make some strong claims about their ability to deliver a fast and scalable file system solution, but they certainly have the pedigree to deliver a solution that meets a number of those claims. There are some nice features, such as the ability to add servers with different profiles to the cluster, and running nodes in hyper-converged mode. When it comes down to it, performance is defined by the number of cores available. If you add more compute, you get more performance.
In my mind, the solution isn’t for everyone right now, but if you have a requirement for a performance focused, massively parallel, scale-out storage solution with the ability to combine NVMe and S3, you could do worse than to check out what WekaIO can do.