Random Short Take #63

Welcome to Random Short Take #63. It’s Friday morning, and the weekend is in sight.

  • I really enjoyed this article from Glenn K. Lockwood about how just looking for an IOPS figure can be a silly thing to do, particularly with HPC workloads. “If there’s one constant in HPC, it’s that everyone hates I/O.  And there’s a good reason: it’s a waste of time because every second you wait for I/O to complete is a second you aren’t doing the math that led you to use a supercomputer in the first place.”
  • Speaking of things that are a bit silly, it seems like someone thought getting on the front foot with some competitive marketing videos was a good idea. It rarely is though.
  • Switching gears a little, you may have been messing about with Tanzu Community Edition and asking yourself how you could SSH to a node. Ask no more, as Mark has your answer.
  • Speaking of storage companies that are pretty pleased with how things are going, Weka has put out this press release on its growth.
  • Still on press releases, Imply had some good news to share at Druid Summit recently.
  • Intrigued by Portworx and want to know more? Check out these two blog posts on configuring multi-cloud application portability (here and here) – they are excellent. Hat tip to my friend Mike at Pure Storage for the links.
  • I loved this article on project heroics from Chris Wahl. I’ve got a lot more to say about this and the impact this behaviour can have on staff but some of it is best not committed to print at this stage.
  • Finally, I replaced one of my receivers recently and cursed myself once again for not using banana plugs. They just make things a bit easier to deal with.

Random Short Take #49

Happy new year and welcome to Random Short Take #49. Not a great many players have worn 49 in the NBA (2 as it happens). It gets better soon, I assure you. Let’s get random.

  • Frederic has written a bunch of useful articles around useful Rubrik things. This one on setting up authentication to use Active Directory came in handy recently. I’ll be digging in to some of Rubrik’s multi-tenancy capabilities in the near future, so keep an eye out for that.
  • In more things Rubrik-related, this article by Joshua Stenhouse on fully automating Rubrik EDGE / AIR deployments was great.
  • Speaking of data protection, Chris Colotti wrote this useful article on changing the Cloud Director database IP address. You can check it out here.
  • You want more data protection news? How about this press release from BackupAssist talking about its partnership with Wasabi?
  • Fine, one more data protection article. Six backup and cloud storage tips from Backblaze.
  • Speaking of press releases, WekaIO has enjoyed some serious growth in the last year. Read more about that here.
  • I loved this article from Andrew Dauncey about things that go wrong and learning from mistakes. We’ve all likely got a story about something that went so spectacularly wrong that you only made that mistake once. Or twice at most. It also reminds me of those early days of automated ESX 2.5 builds and building magical installation CDs that would happily zap LUN 0 on FC arrays connected to new hosts. Fun times.
  • Finally, I was lucky enough to talk to Intel Senior Fellow Al Fazio about what’s happening with Optane, how it got to this point, and where it’s heading. You can read the article and check out the video here.

Random Short Take #45

Welcome to Random Short Take #45. The number 45 has taken a bit of a beating in terms of popularity in recent years, but a few pretty solid players have nonetheless worn 45 in the NBA, including MJ and The Rifleman. My favourite from this list is A.C. Green (“slam so hard, break your TV screen”). So let’s get random.

WekaIO And A Fresh Approach

Disclaimer: I recently attended Storage Field Day 19.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

WekaIO recently presented at Storage Field Day 19. You can see videos of their presentation here, and download my rough notes from here.

 

More Data And New Architectures

Liran Zvibel (Co-founder and CEO) spent some time talking about the explosion in data storage requirements in the next 4 – 5 years. It was suggested that most of this growth will come in the form of unstructured data. The problem with today’s storage systems, he suggested, was that storage is broken into “Islands of Compromise” categories – each category carries a leader. What does that mean exactly? DAS and SAN cannot share data easily, and the performance of a number of NAS and Object architectures isn’t great.

A New Storage Category

WekaIO is positioning itself in a new storage category. One that delivers:

  • The highest performance for any workload
  • Complete data shareability
  • Cloud native, hybrid cloud support
  • Full enterprise features
  • Simple management

Unique Product Differentiation

So what is it that sets WekaIO apart from the rest of the storage industry? Zvibel listed a number of differentiators, including:

  • Only POSIX namespace that scales to exabytes of capacity and trillions of files
  • Only networked file system that is faster than local storage
    • Massively parallel
    • Lowest latency
  • Snap to object
    • Unique blend of All-Flash and Object storage for instant backup to cloud storage (no backup software required)
  • Cloud burst from on-premises to public cloud
    • Fully hybrid cloud enabled with highest performance
  • End-to-end data encryption with no performance degradation
    • Critical for modern workloads and compliance

[image courtesy of Barbara Murphy]

 

Customer Examples

This all sounds great, but where is WekaIO really being used effectively? Barbara Murphy spent some time talking with the delegates about a number of customer examples across the following market verticals.

Life sciences

  • Genomics sequencing and analytics
  • Drug discovery
  • Microscopy

Deep Learning

  • Machine Learning / Artificial Intelligence
  • Real-time analytics
  • IoT

 

Thoughts and Further Reading

I’ve written enthusiastically about WekaIO before. It’s easy to get caught up in some of the hype that seems to go hand in hand with WekaIO presentations. But WekaIO has a lot of data to back up its claims, and it’s taken an interesting approach to solving traditional storage problems in a non-traditional fashion. I like that there’s a strong cloud story there, as well as the potential to leverage the latest hardware advancements to deliver the performance companies need.

The analysts and storage vendors drone on and on about the explosion in data growth over the coming years, but it’s a real problem. Our workload challenges are changing as well, and it seems like a new approach is needed to tackle some of them. The scale of the data that needs to be crunched doesn’t always mean that DAS is a good option. You’re more likely to see these kinds of challenges show up in the science and technology industries. And WekaIO seems to be well-positioned to meet these challenges, whether it’s in public cloud or on-premises. It strikes me that WekaIO’s focus on performance and resilience, along with a robust software-defined architecture, has it in a good position to tackle the types of workload problems we’re seeing at the edge and in AI / ML focused environments. I’m really looking forward to seeing what comes next for WekaIO.

Random Short Take #22

Oh look, another semi-regular listicle of random news items that might be of some interest.

  • I was at Pure Storage’s //Accelerate conference last week, and heard a lot of interesting news. This piece from Chris M. Evans on FlashArray//C was particularly insightful.
  • Storage Field Day 18 was a little while ago, but that doesn’t mean that the things that were presented there are no longer of interest. Stephen Foskett wrote a great piece on IBM’s approach to data protection with Spectrum Protect Plus that’s worth a read.
  • Speaking of data protection, it’s not just for big computers. Preston wrote a great article on the iOS recovery process that you can read here. As someone who had to recently recover my phone, I agree entirely with the idea that re-downloading apps from the app store is not a recovery process.
  • NetApp were recently named a leader in the Gartner Magic Quadrant for Primary Storage. Say what you will about the MQ, a lot of folks are still reading this report and using it to help drive their decision-making activities. You can grab a copy of the report from NetApp here. Speaking of NetApp, I’m happy to announce that I’m now a member of the NetApp A-Team. I’m looking forward to doing a lot more with NetApp in terms of both my day job and the blog.
  • Tom has been on a roll lately, and this article on IT hero culture, and this one on celebrity keynote speakers, both made for great reading.
  • VMworld US was a little while ago, but Anthony’s wrap-up post had some great content, particularly if you’re working a lot with Veeam.
  • WekaIO have just announced some work they’re doing with Aiden Lab at the Baylor College of Medicine that looks pretty cool.
  • Speaking of analyst firms, this article from Justin over at Forbes brought up some good points about these reports and how some of them are delivered.

Random Short Take #14

Here are a few links to some random news items and other content that I found interesting. You might find them interesting too. Episode 14 – giddy-up!

WekaIO Continues To Evolve

Disclaimer: I recently attended Storage Field Day 18.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

WekaIO recently presented at Storage Field Day 18. You can see videos of their presentation here, and download my rough notes from here. I’ve written about WekaIO before, and you can read those posts here and here.

 

WekaIO

Barbara Murphy described WekaIO Matrix as “the fastest, most scalable parallel file system for AI and technical compute workloads that ensure applications never wait for data”.

 

What They Do

So what exactly does WekaIO Matrix do?

  • WekaIO Matrix is a software-defined storage solution that runs on anything from bare metal, VMs, containers, on-premises or in the cloud;
  • Fully-coherent POSIX file system that’s faster than a local file system;
  • Distributed Coding, More Resilient at Scale, Fast Rebuilds, End-to-End Data Protection; and
  • InfiniBand or Ethernet, Converged or Dedicated, on-premises or cloud.

[image courtesy of WekaIO]

 

Lots of Features

WekaIO Matrix now has a bunch of features, including:

  • Support for S3, SMB, and NFS protocols;
  • Cloud backup, Snapshots, Clones, and Snap-2-Obj;
  • Active Directory support and authentication;
  • POSIX;
  • Network High Availability;
  • Encryption;
  • Quotas;
  • HDFS; and
  • Tiering.

Flexible deployment models

  • Appliance model – compute and storage on separate infrastructure; and
  • Converged model – compute and storage on shared infrastructure.

Both models are cloud native because “[e]verybody wants the ability to be able to move to the cloud, or leverage the cloud”.

 

Architectural Considerations

WekaIO is focused on delivering super fast storage via NVMe-oF, and says that NFS and SMB deliver legacy protocol support for convenience.

The Front-End

WekaIO front-ends are cluster-aware

  • Incoming read requests are optimised for location and loading conditions – incoming writes can go anywhere
  • Metadata fully distributed
  • No redirects required

SR-IOV optimises network access, and WekaIO directly accesses NVMe Flash

  • Bypassing the kernel leads to better performance.

The Back-End

The WekaIO parallel clustered filesystem is

  • Optimised flash-native data placement
    • Not designed for HDD
    • No “cylinder groups” or other anachronisms – data protection (similar to EC)
    • 3-16 data drives, +2 or +4 parity drives
    • Optional hot spares – uses a “virtual” hot spare
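
To put the stripe geometry above in numbers, here is a minimal sketch that works out how much of the raw capacity is left for data in a given data + parity layout. It assumes a plain N+P scheme in which every drive in the stripe holds an equal share. The function name is hypothetical, and this is not WekaIO's actual placement logic.

```python
from fractions import Fraction


def usable_fraction(data_drives: int, parity_drives: int) -> Fraction:
    """Share of raw capacity left for data in a simple data+parity stripe."""
    # Ranges taken from the notes above: 3-16 data drives, +2 or +4 parity.
    if not 3 <= data_drives <= 16:
        raise ValueError("data_drives must be between 3 and 16")
    if parity_drives not in (2, 4):
        raise ValueError("parity_drives must be 2 or 4")
    return Fraction(data_drives, data_drives + parity_drives)


if __name__ == "__main__":
    # A narrow stripe, the widest +2 stripe, and the widest +4 stripe.
    for data, parity in [(3, 2), (16, 2), (16, 4)]:
        frac = usable_fraction(data, parity)
        print(f"{data}+{parity}: {float(frac):.1%} usable")
```

Under these assumptions the narrowest stripe (3+2) leaves 60% of raw capacity usable, while 16+4 leaves 80%, which is the usual trade-off between stripe width and protection overhead.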

Global namespace = hot tier + Object storage tier

  • Tiering to S3-API Object storage
    • Additional capacity with lower cost per GB
    • Files sharded to object storage layer (parallelised access optimises performance and simplifies partial or offset reads)
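
One way to see why spreading a file across several objects helps with partial or offset reads: only the objects that overlap the requested byte range need to be fetched, and they can be fetched in parallel. The sketch below is illustrative only. It assumes the file is split into fixed-size chunks, and the 4 MiB chunk size and function name are hypothetical rather than WekaIO's real layout.

```python
def chunks_for_range(offset: int, length: int, chunk_size: int = 4 * 1024 * 1024):
    """Map a byte range onto fixed-size chunks.

    Returns a list of (chunk_index, start_in_chunk, end_in_chunk) tuples,
    where end_in_chunk is exclusive. chunk_size is an assumed value.
    """
    if offset < 0 or length < 0 or chunk_size <= 0:
        raise ValueError("offset and length must be >= 0, chunk_size must be > 0")
    pieces = []
    end = offset + length
    pos = offset
    while pos < end:
        index = pos // chunk_size
        start_in_chunk = pos % chunk_size
        # Read up to the end of this chunk or the end of the request,
        # whichever comes first.
        stop = min(end, (index + 1) * chunk_size)
        pieces.append((index, start_in_chunk, start_in_chunk + (stop - pos)))
        pos = stop
    return pieces


if __name__ == "__main__":
    # A 10-byte read at offset 6 with tiny 8-byte chunks touches two chunks.
    print(chunks_for_range(6, 10, chunk_size=8))
```

Each tuple could then be turned into an independent ranged request against the object store, so a small read in the middle of a large file never has to pull the whole file back.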

WekaIO uses the S3-API as its equivalent of “SCSI” for HDD.

 

Conclusion and Further Reading

I like the WekaIO story. They take away a lot of the overheads associated with non-DAS storage through the use of a file system and control of the hardware. You can make DAS run really fast, but it’s invariably limited to the box that it’s in. Scale-out pools of storage still have a place, particularly in the enterprise, and WekaIO are demonstrating that the performance is there for the applications that need it. There’s a good story in terms of scale, performance, and enterprise resilience features.

Perhaps you like what you see with WekaIO Matrix but don’t want to run stuff on-premises? There’s a good story to be had with Matrix on AWS as well. You’ll be able to get some serious performance, and chances are it will fit in nicely with your cloud-native application workflow.

WekaIO continues to evolve, and I like seeing the progress they’ve been making to this point. It’s not always easy to convince the DAS folks that you can deliver a massively parallel file system and storage solution based on commodity hardware, but WekaIO are giving it a real shake. I recommend checking out Chris M. Evans’s take on WekaIO as well.

Random Short Take #11

Here are a few links to some random news items and other content that I found interesting. You might find it interesting too. Maybe. Happy New Year too. I hope everyone’s feeling fresh and ready to tackle 2019.

  • I’m catching up with the good folks from Scale Computing in the next little while, but in the meantime, here’s what they got up to last year.
  • I’m a fan of the fruit company nowadays, but if I had to build a PC, this would be it (hat tip to Stephen Foskett for the link).
  • QNAP announced the TR-004 over the weekend and I had one delivered on Tuesday. It’s unusual that I have cutting edge consumer hardware in my house, so I’ll be interested to see how it goes.
  • It’s not too late to register for Cohesity’s upcoming Helios webinar. I’m looking forward to running through some demos with Jon Hildebrand and talking about how Helios helps me manage my Cohesity environment on a daily basis.
  • Chris Evans has published NVMe in the Data Centre 2.0 and I recommend checking it out.
  • I went through a basketball card phase in my teens. This article sums up my somewhat confused feelings about the card market (or lack thereof).
  • Elastifile Cloud File System is now available on the AWS Marketplace – you can read more about that here.
  • WekaIO have posted some impressive numbers over at spec.org if you’re into that kind of thing.
  • Applications are still open for vExpert 2019. If you haven’t already applied, I recommend it. The program is invaluable in terms of vendor and community engagement.

 

 

Random Short Take #6

Welcome to the sixth edition of the Random Short Take. Here are a few links to a few things that I think might be useful, to someone.

WekaIO – Not The Matrix You’re Thinking Of

Disclaimer: I recently attended Storage Field Day 15.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

WekaIO recently presented at Storage Field Day 15. It’s not the first time I’ve heard from them, and you can read my initial thoughts on them here. You can see their videos from Storage Field Day 15 here, and download a PDF copy of my rough notes from here.

 

Enter The Matrix

Fine, I just rewatched The Matrix on the plane home. But any company with Matrix in the product name is going to get a few extra points from me. So what is it, Neo?

  • Fully coherent POSIX file system that delivers local file system performance;
  • Distributed Coding, more resilient at scale, fast rebuilds, end to end data protection;
  • Instantaneous snapshots, clones, tiering to S3, partial file rehydration;
  • InfiniBand or Ethernet, Hyper-converged or Dedicated Storage Server; and
  • Bare-metal, containerised, or running in a VM.

There’s an on-premises version and one built for public cloud use.

Liran Zvibel (Co-founder and CEO) took us through some of the key features of the architecture.

Software based for dynamic scalability

  • Software scales to thousands of nodes and trillions of records;
  • Significantly more scalable than any appliance offering; and
  • Metadata scales to thousands of servers.

Patented erasure coding technology

  • Allows us to use 66% less NVMe compared to triple replication;
  • Fully distributed data and metadata for best parallelism / performance; and
  • Snapshots for “free” with no performance impact.
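
The 66% figure is easier to reason about with the arithmetic written out. Triple replication stores three raw bytes for every usable byte, while a D+P erasure-coded stripe stores (D+P)/D. The sketch below compares the two. The stripe widths are illustrative assumptions, not a published WekaIO configuration, and the function names are hypothetical.

```python
def raw_per_usable_ec(data: int, parity: int) -> float:
    """Raw bytes stored per usable byte for a data+parity stripe."""
    if data <= 0 or parity < 0:
        raise ValueError("data must be > 0 and parity must be >= 0")
    return (data + parity) / data


def savings_vs_replication(data: int, parity: int, replicas: int = 3) -> float:
    """Fraction of raw capacity saved versus N-way replication."""
    # Replication stores `replicas` raw bytes per usable byte.
    return 1 - raw_per_usable_ec(data, parity) / replicas


if __name__ == "__main__":
    # Assumed example stripes; wider stripes carry less parity overhead.
    for data, parity in [(4, 2), (8, 2), (16, 2)]:
        saved = savings_vs_replication(data, parity)
        print(f"{data}+{parity}: {saved:.1%} less raw capacity than 3x replication")
```

With these assumptions a 16+2 stripe needs 62.5% less raw capacity than three full copies. The saving can approach, but never exceed, two thirds as stripes get wider, so the quoted 66% sits close to that ceiling.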

Integrated tiering in a single namespace

  • Allows for unlimited namespace critical for deep learning; and
  • Enables backup and cloud bursting to public cloud.

 

I Know Kung Fu

[Look, I’m just going to torture the Matrix analogy for a little longer, so bear with me]. So what do I do with all of this performance in a storage subsystem? Well, the key focus areas for WekaIO include:

  • Machine learning / AI;
  • Digital Radiology / Pathology;
  • Algorithmic Trading; and
  • Genomic Sequencing and Analytics.

Most of these workloads deal with millions of files, very large capacities, and are very sensitive to poor latency. There’s also a cool use case for media and entertainment environments that’s worth checking out if you’re into that sort of thing.

 

Thoughts

WekaIO are aiming to do about 30% of their sales directly, meaning they lean heavily on the channel. Both HPE and Penguin Computing are OEM providers, and obviously there’s also a software-only play with the AWS version. They’re talking about delivering some very big numbers when it comes to performance, but my favourite thing about them is the focus on being able to access the same data through all interfaces, and quickly.

WekaIO make some strong claims about their ability to deliver a fast and scalable file system solution, but they certainly have the pedigree to deliver a solution that meets a number of those claims. There are some nice features, such as the ability to add servers with different profiles to the cluster, and running nodes in hyper-converged mode. When it comes down to it, performance is defined by the number of cores available. If you add more compute, you get more performance.

In my mind, the solution isn’t for everyone right now, but if you have a requirement for a performance focused, massively parallel, scale-out storage solution with the ability to combine NVMe and S3, you could do worse than to check out what WekaIO can do.