Coho Data – It’s numerology, but not as we know it

Disclaimer: I recently attended Storage Field Day 8.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

For each of the presentations I attended at SFD8, there are a few things I want to include in the post. Firstly, you can see video footage of the Coho Data presentation here. You can also download my raw notes from the presentation here. Finally, here’s a link to the Coho Data website that covers some of what they presented.



Coho Data wants to be “a data-centric bridge for evolving enterprise infrastructure”, by delivering “storage transformation without disruption for the modern enterprise”.

Andy Warfield (@AndyWarfield) is a really smart dude. And he does a really great presentation. I encourage you to check out the videos from SFD8, because I can’t really do justice to a lot of the content he presented this time around. Indeed, I had the same problem last time I encountered Andy.



It’s numerology, but not as we know it

Andy spent some time talking about the concept of “workload numerology”. No, not that kind of numerology. This is more about the ability to understand workloads hosted on your storage platform – “it’s not just a matter of speeds and feeds”.

Coho Data are right into the concept of being able to manage the placement of both data and network traffic based on detailed workload analysis. They do this in a few different ways, and I recommend you read their architecture white paper for more information.

Andy pointed out that with traditional storage, you were worried about durability, and getting high performance off crappy hardware. However “modern storage design is about solving a connectivity and locality problem, rather than a durability problem”. As such, we need to be taking a new approach to the design and operation of these platforms.

Andy also noted that performance-based placement decisions benefit from workload characterisations, but characterising things like working sets is expensive in both time and space. You can, however, use counter stacks to efficiently encode the cardinality of unique blocks over time. You can read more about HyperLogLogs (HLLs) here.
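Coho's actual counter-stack implementation wasn't shown, but to give a feel for the underlying trick, here's a minimal HyperLogLog sketch in Python. The idea is that you never store the blocks you've seen; each item is hashed, the low bits pick a register, and the register remembers the longest run of leading zeros observed, from which a cardinality estimate falls out. Everything here (register count, hash choice, class name) is my own illustrative choice, not anything Coho presented.

```python
import hashlib
import math

class HyperLogLog:
    """Toy HLL: estimates the number of distinct items seen, in O(2^b) memory."""

    def __init__(self, b=12):
        self.b = b                      # 2^b registers; std error ~1.04/sqrt(2^b)
        self.m = 1 << b
        self.registers = [0] * self.m
        # bias-correction constant, valid for m >= 128
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        # 64-bit hash of the item
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        idx = h & (self.m - 1)          # low b bits select a register
        w = h >> self.b                 # remaining (64 - b) bits
        # rank = position of the leftmost 1-bit in w (leading zeros + 1)
        rank = (64 - self.b) - w.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def estimate(self):
        # harmonic mean of 2^register across all registers
        z = sum(2.0 ** -r for r in self.registers)
        e = self.alpha * self.m * self.m / z
        # small-range correction: fall back to linear counting
        zeros = self.registers.count(0)
        if e <= 2.5 * self.m and zeros:
            e = self.m * math.log(self.m / zeros)
        return int(e)
```

The point for storage workloads is that re-adding a block you've already seen changes nothing, so the structure naturally tracks the *unique* working set rather than raw I/O volume, at a tiny fraction of the memory a full block-address set would need.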

In short, understand the working set characteristics and you’ll get a lot more value from your platform.


Closing Thoughts and Further Reading

Ray Lucchesi (@RayLucchesi) did a much better job of covering Andy’s presentation than I ever would, and I strongly encourage you to go read it. If you watch the Coho presentation from SFD6 and compare it to the one by Andy at SFD8, you’ll see that, not only has Andy gotten really good at keeping the discussion on track, but also Coho has made some really good progress with the DataStream platform in general. I’m looking forward to seeing what the next 12 months brings for Coho.