Disclaimer: I recently attended Storage Field Day 13. My flights, accommodation and other expenses were paid for by Tech Field Day and Pure Storage. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event. Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.
Primary Data recently presented at Storage Field Day 13. You can see videos of their presentation here, and download my rough notes from here.
My Applications Are Ignorant
I had the good fortune of being in a Primary Data presentation at Storage Field Day. I’ve written about them a few times (here, here and here) so I won’t go into the basic overview again. What I would like to talk about is the idea raised by Primary Data that “applications are unaware”. Unaware of what exactly? Usually it’s the underlying infrastructure. When you deploy applications they generally want to run as fast as they can, or as fast as they need to. When you think about it though, it’s unusual for them to be able to determine the platform they run on; they simply run as fast as the supporting infrastructure allows. This is different to phone applications, for example, which are normally written to operate within the constraints of the hardware.
An application’s ignorance has an impact. According to Primary Data, this impact can be felt in terms of performance, cost, or protection (or all three). Unawareness can affect your environment in the following ways:
- Bottlenecks hinder performance
- Cold data hogs hot capacity
- Over provisioning creates significant overspending
- Migration headaches keep data stuck until hardware retirement
- Vendor lock-in limits agility and adds more cost
As well as this, the following trends are being observed in the data centre:
- Cost: Budgets are getting smaller;
- Time: We never have enough; and
- Resources: We have limited resources and infrastructure to run all of this stuff.
Put this all together and you’ve got a problem on your hands.
Primary Data In The Mix
Primary Data tells us that DataSphere is solving the main pain points through:
- Data migration and tiering
- Cloud adoption and improved TCO
- Scale-out NAS and performance
- Virtualisation and storage QoS
They do this with DataSphere, “a metadata engine that automates the flow of data across the enterprise infrastructure and the cloud to meet evolving application demands”. It:
- Is storage and vendor agnostic;
- Virtualises the view of data;
- Automates the flow of data; and
- Solves the inefficiency of traditional storage and compute architectures.
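To make the “automates the flow of data” idea a bit more concrete, here’s a deliberately simplified sketch of metadata-driven tiering. The tier names, idle threshold, and `place` function are my own inventions for illustration only, not Primary Data’s actual API — DataSphere’s real policy engine weighs performance and protection objectives as well, while this toy version looks only at access metadata to move cold data off hot capacity:

```python
import time

# Hypothetical tier names and threshold -- illustrative only, not DataSphere's.
HOT_TIER = "all-flash"
COLD_TIER = "object-store"
COLD_AFTER_SECONDS = 30 * 24 * 3600  # treat data idle for 30+ days as cold


def place(files, now=None):
    """Return a {path: tier} placement based purely on access metadata.

    'files' maps a file path to its last-access timestamp (epoch seconds).
    A real metadata engine would also consider performance and protection
    objectives; this sketch captures only the cold-data-tiering idea.
    """
    now = time.time() if now is None else now
    placement = {}
    for path, last_access in files.items():
        idle = now - last_access
        placement[path] = COLD_TIER if idle > COLD_AFTER_SECONDS else HOT_TIER
    return placement


# Example: one file touched an hour ago, one untouched for a year.
now = 1_700_000_000
files = {
    "/vm/db01.vmdk": now - 3600,
    "/archive/old-logs.tar": now - 365 * 24 * 3600,
}
print(place(files, now=now))
```

The point of the sketch is that placement decisions come from metadata about the data, not from the application itself — which is exactly why the application can stay “ignorant” while the data still ends up in the right place.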
Can We Be More Efficient?
Probably. But the traditional approach of architecting infrastructure for various workloads isn’t really working as well as we’d like. I like the way Primary Data are solving the problem of application ignorance. But I think it’s treating a symptom, rather than providing a cure. I’m not suggesting that what Primary Data are doing is wrong by any stretch, but rather that my applications will still remain ignorant. They’re still not going to have an appreciation of the infrastructure they’re running on, and they’re still going to run at the speed they want to run at. That said, with the approach that Primary Data takes to managing data, I have a better chance of having applications running with access to the storage resources they need.
Application awareness means different things to different people. For some people, it’s about understanding how the application is going to behave based on the constraints it was designed within, and what resources they think it will need to run as expected. For other people, it’s about learning the behaviour of the application based on past experience of how the application has run, and providing infrastructure that can accommodate that behaviour. And some people want their infrastructure to react to the needs of the application in real time. I think this is probably the nirvana of infrastructure and application interaction.
Ultimately, I think Primary Data provides a really cool way of managing various bits of heterogeneous storage in a way that aligns with some interesting notions of how applications should behave. I think the way they pick up on the various behaviours of applications within the infrastructure and move data around accordingly is also pretty neat. I think we’re still some ways away from running the kind of infrastructure that interacts intelligently with applications at the right level, but Primary Data’s solution certainly helps with some of the pain of running ignorant applications.
You can read more about DataSphere Extended Services (DSX) here, and the DataSphere metadata engine here.