Disclaimer: I recently attended Storage Field Day 12. My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event. Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.
Here are some notes from Elastifile’s presentation at Storage Field Day 12. You can view the video here and download my rough notes here.
What is it?
Elastifile’s crown jewel is the Elastic Cloud File System (ECFS). It’s a scalable data platform, supporting:
- 1000s of nodes, 100000s of clients
- 100s of thousands of file systems (data containers), unlimited files/directories
- Exabyte scale capacity, 10s of millions of IOPS (and above)
And offering “advanced embedded data management and analytics”.
It is software only, and is virtualisation and cloud friendly, running on:
- Physical servers (tested on 20+ platforms)
- On-premises virtualisation (VMware, KVM, etc)
- Cloud VMs (Amazon, Google, etc)
It’s also “Flash Native”, supporting all NAND technologies and interfaces, and offering support for object/S3 cold tiers (dynamic tiering/HSM/ILM).
From an architectural perspective, ECFS also offers:
- Enterprise Level Features, including non-disruptive upgrades (NDUs), n-way redundancy, self healing, snapshots, sync/async DR
- Storage Interfaces based on NFSv3/v4, SMB2/3, S3, HDFS
It also has cloud-friendly features, including:
- Multi-tenancy, QoS, Hot add/remove nodes/capacity
- Snapshot shipping to S3 (“CloudConnect”)
- ILM/HSM/Dynamic tiering to other sites/clouds, Object/S3
- Multi-site (geographically distributed access)
You can also leverage a number of different deployment modes, including:
- Hyperconverged mode (HCI)
- Dedicated storage mode (DSM) – with 2 tiers or single tier
- Cloud (“Marketplace”) – as a service / as an application
Design Objectives and Journey
So what did Elastifile consider the foundation for a successful architecture when designing the product? They told us it would be “[a] good trade-off between all relevant dimensions, resources and requirements to produce the best solution for the desired target(s)”. Note, however, that it isn’t a linear path. Their goal is to “[b]e the best data platform for the cloud era enterprise”. It’s a lofty goal, to be sure, but when you’re starting from scratch it’s always good to aim high. Elastifile went on to tell us a little bit about what they’ve seen in the marketplace, what requirements these produced, and how those requirements drove product-direction decisions.
Elastifile Architectural Base Observations
Elastifile went into what they saw out in the marketplace:
- Enterprises increasingly use clouds or cloud-like techniques;
- In a cloud (like) environment the focus ISN’T infrastructure (storage) but rather services (data);
- Data services must be implemented by simple, efficient mechanisms for many concurrent I/O data patterns;
- Everything should be managed by APIs;
- Data management should be very fine grained; and
- Data mobility has to be solved.
Elastifile Architectural Base Requirements
The following product requirements were then developed based on those observations:
- Everything must be automatic;
- Avoid unnecessary restrictions and limitations. Assume as little as you can about the customer’s I/O patterns;
- Bring Your Own Hardware (BYOH): Avoid unnecessary/uncommon hardware requirements like NVRAM, RDMA networks, etc (optional optimisations are OK);
- Support realtime & dynamic reconfiguration of the system;
- Support heterogeneous hardware (CPU, memory, SSD types, sizes, etc); and
- Provide good consistent predictable performance for the given resources (even under failures and noisy environments).
Elastifile Architectural Base Decisions
So how do these requirements transform into architectural directions?
Scale-out (and Scale-up)
- Cloud, cloud and cloud! (Cloudy is popular for a reason)
Software only
- Cloud and virtualisation friendly (smart move)
- Cost effective (potentially, although my experience of software companies is that they all eventually want to be Oracle)
Flash only
- Provides flexibility and efficiently supports multiple concurrent I/O patterns
- Capacity efficiency achieved by dedupe, compression and tiering
Application level file system
- Enables unique data level services and the best performance (!) (a big claim, but Elastifile are keen to stand behind it)
- Superset of block/object interfaces
- Enables data sharing (and not only storage sharing) for user self-service. (The focus on data, not just storage, is great).
Bumps in the Road
Elastifile started in 2013 and launched v1 in Q4 2016, and they hit a few bumps along the way.
- To start with, the uptake in private clouds didn’t happen as expected;
- OpenStack didn’t gather enough momentum;
- It seems that private clouds don’t make sense short of the web-scale guys – lack of economy of scale;
- Many enterprises do not attempt to modernise their legacy systems, but rather attempt to shift (some/all) workloads to the public cloud.
The impact on product development? They had to support public cloud use cases earlier than expected. According to Elastifile, this turned out to be a good thing in the end, and I agree.
The hyperconverged infrastructure (HCI) model has proven problematic for many use cases. (And I know this won’t necessarily resonate with a lot of people I know, but horses for courses and all that). Why?
- It’s not perceived well by many storage administrators due to problematic responsibility boundaries (“where is my storage?”);
- It requires coordination with the applications/server infrastructure; and
- It limits the ability to scale resources independently (e.g. scale capacity, not performance).
HCI is nonetheless a good fit for:
- Places/use cases without (appropriate/skilled) IT resources (e.g. ROBO, SMB); and
- Vertical implementations (this means web scale companies in most places).
The impact on Elastifile’s offering? They added a dedicated storage mode to provide separate storage resources and scaling to get around these issues.
Conclusion
One of the things I really like about Elastifile is that the focus isn’t on the infrastructure (storage) but rather services (data). I’ve had a million conversations lately with people across a bunch of different business types around the importance of understanding their applications (and the data supporting their applications) and why that should be more important to them than the badge on the front of the platform. That said, plenty of companies are running applications and don’t really understand the value of these applications in terms of business revenue or return. There’s no point putting in a massively scalable storage solution if you’re doing it to support applications that aren’t helping the business do its business. It seems like a reasonable thing to be able to articulate, but as anyone who’s worked in large enterprise knows, it’s often not well understood at all.
Personally, I love it when vendors go into the why of their key product architectures – it’s an invaluable insight into how some of these things get to minimum viable product. Of course, if you’re talking to the wrong people, then your product may be popular with a very small number of shops that aren’t spending a lot of money. Not only should you understand the market you’re trying to develop for, you need to make sure you’re actually talking to people representative of that market, and not some confused neckbeards sitting in the corner who have no idea what they’re doing. Elastifile have provided some nice insight into why they’ve done what they’ve done, and are focusing on areas that I think are really important in terms of delivering scalable and resilient platforms for data storage. I’m looking forward to seeing what they get up to in the future. If you’d like to know more, and don’t mind giving up some details, check out the Elastifile resources page for a nice range of white papers, blog posts and videos.