It’s been a little while since I last wrote about Excelero. I recently had the opportunity to catch up with Josh Goldenhar and Tom Leyden and thought I’d share some of my thoughts here.
NVMe Performance Good, But Challenging
NVMe has really delivered storage performance improvements in recent times.
All The Kids Are Doing It
Great performance:
- Up to 1.2M IOPS, 6GB/s per drive
- Ultra-low latency (20μs)
Game changer for data-intensive workloads:
- Mission-Critical Databases
- Analytical Processing
- AI and Machine Learning
But It’s Not Always What You’d Expect
IOPS and Bandwidth Utilisation
- Applications struggle to use local NVMe performance beyond 3-4 drives
- Stranded IOPS and / or bandwidth = poor ROI
Sharing is the Logical Answer, with local latency
- Physical disaggregation is often operationally desirable
- 24 Drive servers are common and readily available
Data Protection Desired
- NVMe performs, but by itself offers no data protection
- Local data protection does not protect against server failures
Some NVMe-over-fabrics solutions offer controller-based data protection, but they limit IOPS and bandwidth, and sacrifice latency.
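To put a rough number on the stranded performance problem, here is a back-of-the-envelope sketch using the figures quoted above (1.2M IOPS per drive, 24-drive servers, and applications topping out at around 3-4 drives' worth of performance). The function names are mine, and the model assumes IOPS scale linearly with drive count, which is a simplification rather than anything Excelero claimed.

```python
# Figures taken from the text above; everything else is an illustrative assumption.
IOPS_PER_DRIVE = 1_200_000  # "up to 1.2M IOPS" per NVMe drive


def stranded_fraction(drives_installed: int, drives_usable: int) -> float:
    """Fraction of raw drive performance a single host cannot consume.

    Assumes performance scales linearly with drive count (a simplification).
    """
    if drives_installed <= 0:
        raise ValueError("drives_installed must be positive")
    usable = min(drives_usable, drives_installed)
    return 1 - usable / drives_installed


def stranded_iops(drives_installed: int, drives_usable: int,
                  iops_per_drive: int = IOPS_PER_DRIVE) -> int:
    """Raw IOPS left on the table under the same linear-scaling assumption."""
    usable = min(drives_usable, drives_installed)
    return (drives_installed - usable) * iops_per_drive


if __name__ == "__main__":
    for usable in (3, 4):
        frac = stranded_fraction(24, usable)
        iops = stranded_iops(24, usable)
        print(f"{usable} usable of 24 drives: {frac:.0%} stranded, {iops:,} IOPS unused")
```

Under those assumptions, a 24-drive server whose applications can only drive four drives' worth of IO leaves more than 80% of its raw capability unused, which is the ROI argument for sharing the drives across hosts.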
Scale Up Or Out?
NVMesh – Scale-out design: data centre scale
- Disaggregated & converged architecture
- No CPU overhead: no noisy neighbours
- Lowest latency: 5μs
NVEdge – Scale-up design: rack scale
- Disaggregated architecture
- Full bandwidth even at 4K IO
- Client-less architecture with NVMe-oF initiators
- Enterprise-ready: RAID 1/0, High Availability with fast failover, Thin Provisioning, CRC
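For anyone who hasn't looked at RAID 1/0 in a while, the sketch below shows the general idea of striping across mirrored pairs. It is a generic textbook layout, not a description of how NVEdge places data; the function name, the pairing of adjacent drives, and the stripe size are all assumptions made for illustration.

```python
from typing import List, Tuple


def raid10_locate(lba: int, num_drives: int, stripe_blocks: int) -> Tuple[List[int], int]:
    """Map a logical block address to (mirror drive IDs, block offset on each drive).

    Hypothetical layout: drives (0,1), (2,3), ... form mirror pairs, and
    stripe units are distributed round-robin across those pairs.
    """
    if num_drives < 2 or num_drives % 2 != 0:
        raise ValueError("RAID 1/0 needs an even number of drives (at least 2)")
    if lba < 0 or stripe_blocks <= 0:
        raise ValueError("lba must be non-negative and stripe_blocks positive")

    pairs = num_drives // 2
    stripe_no = lba // stripe_blocks   # which stripe unit the block falls in
    pair = stripe_no % pairs           # mirror pair that holds this stripe unit
    row = stripe_no // pairs           # how many units precede it on that pair
    offset = row * stripe_blocks + lba % stripe_blocks
    return [2 * pair, 2 * pair + 1], offset


if __name__ == "__main__":
    # Four drives, 8-block stripe units: show where a few logical blocks land.
    for lba in (0, 8, 16, 19):
        drives, offset = raid10_locate(lba, num_drives=4, stripe_blocks=8)
        print(f"LBA {lba}: written to drives {drives} at block offset {offset}")
```

Every write lands on both drives in a pair (the mirroring part), while consecutive stripe units rotate across pairs (the striping part), so the volume survives the loss of one drive per pair.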
Flexible Deployment Models
There are a few different ways you can deploy Excelero.
Converged – Local NVMe drives in Application Servers
- Single, unified storage pool
- NVMesh initiator and client on all nodes
- NVMesh bypasses server CPU
- Various protection levels
- No dedicated storage servers needed
- Linearly scalable
- Highest aggregate bandwidth
Top-of-Rack Flash
- Single, unified storage pool
- NVMesh Target runs on dedicated storage nodes
- NVMesh Client runs on application servers
- Applications get performance of local NVMe storage
- Various Protection Levels
- Linearly scalable
Data Protection
There are also a number of options when it comes to data resiliency.
[image courtesy of Excelero]
Networking Options
You can choose either TCP/IP or RDMA. TCP/IP incurs a latency hit, but it works with any NIC (and your existing infrastructure). RDMA offers very low latency, but is only available on a limited subset of NICs.
NVEdge Then?
Excelero described NVEdge as “block storage software for building NVMe Flash Arrays for demanding workflows such as AI, ML and databases in the Cloud and at the Edge”.
Scale-up architecture
- High NVMe AFA performance, leveraging NVMe-oF
- Full bandwidth performance even at 4K block size
High availability, supporting:
- Dual-port NVMe drives
- Dual controllers (with fast failover, less than 100ms)
- Active/active controller operation and active/passive logical volume access
Data services include:
- RAID 1/0 data protection
- Thin Provisioning: thousands of striped volumes of up to 1PB each
- Enterprise grade block checksums (CRC 16/32/64).
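As a reminder of what block checksums buy you, here is a minimal sketch of storing a CRC alongside each block and verifying it on read. It uses CRC-32 from Python's standard `zlib` module; the 4K block size, the function names, and the idea of keeping the checksum next to the data are assumptions for illustration, and none of it reflects how NVEdge implements its checksums.

```python
import zlib
from typing import Tuple

BLOCK_SIZE = 4096  # assumption: 4K blocks, in line with the 4K IO size mentioned above


def protect_block(data: bytes) -> Tuple[bytes, int]:
    """Return the block together with its CRC-32, as a write path might store them."""
    if len(data) != BLOCK_SIZE:
        raise ValueError(f"expected a {BLOCK_SIZE}-byte block")
    return data, zlib.crc32(data)


def verify_block(data: bytes, stored_crc: int) -> bool:
    """Recompute the CRC-32 on read and compare it with the stored value."""
    return zlib.crc32(data) == stored_crc


if __name__ == "__main__":
    block, crc = protect_block(bytes(BLOCK_SIZE))
    print("clean read verifies:", verify_block(block, crc))

    # Flip one bit to simulate silent corruption on the media or in flight.
    corrupted = bytes([block[0] ^ 0x01]) + block[1:]
    print("corrupted read verifies:", verify_block(corrupted, crc))
```

A wider CRC reduces the chance that a corrupted block happens to match its stored checksum, at the cost of a little more metadata per block.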
Hardware Compatibility?
Supported Platforms
- x86-based systems for higher aggregate performance
- SmartNIC-based architectures for lower power & cost
HW Requirements
- Each controller has PCIe connectivity to all drives
- Controllers can communicate over a network
- Controllers communicate over both the network and drive pairs to identify connectivity (failure) issues
Supported Networking
- RDMA (InfiniBand or Ethernet) or TCP/IP networking
Thoughts and Further Reading
NVMe has been a good news story for folks struggling with the limitations of the SAS protocol. I’ve waxed lyrical in the past about how impressed I was with Excelero’s offering. Not every workload is necessarily suited to NVMesh though, and NVEdge is an interesting approach to solving that problem. Where NVMesh provides a tonne of flexibility when it comes to deployment options and the hardware used, NVEdge doubles down on availability and performance for different workloads.
NVMe isn’t a handful of magic beans that will instantly speed up your storage workloads. You need to be able to feed it to really get value from it, and you need to be able to protect it too. It comes down to understanding what it is you’re trying to achieve with your applications, rather than just splashing cash on the latest storage protocol in the hope that it will make your business more money.
At this point I’d make some comment about data being the new oil, but I don’t really have enough background in the resources sector to be able to carry that analogy much further than that. Instead I’ll say this: data (in all of its various incarnations) is likely very important to your business. It might be something relatively straightforward like seismic data, financial results, or policy documents, or it might be the value that you can extract from that data by having fast access to a lot of it. Whatever you’re doing with it, you’re likely investing in hardware and software that helps you get to that value. Excelero appears to have focused on ensuring that the ability to access data in a timely fashion isn’t the thing that holds you back from achieving your data value goals.