Cloudian Announces HyperFile, Makes Object Better

Cloudian recently announced an addition to their HyperStore appliance. I had the opportunity to be briefed by Jon Toor and thought I’d share the highlights of the announcement here. I’ve had the pleasure of talking to Cloudian at a few Storage Field Day events. If you’re unfamiliar with the HyperStore 4000, you can read my coverage of it here. In short, it’s 840TB of object storage in 4RU with really, really, comprehensive S3 compliance, amongst other things.


HyperFile You Say?

HyperFile is the new file front-end controller for the HyperStore appliance. It supports the following features:

  • SMB3 and NFS3;
  • High Availability with active / passive controllers;
  • Non-disruptive failover;
  • POSIX compliance;
  • Active Directory / LDAP authentication;
  • Write Once Read Many (WORM); and
  • Snapshots.

It wouldn’t be a product announcement without a bezel shot. I can’t say whether this is actually what it looks like, but if it does, it’s kind of cool.

[image courtesy of Cloudian]

The appliance itself is 2RU with dual controllers and a shared backplane. The cool thing is that it can also be deployed as VMs, making it appealing for service providers looking to set up multiple environments for customers. Supported hypervisors include vSphere 5.1 (or later) and KVM. Replication is handled at the HyperStore level.

Multi-tenancy is supported with dedicated controllers.

[image courtesy of Cloudian]

There’s a global namespace between file and object and it also supports a shared namespace across multiple NAS controllers, meaning you can up your number of controllers to increase bandwidth or replication performance. From a scalability perspective, it supports up to 64 namespaces per controller. One of my favourite features is what Cloudian call “converged access” between file and object, meaning you could use S3 for storing files. It also supports Microsoft Azure, Google Cloud Platform and Amazon S3 formats, opening up some interesting possibilities for file consumption on-premises and in the cloud.
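
To make the “converged access” idea a bit more concrete, here’s a minimal sketch of how a file written to a NAS share could also be addressed as an S3-style bucket and key. This is my own illustration, not Cloudian’s implementation: the mount point `/mnt/hyperfile`, the function name `path_to_object`, and the rule that the first directory becomes the bucket are all assumptions.

```python
from pathlib import PurePosixPath


def path_to_object(share_root: str, file_path: str) -> tuple[str, str]:
    """Map a file under a share to a (bucket, key) pair.

    Hypothetical layout: the first directory below the share root is
    treated as the bucket, and the rest of the path becomes the key.
    """
    rel = PurePosixPath(file_path).relative_to(share_root)
    bucket, *rest = rel.parts
    if not rest:
        raise ValueError("path must name a file below the bucket directory")
    return bucket, "/".join(rest)


# A file saved over SMB/NFS, viewed as an object.
bucket, key = path_to_object("/mnt/hyperfile", "/mnt/hyperfile/media/2017/clip.mov")
print(bucket, key)
```

The point is simply that one piece of data has two names, a path for the file folks and a bucket/key for the object folks.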

There are two editions available. The Basic HyperFile NAS Controller includes

  • Full protocol support;
  • High-availability;
  • Converged data access; and
  • Data migration.

The Enterprise HyperFile NAS Controller adds

  • Snapshot;
  • WORM; and
  • Geo-distribution with file versioning/locking.



I’ve been a fan of Cloudian’s products for some time, and this addition to the HyperStore platform makes them a compelling option for file and object storage in the data centre. With this approach they’re looking to push further into Media Asset Management (MAM) and video surveillance solutions. The title of the post is misleading. Object is already pretty cool, and a very suitable solution for a number of workloads. So why would an object vendor need to add file to work in these industries? Isn’t object ideally suited to these kinds of workloads? Yes, but sometimes the leading software vendors and people in charge of workflows are focused on other things, like only supporting file. So Cloudian have adapted to take a bigger piece of the pie. In much the same way that some data protection solutions are still file oriented, the HyperFile allows Cloudian to play in areas where it’s traditionally been excluded.

I’m also a fan of the appliance as VM approach and I like the breadth of protocol support and cloud integration available. If you’re going to put cloud in the name of your company the expectation will be there that you know what you’re doing. Cloudian haven’t disappointed thus far. If you’re in the market for a solid object (and now file) solution, you could do worse than talking to the folks at Cloudian.

So NooBaa, eh?

Disclaimer: I recently attended VMworld 2016 – US.  My flights were paid for by myself, VMware provided me with a free pass to the conference and various bits of swag, and Tech Field Day picked up my hotel costs. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.



I had the opportunity to speak with NooBaa about six months ago. At the time they were still developing their product, but I thought it looked pretty cool. At Tech Field Day Extra, they demoed their cloud services engine. The company was founded by Yuval Dimnik (Co-founder and CEO) and Guy Margalit (Co-founder and CTO). If you’re familiar with Exanet or Dell FluidFS, you’ll be familiar with some of their capabilities. NooBaa was founded in 2014, with a product launch in September 2016, and a current headcount of 14 (they tell us they have a strong security/storage DNA).

“Customers don’t care how you do your tech, they care how it fixes their problems”


So NooBaa, eh?

They have thought about the name. A lot. It’s a pure software product enabling folks to create and provision cloud services:

  • Storage (like AWS S3) – First!
  • Serverless compute (like AWS Lambda) – Future

The key is that the customer owns the service, with

  • Full control of who accesses what, and what stays on-premises
  • No cloud vendor lock-in

The services use

  • Heterogeneous resources – cloud resources and servers
  • In the cloud, on-premises, and spanned

So, take all the spare storage you have lying about on Windows and Linux VMs, bang it all in a single namespace and present it back to your object-friendly apps. Replicate it to the cloud if you like. Or use all your spare clouds. Sounds like a cool idea.
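
Here’s a rough sketch of that pooling idea, assuming a very simple “put the next chunk on whichever node has the most free space” rule. The node names, sizes and placement rule are all mine for illustration. NooBaa’s actual placement logic wasn’t covered in this level of detail.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    kind: str      # "on-prem" or "cloud" (labels are illustrative)
    free_gb: int


def pool_capacity(nodes):
    """Total spare capacity presented as one namespace."""
    return sum(n.free_gb for n in nodes)


def place_chunks(nodes, chunk_count, chunk_gb):
    """Greedy placement: each chunk lands on the node with the most free space."""
    placement = []
    for _ in range(chunk_count):
        target = max(nodes, key=lambda n: n.free_gb)
        if target.free_gb < chunk_gb:
            raise RuntimeError("pool is full")
        target.free_gb -= chunk_gb
        placement.append(target.name)
    return placement


nodes = [
    Node("win-vm-01", "on-prem", 30),
    Node("linux-vm-07", "on-prem", 25),
    Node("cloud-bucket", "cloud", 20),
]
total_before = pool_capacity(nodes)
placement = place_chunks(nodes, chunk_count=3, chunk_gb=10)
print(total_before, placement)
```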

Design Considerations (once bitten, twice shy)

They wanted to design a product that behaves like the cloud, but gives you the choice to consume from on-premises or cloud.

But can you predict the unpredictable?

  • Cloud strategy? Everyone has one of those, they’re just not sure what it really means.
  • Growth rate? Oh, it grows a lot.
  • Hardware technologies? Yep, software still needs hardware.
  • Vendors? Who can really work out what they do?
  • Organisational changes?
  • Security issues and lurking “heart bleeds”?

Stuff is hard. Along with this, NooBaa were looking to add the following capabilities

  • On-premises, multi-cloud, and supporting cloud migration
  • P2P scalable capacity
  • Monitor hardware and adapt
  • Agnostic to the machine
  • Allowed to grow, allowed to shrink
  • User space as a religion – when you need to fix that you can do it right away


NooBaa is all about a hybrid approach to resources, supporting multiple cloud providers and on-premises resources. It also has support for multiple sites.


The key to NooBaa’s storage performance in what might seem to be non-performant environments is the way it stores data, as you can see in the below diagram.



Note that they’re not targeting low-latency workloads. At this stage they’re cloud agnostic and hoping to keep things that way. Heterogeneous resources are key for NooBaa. You can also sign up for the Community Edition – limited to 20TB aggregate object size.

Final Thoughts and Reading


The name doesn’t roll off the tongue, and the colour-scheme is very pretty. But I think this belies the thought that’s gone into this product. Yuval and his team have a strong background in scalable object storage, and I’m excited to see them finally come out of stealth. The concept of treating storage nodes as second class citizens is interesting, and I’m looking forward to taking the Community Edition for a spin when I get my act together in the near future. In the meantime, head over to Alastair’s blog for a more succinct write-up on what we saw. John White also did a great post here. You can grab a copy of my raw notes here, and watch NooBaa’s TFDx presentations here.


Caringo Announces SwarmNFS

Caringo recently announced SwarmNFS, and I recently had the opportunity to be briefed by Caringo’s Adrian J Herrera (VP Marketing). If you’re not familiar with Caringo, their main platform is Swarm, which “provides a platform for data protection, management, organization and search at massive scale”. You can read an overview of Swarm here, and there’s also a technical overview here.


So what is it?

SwarmNFS is a “stateless Linux process that integrates directly with Caringo Swarm. It delivers a global namespace across NFSv4, HTTP, SCSP (Caringo’s protocol), S3, and HDFS, delivering data distribution and data management at scale”.

SwarmNFS is basically an NFS server modified with proprietary code. It:

  • Is stateless and lightweight;
  • Has no caching or spooling;
  • Supports parallel data streaming; and
  • Has no single point of failure, with built-in high availability.

Caringo tell me this makes it a whole lot easier to centralise, distribute and manage data, while using a bunch fewer resources than a traditional file gateway. You can run it as either a Linux process, an appliance or via a VM. Caringo also tell me that, since they connect directly into Swarm, there are fewer bottlenecks than the traditional approach using gateways, FUSE and proxies.


Everything in the UI can be done via the API as well, and it has support for multi-tenancy. As I mentioned before, there’s a global namespace with “Universal Access”, meaning that files can be written, read and edited through any interface (NFSv4, SCSP/HTTP, S3, HDFS). Having been a protocol prisoner in previous roles it’s nice to think that there’s a different way to do things.


What do I use it for?

You can use this for all kinds of stuff. Adrian ran me through some use cases, including:

  • Media and entertainment (think media streaming / content delivery); and
  • Street view type image storage.

One of the key things here is that, because the platform uses NFS, a lot of application re-work doesn’t necessarily need to occur to take advantage of the object storage platform. In my opinion this is a pretty cool feature of the platform, and one that should definitely see people look at SwarmNFS fairly seriously when evaluating their object storage options.



Caringo are doing some really cool stuff. If you haven’t checked out FileFly before, it’s also worth a look. The capabilities of the Swarm platform are growing at a rapid pace. And the storage world is becoming more object and less block and file as each day passes. Enrico‘s been telling me that for ages now, and everything I’m seeing supports that. Caringo’s approach to metadata – storing metadata with the object itself – also means you can do a bunch of cool stuff with it fairly easily, like replicating it, applying erasure coding to it, and so forth. The upshot is that now the data’s truly portable. So, if you’re object-curious but still hang out with file types, maybe SwarmNFS might be a nice compromise for everyone.
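
If you want a feel for why keeping metadata with the object matters, here’s a tiny sketch. The `StoredObject` class and `replicate` function are hypothetical names of mine and have nothing to do with Swarm’s real on-disk format. The idea is just that when the metadata travels inside the object, a copy made anywhere is complete on its own.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    name: str
    data: bytes
    # Metadata lives inside the object rather than in a separate database.
    metadata: dict = field(default_factory=dict)


def replicate(obj: StoredObject) -> StoredObject:
    """Copy the object; the metadata comes along for free."""
    return copy.deepcopy(obj)


original = StoredObject(
    name="street-view/0001.jpg",
    data=b"...image bytes...",
    metadata={"content-type": "image/jpeg", "retention": "7y"},
)
replica = replicate(original)
print(replica.metadata)
```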


Exablox Isn’t Just Pretty Hardware

Disclaimer: I recently attended Storage Field Day 10.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.


Before I get started, you can find a link to my raw notes on Exablox‘s presentation here. You can also see videos of the presentation here.  You can find a preview post from Chris M. Evans here.


It’s Not Just the Hardware

I waxed lyrical about the Exablox hardware platform after seeing it at Storage Field Day 7. But while the OneBlox hardware is indeed pretty cool (you can see the specifications here), the cloud-based monitoring platform, OneSystem, is really the interesting bit.

According to Exablox, the “OneSystem application is used to combine OneBlox appliances into Rings as well as configuring shares, user access, and remote replication”. It’s the mechanism used for configuration, as well as monitoring, alerting and reporting.

OneSystem is built on a cloud-based, multi-tenant architecture. There’s nothing to install for organisations, VARs, and MSPs. Although if you feel a bit special about how your data is treated, there is an optional, private OneSystem deployment available for on-premises management. Exablox pride themselves on the “world-class” support they provide to customers, with a customer-first culture being one of the dominant themes when talking to them about support capability. Some of the other benefits of the OneSystem approach are:

  • The ability to globally manage OneBlox anywhere; and
  • The ability to deliver seamless OneBlox software upgrades.

Exablox also provide 24×7 proactive monitoring, providing insight into, amongst other things:

  • Storage utilisation and analysis;
  • Storage health and alerts; and
  • OneBlox drive health.

The cool thing about this platform is that it offers the ability to configure custom storage policies and simple scaling for individual applications. In this manner you can configure the following data services on a “per application” basis:

  • Variable or fixed-length deduplication;
  • Compression on/off;
  • Continuous data protection on/off and retention; and
  • Remote replication on/off.
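
As a sketch of what “per application” policies might look like if you wrote them down, here’s a small data structure covering the four data services listed above. The field names, defaults and application names are my own assumptions. This isn’t OneSystem’s actual schema or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StoragePolicy:
    dedup: str = "variable"                  # "variable" or "fixed" length
    compression: bool = True
    cdp_enabled: bool = True                 # continuous data protection
    cdp_retention_days: Optional[int] = 30   # None when CDP is off
    remote_replication: bool = False

    def __post_init__(self):
        if self.dedup not in ("variable", "fixed"):
            raise ValueError(f"unknown dedup mode: {self.dedup}")


# One policy per application (names are illustrative).
policies = {
    "file-shares": StoragePolicy(),
    "backup-target": StoragePolicy(dedup="fixed", cdp_enabled=False,
                                   cdp_retention_days=None,
                                   remote_replication=True),
}
print(policies["backup-target"])
```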


I Want My Data Everywhere

While the OneBlox ring is currently limited to 7 systems per cluster, you can have two or more (up to 10) clusters operating in a mesh for replication. You can then conceivably have a whole bunch of different data protection schemes in place depending on what you need to protect and where you need it protected. The great thing is that, with the latest version of OneSystem, you can have a one-to-many replication relationship between directories as well. This kind of flexibility is really neat in my opinion. Note that replication is asynchronous.
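
Some quick arithmetic on those limits, assuming (my assumption, not something Exablox stated) that “mesh” means every cluster can replicate with every other cluster:

```python
MAX_SYSTEMS_PER_RING = 7   # from the briefing
MAX_RINGS_IN_MESH = 10     # from the briefing


def mesh_links(rings: int) -> int:
    """Number of cluster pairs in a full mesh: n * (n - 1) / 2."""
    return rings * (rings - 1) // 2


max_systems = MAX_SYSTEMS_PER_RING * MAX_RINGS_IN_MESH
print(max_systems)                    # systems across a maxed-out mesh
print(mesh_links(MAX_RINGS_IN_MESH))  # possible replication pairs
```

So a fully built-out mesh would be 70 systems with up to 45 cluster-to-cluster relationships, before you even start counting one-to-many directory replication.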



Further Reading and Final Thoughts

If you’ve read any of my recent posts on the likes of Pure, Nimble and Tintri, it would feel like everyone and their dog is into cloud-based monitoring and analytics systems for storage platforms. This is in no way a bad thing, and something that I’m glad we’re seeing become a prevalent feature with these “modern” storage architectures. We store a whole bunch of data on these things. And sometimes it’s even data that is vital to the success of the various business endeavours we undertake on a daily basis. So it’s great to see vendors are taking this requirement seriously. It also helps somewhat that people are a little more comfortable with the concept of keeping information in “the cloud”. This certainly helps the vendors control the end user experience from a support viewpoint, rather than relying on arcane systems deployed across multiple VMs that invariably fail at the time you need to dig into the data to find out what’s really going on in the environment.

Exablox have come up with a fairly unique approach to scale-out NAS, and I’m keen to see where they take it from here. Features such as remote replication and the continuing maturity of the OneSystem platform make me think that they’re gearing up to push things a little beyond the BYO drives SMB space. I’ll be interested to see just how that plays out.

Ray Lucchesi did a thorough write-up on Exablox that you can read here, while Francesco Bonetti did a great write-up here. Exablox has also published a technical overview of OneBlox and OneSystem that is worth checking out.


SwiftStack Announces Object Storage Version 4.0

If you’ve not heard of SwiftStack before, they do “object storage for the enterprise”, with the core product built on OpenStack Swift. I recently had the opportunity to be briefed by Mario Blandini on their 4.0 announcement. Mario describes them as “Like Amazon cloud but inside your DC and behind your firewall”.

New SwiftStack 4.0 innovations introduced today (and available now or in the next 90 days) include:

  • Integrated load balancing reducing the need for expensive dedicated network hardware and minimizing latency and bandwidth costs while scaling to larger numbers of storage nodes
  • Metadata search increases business value with integrated third-party indexing and search services to make stored object data analytics-ready
  • SwiftStack Drive is an optional desktop client that enables access to objects directly from desktops or laptops
  • Enhanced management with new IPv6 support, capacity planning and advanced data migration tools


One of the key points in this announcement is the metadata search capability. Object storage is not just about “cheap and deep”, and the way we use metadata can have a big impact on the value of the data, often to applications that didn’t necessarily generate the data in the first place.
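
To illustrate why searchable metadata makes stored objects more useful, here’s a minimal sketch of an index built from object metadata. It’s plain Python of my own. SwiftStack’s integration uses third-party indexing and search services, and the object names and metadata keys below are made up.

```python
from collections import defaultdict


def build_index(objects):
    """Map each (key, value) metadata pair to the objects that carry it."""
    index = defaultdict(set)
    for name, meta in objects.items():
        for key, value in meta.items():
            index[(key, value)].add(name)
    return index


def search(index, **criteria):
    """Return objects matching every key=value criterion."""
    matches = [index.get(pair, set()) for pair in criteria.items()]
    return set.intersection(*matches) if matches else set()


objects = {
    "cam01/0001.mp4": {"camera": "cam01", "type": "video"},
    "cam02/0001.mp4": {"camera": "cam02", "type": "video"},
    "cam01/0001.jpg": {"camera": "cam01", "type": "image"},
}
index = build_index(objects)
print(search(index, camera="cam01", type="video"))
```

An application that never wrote those objects can still find the ones it cares about, which is the “value to other applications” bit.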

Like all good scale out solutions, you don’t need to buy everything up front, just what you need to get started. SwiftStack aren’t in the hardware business though, so you’ll be rolling your own. The hardware requirements for SwiftStack are here, and there’s also a reference architecture for Cisco.



SwiftStack have plans to introduce “Swift File Access” in 2016


Some of the benefits of this include:

  • Scale-out file services; SMB and NFS – minimizes the need for gateways
  • Fully bimodal > files can come in over SMB and be accessed through object APIs and vice versa
  • Integrated into the proxy role > performance scales independently of capacity

SwiftStack also have plans to introduce “Object Synchronization” in 2016


This will provide S3 Synchronization capability, including

  • Replication of objects to S3 buckets
  • Policy-driven > protecting and accessing files using centralized policies
  • Supporting any cloud compatible with the S3 API

This is pretty cool as there’s a lot of momentum within enterprises to consume data in places where it’s needed, not necessarily where it’s created.


Final Thoughts

Object storage is hot, because folks love cloud, and object is a big part of that. I like what object can do for storage, particularly as it relates to metadata and scale out performance. I’m happy to see SwiftStack making a decent play inside the enterprise, rather than aiming to be just another public cloud storage provider. I think they’re worth checking out, particularly if you have data that could benefit from object storage without necessarily having to live in the public cloud.

Storage Field Day 7 – Day 3 – Cloudian

Disclaimer: I recently attended Storage Field Day 7.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

For each of the presentations I attended at SFD7, there are a few things I want to include in the post. Firstly, you can see video footage of the Cloudian presentation here. You can also download my raw notes from the presentation here. Finally, here’s a link to the Cloudian website that covers some of what they presented.



Michael Tso, CEO and co-founder of Cloudian, provided us with a brief overview of the company. It was founded about 4 years ago, and a lot of the staff’s background was experience with hyper-scale messaging systems for big telcos. They now have about 65 staff.

Cloudian offers a software version as well as a hardware appliance that runs their HyperStore software. The hardware appliance comes in 3 different flavours:

  • Entry Level;
  • Capacity Optimised; and
  • Performance Optimised.

The software is supported on RedHat and CentOS.



Paul Turner, Chief Marketing and Product Officer, gave us an introduction to the architecture behind Cloudian. Their focus is on using commodity servers that provide scale-out capability, are durable, and are simple to use. “If you don’t make it dead easy to add nodes or remove nodes on the fly you don’t have a good platform”.

The platform uses

  • Erasure Coding;
  • Replication; and
  • Compression
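
The trade-off between replication and erasure coding is easier to see with numbers. The sketch below compares raw capacity overhead for the two approaches. The specific schemes (3 copies, and 4 data + 2 parity fragments) are examples I picked, not figures from Cloudian.

```python
def replication_overhead(copies: int) -> float:
    """Raw capacity consumed per unit of usable data with full copies."""
    return float(copies)


def erasure_overhead(data_fragments: int, parity_fragments: int) -> float:
    """Raw capacity per unit of usable data with k data + m parity fragments."""
    return (data_fragments + parity_fragments) / data_fragments


usable_tb = 100
raw_replicated = usable_tb * replication_overhead(3)   # 3 full copies
raw_erasure = usable_tb * erasure_overhead(4, 2)       # 4+2 scheme
print(raw_replicated, raw_erasure)
```

Both examples survive the loss of two copies or fragments, but three-way replication needs 300TB raw for 100TB usable while the 4+2 scheme needs 150TB.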

Here’s a picture of what’s inside:


Features include:

  • Natively S3;
  • Hybrid Storage Cloud;
  • Extreme durability;
  • Multi-tenant;
  • Geo-distribution;
  • Scale out;
  • Intelligence in Software;
  • Smart Support;
  • Data Protection;
  • QoS;
  • Programmable; and
  • Billing and Reporting.

They also make use of an Adaptive Policy Engine (multi-tenant, continuous, adaptive, policy engine), which offers:

  • Policy controlled virtual storage pools (buckets like Amazon);
  • Scale / reduce storage on demand;
  • Multi-tenanted with many application tenants on same infrastructure;
  • Dynamically adjust protection policies;
  • Optimise for small objects by policy; and
  • Cloud archiving by virtual pool.


Here’s a diagram of the logical architecture.


They use Cassandra as the core metadata and distribution mechanism. Why Cassandra? Well, it:


  • Supports 1000s of nodes
  • Adds capacity by adding nodes to running system
  • Distributed shared-nothing P2P architecture, with no single point of failure


  • Data durability, synced to disk
  • Resilient to network or hardware failures
  • Multi-DC replication
  • Tuneable data consistency level
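
Tuneable consistency is worth a quick worked example. The usual rule of thumb with Cassandra-style replication is that a read is guaranteed to see the latest acknowledged write when the number of replicas read (R) plus the number written (W) is greater than the replication factor (N). The replication factor of 3 below is just an example of mine, not a Cloudian setting.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas."""
    return replication_factor // 2 + 1


def reads_see_latest_write(n: int, w: int, r: int) -> bool:
    """True when read and write replica sets must overlap (R + W > N)."""
    return r + w > n


n = 3
q = quorum(n)
print(q)
print(reads_see_latest_write(n, w=q, r=q))  # quorum writes and quorum reads
print(reads_see_latest_write(n, w=1, r=1))  # fast, but only eventually consistent
```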

It also provides features such as

  • Vnodes, TTL, secondary indexes, compression, encryption


  • Write path especially fast

Multiple data protection policies, including:

  • NoSQL DB, Replicas, Erasure Coding

Policy features

  • ACL, QoS, Tiering, versioning, etc.


  • Virtual nodes (vnodes) are mapped to physical disks, so one disk failure only affects those vnodes;
  • Maximum of 256 vnodes per physical node, with no manual token management (tokens are randomly assigned);
  • Parallel I/O across nodes;
  • Increased repair speed in case of disk or node failure; and
  • Allows heterogeneous machines in a cluster.
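
Here’s a small sketch of the random-token idea, to show how keys end up spread across physical nodes. It’s a generic consistent-hashing ring of my own, not Cloudian’s or Cassandra’s code. I’ve used MD5 as a stand-in hash and made-up node names, and the 256 figure comes from my notes above.

```python
import bisect
import hashlib
import random

VNODES_PER_NODE = 256


def build_ring(nodes, vnodes=VNODES_PER_NODE, seed=0):
    """Give every physical node `vnodes` random tokens and sort them into a ring."""
    rng = random.Random(seed)
    return sorted((rng.getrandbits(64), node) for node in nodes for _ in range(vnodes))


def owner(ring, key: str) -> str:
    """Find the physical node owning the first token after the key's hash."""
    token = int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")
    tokens = [t for t, _ in ring]
    i = bisect.bisect_right(tokens, token) % len(ring)   # wrap around the ring
    return ring[i][1]


nodes = ["node-a", "node-b", "node-c"]
ring = build_ring(nodes)
counts = {n: 0 for n in nodes}
for i in range(3000):
    counts[owner(ring, f"object-{i}")] += 1
print(counts)
```

Because each physical node owns lots of small slices of the ring rather than one big one, losing a disk or node means many peers each pick up a little of the repair work in parallel.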


Further Reading and Final Thoughts

If you’re doing a bit with cloud storage, I think these guys are worth checking out. I particularly like the use case for Cloudian deployed as an on-premises S3 cloud behind the firewall. There’s also a Community Edition available for download. You can use HyperStore Community Edition software for:

  • Product evaluation;
  • Testing HyperStore software features in a single or multi-node install; and
  • Building 10TB object storage systems free of charge.

I think that’s pretty neat. I also recommend checking out Keith’s preview of Cloudian.


Storage Field Day 7 – Day 3 – Exablox

Disclaimer: I recently attended Storage Field Day 7.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

For each of the presentations I attended at SFD7, there are a few things I want to include in the post. Firstly, you can see video footage of the Exablox presentation here. You can also download my raw notes from the presentation here. Finally, here’s a link to the Exablox website that covers some of what they presented.

Brief Overview

Exablox was founded in 2010 and launched publicly in April 2013. There are two key elements to their solution:

  • OneBlox – scale-out storage for the enterprise, offering converged storage for primary and backup / archival data; and
  • OneSystem – manage on-premises storage exclusively from anywhere, providing visibility, control, and security without cost / complexity of traditional management

Here’s a photo of Tad Hunt (CTO and Co-founder) showing us the internals of the Exablox appliance.




Exablox started the presentation by talking about what we want from storage re-imagined (my words, not theirs):

  • Scale out;
  • Deduplication;
  • Snapshots;
  • Replication;
  • Be simple yet powerful; and
  • Be managed from everywhere.

The Exablox approach is not your father’s standard storage presentation play. Rather than presenting block storage, or object storage via APIs, it presents file protocols (SMB / NFS) at the front-end and services these with object storage on the back-end.
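
To show what “file on the front, object on the back” can look like in principle, here’s a toy sketch. A file is split into chunks, each chunk is stored as an object named by its hash, and the “file” is just an ordered list of object IDs. This is my own simplification with a tiny chunk size and made-up class names. I’m assuming a content-addressed design for illustration, not describing Exablox’s internals.

```python
import hashlib

CHUNK_SIZE = 4  # tiny, so the example is easy to follow


class ObjectStore:
    def __init__(self):
        self.objects = {}

    def put(self, data: bytes) -> str:
        oid = hashlib.sha256(data).hexdigest()
        self.objects.setdefault(oid, data)   # identical chunks are stored once
        return oid

    def get(self, oid: str) -> bytes:
        return self.objects[oid]


class FileFrontEnd:
    def __init__(self, store: ObjectStore):
        self.store = store
        self.files = {}   # path -> ordered list of object IDs

    def write(self, path: str, data: bytes) -> None:
        self.files[path] = [self.store.put(data[i:i + CHUNK_SIZE])
                            for i in range(0, len(data), CHUNK_SIZE)]

    def read(self, path: str) -> bytes:
        return b"".join(self.store.get(oid) for oid in self.files[path])


store = ObjectStore()
nas = FileFrontEnd(store)
nas.write("/share/a.txt", b"AAAABBBBAAAA")
nas.write("/share/b.txt", b"BBBB")
print(nas.read("/share/a.txt"), len(store.objects))
```

Two files and four chunks end up as only two stored objects, so deduplication falls out of the design rather than being bolted on.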


Technology Vision

Exablox’s approach revolves around software-defined storage (SDS) and storage management, with the following goals:

  • Manage the policy, not the technology;
  • SDS “wrapped in tin” for the mid market;
  • Eliminate complexity;
  • Plug-and-play; and
  • Next generation features.

They deliver NAS features atop object storage:

  • Without metadata servers;
  • Without bolt-on NAS gateways;
  • Without separate data and metadata servers; and
  • To scale capacity, performance, or resilience: just add a node.


Technology Benefits

Exablox say they can create scale-out NAS and object clusters atop mixed media – HDD, SSD, Shingled drives. This approach delivers the benefits of object storage technology to traditional applications:

  • By using standard file protocols; and
  • Eliminating forklift upgrades – single namespace across the scale of the cluster.

They also use “RAID-free” data protection:

  • Self-healing from multiple drive and node failures;
  • Rebalancing time proportional to the quantity of objects on the failed drive;
  • Mix and match drive types, capacities, technologies; and
  • Introduce next generation drives without long validation cycles.

This provides the ability to scale capacity from TB to PB easily, whilst also offering:

  • Zero configuration expansion; and
  • Manage from anywhere capability.

Exablox say they are able to support all NAS workloads well. Whereas other object stores are designed primarily for large files, a OneBlox 3308 can handle 1B objects. All nodes perform all functions: storage, control, NAS interface, with a node being a single failure domain.


Hardware Notes and Thoughts

For the purposes of this post, I wanted to focus on the OneBlox appliance. While the OneSystem architecture is super neat, I still get a bit of a nerd tingle when I see some nice hardware. (BTW if Exablox want me to test one long-term I’d be happy to oblige).

Exablox claims to be the sole provider of the following features in a single storage solution:

  • Scale-out deduplication;
  • Scale-out, continuous snapshots;
  • Scale-out, RAID-less capacity;
  • Scale-out, site-to-site disaster recovery; and
  • Bring any drive – one at a time at retail pricing.

They also support auto-clustering, with each node adding:

  • Capacity;
  • Performance; and
  • Resiliency.

The Exablox 3308 appliance:

  • Is seriously bloody quiet;
  • Uses 100W under peak load;
  • Has 8 * 3.5” drive bays, supporting up to 48 raw TB; and
  • Can use a mix of SATA & SAS drives.

Here is a picture of some appliances on a rack.


Further Reading

I was impressed with the strategy presented to me by Exablox, and the apparent ease of deployment and overall design of the appliance seemed great on the surface. I’d like to be clear that I haven’t used these in the wild, nor have I had any view of any benchmark data, so I can’t comment as to the effective performance of these devices. Like most things in storage, your mileage might vary. But I will say they seem quite inexpensive for what they do, and I recommend taking a more detailed look at them.

I also recommend you check out Keith’s preview post on Exablox.  For a different perspective on the hardware, have a look at Storage Review’s take on things as well.