SwiftStack Announces 7

SwiftStack recently announced version 7 of its solution. I had the opportunity to speak to Joe Arnold and Erik Pounds from SwiftStack about the announcement and thought I’d share some thoughts here.

 

Insane Data Requirements

We spoke briefly about just how insane modern data requirements are becoming, in terms of both volume and performance. The example offered up was that of an Advanced Driver-Assistance System (ADAS). These things need a lot of capacity to work, with training data sets starting at 15PB and performance requirements approaching 100GB/s.

  • Autonomy – Level 2+
  • 10 Deep neural networks needed
  • Survey car – 2MP cameras
  • 2PB per year per car
  • 100 NVIDIA DGX-1 servers per car

When your hot data is 15 – 30PB and growing – it’s a problem.
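
To put those numbers in perspective, here’s a rough back-of-the-envelope sketch in Python. It assumes decimal units (1PB = 10^15 bytes), a perfectly sustained 100GB/s, and a hypothetical fleet of 10 survey cars. The function names are mine, not SwiftStack’s.

```python
PB = 10**15  # decimal petabyte, in bytes (assumption: vendor figures use decimal units)
GB = 10**9   # decimal gigabyte, in bytes


def full_pass_hours(dataset_bytes: int, throughput_bytes_per_s: int) -> float:
    """Hours needed to read the whole data set once at a sustained rate."""
    return dataset_bytes / throughput_bytes_per_s / 3600


def yearly_ingest_pb(cars: int, pb_per_car_per_year: float = 2.0) -> float:
    """Raw capacity added per year by a fleet of survey cars (2PB per car per year)."""
    return cars * pb_per_car_per_year


if __name__ == "__main__":
    for size_pb in (15, 30):
        hours = full_pass_hours(size_pb * PB, 100 * GB)
        print(f"{size_pb}PB at 100GB/s: {hours:.1f} hours per full pass")
    # The fleet size of 10 is a made-up figure for illustration only.
    print(f"10 cars add {yearly_ingest_pb(10):.0f}PB per year")
```

Even at 100GB/s, a single pass over 15PB takes a little under 42 hours, which goes some way to explaining why throughput matters as much as capacity here.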

 

What’s New In 7?

SwiftStack has been working to address those kinds of challenges with version 7.

Ultra-scale Performance Architecture

SwiftStack has managed to get some pretty decent numbers under its belt, delivering over 100GB/s at scale with a platform that’s designed to scale linearly to higher levels. The numbers stack up well against some of their competitors, and have been validated through:

  • Independent testing;
  • Comparing similar hardware and workloads; and
  • Results being posted publicly (with solutions based on Cisco Validated Designs).

 

ProxyFS Edge

ProxyFS Edge takes advantage of SwiftStack’s file services to deliver distributed file services between edge, core, and cloud. The idea is that you can use it for “high-throughput, data-intensive use cases”.

[image courtesy of SwiftStack]

Enabling functionality:

  • Containerised deployment of ProxyFS agent for orchestrated elasticity
  • Clustered filesystem enables scale-out capabilities
  • Caching at the edge, minimising latency for improved application performance
  • Load-balanced, high-throughput API-based communication to the core
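
I don’t have details on how ProxyFS Edge implements its cache, but the general idea of caching at the edge can be sketched as a small read-through LRU cache. Everything here (the `EdgeCache` class, the `fetch_from_core` callback, the capacity of 2) is hypothetical and purely illustrative.

```python
from collections import OrderedDict


class EdgeCache:
    """Toy read-through LRU cache sitting between an edge app and the core."""

    def __init__(self, fetch_from_core, capacity=2):
        self._fetch = fetch_from_core   # callable(key) -> bytes, the slow path
        self._capacity = capacity       # maximum number of cached items
        self._items = OrderedDict()     # key -> bytes, oldest entry first
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        value = self._fetch(key)          # go back to the core on a miss
        self._items[key] = value
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # evict the least recently used item
        return value
```

Repeat reads of hot data are served locally, and only misses pay the round trip to the core.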

 

1space File Connector

But what if you have a bunch of unstructured data sitting in file environments that you want to use with your more modern apps? 1space File Connector brings enterprise file data into the cloud namespace, and “[g]ives modern, cloud-native applications access to existing data without migration”. The thinking is that you can modernise your workflows incrementally, rather than having to deal with the app and the storage all in one go.

[image courtesy of SwiftStack]

Enabling functionality:

  • Containerised deployment of 1space File Connector for orchestrated elasticity
  • File data is accessible using S3 or Swift object APIs
  • Scales out and is load balanced for high-throughput
  • 1space policies can be applied to file data when migration is desired

The SwiftStack AI Architecture

SwiftStack has also developed a comprehensive AI Architecture model, describing it as “the customer-proven stack that enables deep learning at ultra-scale”. You can read more on that here.

Ultra-Scale Performance

  • Shared-nothing distributed architecture
  • Keep GPU compute complexes busy

Elasticity from Edge-to-Core-to-Cloud

  • With 1space, ingest and access data anywhere
  • Eliminate data silos and move beyond one cloud

Data Immutability

  • Data can be retained and referenced indefinitely as it was originally written
  • Enabling traceability, accountability, confidence, and safety throughout the life of a DNN

Optimal TCO

  • Compelling savings compared to public cloud or all-flash arrays

Real-World Confidence

  • Notable AI deployments for autonomous vehicle development

SwiftStack PRO

The final piece is the SwiftStack PRO offering, a support service delivering:

  • 24×7 remote management and monitoring of your SwiftStack production cluster(s);
  • Incorporating operational best-practices learned from 100s of large-scale production clusters;
  • Including advanced monitoring software suite for log aggregation, indexing, and analysis; and
  • Operations integration with your internal team to ensure end-to-end management of your environment.

 

Thoughts And Further Reading

The sheer scale of data enterprises are working with every day is pretty amazing. And data is coming from previously unexpected places as well. The traditional enterprise workloads hosted on NAS or in structured applications are insignificant in size when compared to the PB-scale stuff going on in some environments. So how on earth do we start to derive value from these enormous data sets? I think the key is to understand that data is sometimes going to be in places that we don’t expect, and that we sometimes have to work around that constraint. In this case, SwiftStack has recognised that not all data is going to be sitting in the core, or the cloud, and it’s using some interesting technology to get that data where you need it to get the most value from it.

Getting the data from the edge to somewhere useable (or making it useable at the edge) is one thing, but the ability to use unstructured data sitting in file with modern applications is also pretty cool. There’s often reticence associated with making wholesale changes to data sources, and this solution helps to make that transition a little easier. And it gives the punters an opportunity to address data challenges in places that may have been inaccessible in the past.

SwiftStack has good pedigree in delivering modern scale-out storage solutions, and it’s done a lot of work to ensure that its platform adds value. Worth checking out.

Komprise Continues To Gain Momentum

I first encountered Komprise at Storage Field Day 17, and was impressed by the offering. I recently had the opportunity to take a briefing with Krishna Subramanian, President and COO at Komprise, and thought I’d share some of my notes here.

 

Momentum

Funding

The primary reason for our call was to discuss Komprise’s Series C funding round of US $24 million. You can read the press release here. Some noteworthy achievements include:

  • Revenue more than doubled every single quarter, with existing customers steadily growing how much they manage with Komprise; and
  • Some customers now managing hundreds of PB with Komprise.

 

Key Verticals

Komprise are currently operating in the following key verticals:

  • Genomics and health care, with rapidly growing footprints;
  • Financial and Insurance sectors (5 out of 10 of the largest insurance companies in the world apparently use Komprise);
  • A lot of universities (research-heavy environments); and
  • Media and entertainment.

 

What’s It Do Again?

Komprise manages unstructured data over three key protocols (NFS, SMB, S3). You can read more about the product itself here, but some of the key features include the ability to “Transparently archive data”, as well as being able to put a copy of your data in another location (the cloud, for example).

 

So What’s New?

One of Komprise’s recent announcements was NAS to NAS migration. Say, for example, you’d like to migrate your data from an Isilon environment to FlashBlade: all you have to do is set one as the source and the other as the target. The ACLs are fully preserved across all scenarios, and Komprise does all the heavy lifting in the background.

They’re also working on what they call “Deep Analytics”. Komprise already aggregates file analytics data very efficiently. They’re now working on indexing metadata on files and exposing that index. This will give you “a Google-like search on all your data, no matter where it sits”. The idea is that you can find data using any combination of metadata. The feature is in beta right now, and part of the new funding is being used to expand and grow this capability.
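
To illustrate what searching on “any combination of metadata” might look like in principle, here’s a minimal Python sketch that walks a directory tree, records a few metadata fields per file, and filters on arbitrary predicates. This is not Komprise’s API or implementation; `FileRecord`, `build_index` and `search` are names I’ve made up, and a real index would be persistent and far more scalable.

```python
import os
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class FileRecord:
    path: str
    size: int        # bytes
    mtime: float     # last modification time, seconds since the epoch
    extension: str   # lower-case, including the leading dot


def build_index(roots: Iterable[str]) -> List[FileRecord]:
    """Walk each root and collect basic metadata for every regular file found."""
    records = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                full = os.path.join(dirpath, name)
                try:
                    st = os.stat(full)
                except OSError:
                    continue  # file vanished or is unreadable; skip it
                ext = os.path.splitext(name)[1].lower()
                records.append(FileRecord(full, st.st_size, st.st_mtime, ext))
    return records


def search(index: List[FileRecord],
           *predicates: Callable[[FileRecord], bool]) -> List[FileRecord]:
    """Return the records that satisfy every predicate."""
    return [r for r in index if all(p(r) for p in predicates)]
```

A query such as `search(index, lambda r: r.extension == ".xls", lambda r: r.size > 10**6)` would then return every spreadsheet larger than 1MB, regardless of which share it lives on.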

 

Other Things?

Komprise can be driven entirely from an API, making it potentially interesting for service providers and VARs wanting to add support for unstructured data and associated offerings to their solutions. You can also use Komprise to “confine” data. The idea behind this is that data can be quarantined (if you’re not sure it’s being used by any applications). Using this feature you can perform staged deletions of data once you understand what applications are using what data (and when).
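
The staged-deletion idea can be sketched in a few lines of Python: move files that haven’t been accessed for a while into a quarantine area, wait to see whether anything complains, and put them back if it does. This is only my illustration of the concept, not how Komprise does it. The `confine` and `restore` functions are hypothetical, the quarantine directory is assumed to live outside the share being scanned, and access times are assumed to be trustworthy (they often aren’t on file systems mounted with `noatime`).

```python
import os
import shutil
import time


def confine(root, quarantine, idle_days, now=None):
    """Move files not accessed for idle_days from root into quarantine.

    Returns the sorted list of relative paths that were moved.
    """
    now = time.time() if now is None else now
    cutoff = now - idle_days * 86400
    moved = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            if os.stat(src).st_atime < cutoff:
                rel = os.path.relpath(src, root)
                dst = os.path.join(quarantine, rel)
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                shutil.move(src, dst)
                moved.append(rel)
    return sorted(moved)


def restore(root, quarantine, rel_path):
    """Put a quarantined file back where it came from."""
    dst = os.path.join(root, rel_path)
    os.makedirs(os.path.dirname(dst), exist_ok=True)
    shutil.move(os.path.join(quarantine, rel_path), dst)
```

Only once the quarantine has sat untouched for an agreed period would you actually delete its contents.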

 

Thoughts

I don’t often write articles about companies getting additional funding. I’m always very happy when they do, as someone thinks they’re on the right track, and it means that people will continue to stay employed. I thought this was interesting enough news to cover though, given that unstructured data, and its growth and management challenges, is an area I’m interested in.

When I first wrote about Komprise I joked that I needed something like this for my garage. I think it’s still a valid assertion in a way. The enterprise, at least in the unstructured file space, is a mess based on what I’ve seen in the wild. Users and administrators continue to struggle with the sheer volume and size of the data they have under their management. Tools such as this can provide valuable insights into what data is being used in your organisation, and, perhaps more importantly, who is using it. My favourite part is that you can actually do something with this knowledge, using Komprise to copy, migrate, or archive old (and new) data to other locations to potentially reduce the load on your primary storage.

I bang on all the time about the importance of archiving solutions in the enterprise, particularly when companies have petabytes of data under their purview. Yet, for reasons that I can’t fully comprehend, a number of enterprises continue to ignore the problem they have with data hoarding, instead opting to fill their DCs and cloud storage with old data that they don’t use (and very likely don’t need to store). Some of this is due to the fact that some of the traditional archive solution vendors have moved on to other focus areas. And some of it is likely due to the fact that archiving can be complicated if you can’t get the business to agree to stick to their own policies for document management. In just the same way as you can safely delete certain financial information after an amount of time has elapsed, so too can you do this with your corporate data. Or, at the very least, you can choose to store it on infrastructure that doesn’t cost a premium to maintain. I’m not saying “Go to work and delete old stuff”. But, you know, think about what you’re doing with all of that stuff. And if there’s no value in keeping the “kitchen cleaning roster May 2012.xls” file any more, think about deleting it? Or, consider a solution like Komprise to help you make some of those tough decisions.

SwiftStack 6.0 – Universal Access And More

I haven’t covered SwiftStack in a little while, and they’ve been doing some pretty interesting stuff. They made some announcements recently but a number of scheduling “challenges” and some hectic day job commitments prevented me from speaking to them until just recently. In the end I was lucky enough to snaffle 30 minutes with Mario Blandini and he kindly took me through the latest news.

 

6.0 Then, So What?

Universal Access

Universal Access is really very cool. Think of it as a way to write data in either file or object format, and then read it back in file or object format, depending on how you need to consume it.

[image courtesy of SwiftStack]

Key features include:

  • Gateway free – the data is stored in cloud-native format in a single namespace;
  • Accessible via file (SMB3 / NFS4) and / or object API (S3 / Swift). Note that this is not a replacement for NAS, but it will give you the ability to work with some of those applications that expect to see file in places; and
  • Applications can write data one way, access the data another way, and vice versa.

The great thing is that, according to SwiftStack, “Universal Access enables applications to take advantage of all data under management, no matter how it was written or where it is stored, without the need to refactor applications”.
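
To make the “write it one way, read it another” idea concrete, here’s a toy Python model of a single namespace with a file view and an object view. It says nothing about how SwiftStack actually implements Universal Access; the `UnifiedNamespace` class and the mapping of `/<share>/<path>` to `<bucket>/<key>` are assumptions of mine for illustration.

```python
from pathlib import PurePosixPath


class UnifiedNamespace:
    """Toy single namespace exposing the same data as files and as objects."""

    def __init__(self):
        self._blobs = {}  # canonical "share/path" key -> bytes

    @staticmethod
    def _canonical(path):
        # Assumed mapping: /<share>/<dir>/<file> becomes <share>/<dir>/<file>.
        parts = PurePosixPath(path).parts
        if len(parts) < 3 or parts[0] != "/":
            raise ValueError("expected an absolute path like /<share>/<file>")
        return "/".join(parts[1:])

    # Object-style access: bucket plus key.
    def put_object(self, bucket, key, data):
        self._blobs[f"{bucket}/{key}"] = data

    def get_object(self, bucket, key):
        return self._blobs[f"{bucket}/{key}"]

    # File-style access: POSIX-like absolute path.
    def write_file(self, path, data):
        self._blobs[self._canonical(path)] = data

    def read_file(self, path):
        return self._blobs[self._canonical(path)]
```

With this model, data written via `write_file("/media/raw/clip.mov", ...)` comes back from `get_object("media", "raw/clip.mov")`, and vice versa, with no gateway or copy in between.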

 

Universal Access Multi-Cloud

So what if you take two really neat features like, say, Cloud Sync and Universal Access, and combine them? You get access to a single, multi-cloud, storage namespace.

[image courtesy of SwiftStack]

 

Thoughts

As Mario took me through the announcements he mentioned that SwiftStack are “not just an object storage thing based on Swift” and I thought that was spot on. Universal Access (particularly with multi-cloud) is just the type of solution that enterprises looking to add mobility to workloads are looking for. The problem for some time has been that data gets tied up in silos based on the protocol that a controller speaks, rather than the value of the data to the business. Products like this go a long way towards relieving some of the pressure on enterprises by enabling simpler access to more data. Being able to spread it across on-premises and public cloud locations also makes for simpler consumption models and can help businesses leverage the data in a more useful way than was previously possible. Add in the usefulness of something like Cloud Sync in terms of archiving data to public cloud buckets and you’ll start to see that these guys are onto something. I recommend you head over to the SwiftStack site and request a demo. You can read the press release here.

WekaIO Have Been Busy – Really Busy

WekaIO recently announced Version 3.1 of their Matrix software, and I had the good fortune to catch up with David Hiatt. We’d spoken a little while ago when WekaIO came out of stealth and they’ve certainly been busy in the interim. In fact, they’ve been busy to the point that I thought it was worth putting together a brief overview of what’s new.

 

What Is WekaIO?

WekaIO have been around since 2013, gaining their first customers in 2016. They’ve had 17 patents filed, 45 identified, and 8 issued. Their focus has primarily been on delivering, in their words, the “highest performance file system targeted at compute intensive applications”. They deliver a fully POSIX-compliant file system that can run on bare metal, hypervisors, Docker, or in the public or private cloud.

[image courtesy of WekaIO]

Some of the key features of the architecture include the fact that it is distributed, resilient at scale, can perform fast rebuilds, and provides end-to-end protection. Right now, their key use cases include genomics, machine learning, media rendering, semiconductors, financial trading and analytics. The company has staff coming from XIV, NetApp, IBM, EMC, and Intel, amongst others.

 

So What’s News?

Well, there’s been a bit going on:

 

Matrix Version 3.1 – Much Better Than Matrix Revolutions

Not that that’s too hard to do. But there have been a bunch of new features added to WekaIO’s Matrix software. Here’s a table that summarises the new features.

Feature               | Explanation
Network Redundancy    | Binding network links and load balancing
InfiniBand            | Native support for InfiniBand
Multiple File Systems | Logical partitioning allows more granular allocation of performance and capacity
Cluster Scaling       | Dynamically shrink and grow clusters
NVMe                  | Native support for NVMe devices
Snapshots and Clones  | High performance 4K granularity
Snap to Object Store  | Saving metadata of snap to OBS
Deployment in AWS     | Install and run Matrix on EC2 clusters

David also took me through what look like some very, very good SPECsfs2014 Software Build results, particularly when compared with some competitive solutions. He also walked me through the Marketplace configurator. This is really cool stuff – flexible and easy to use. You can check out a demo of it here.

 

Conclusion

All the cool kids are doing stuff with AWS. And that’s fine. But I really like that WekaIO also make stuff easy to run on-premises as well. And they also make it really fast. Because sometimes you just need to run stuff near you, and sometimes there needs to be an awful lot of it. WekaIO’s model is flexible, with the annual subscription approach and lack of maintenance contracts bound to appeal to a lot of people. The great thing is it’s easy to manage, easy to scale and supports all the file protocols you’d be interested in. There’s a bunch of (configurable) resiliency built in and support for hybrid workloads if required.

With a Formula One slide including customer testimonials from the likes of DreamWorks and SDSC, I get the impression that WekaIO are up to something pretty cool. Plus, I really enjoy chatting to David about what’s going on in the world of highly scalable file systems, and am looking forward to our next call in a few months’ time to see what they’ve been up to. I get the impression there’s little chance they’ll be sitting still.

Cloudian Announces HyperFile, Makes Object Better

Cloudian recently announced an addition to their HyperStore appliance. I had the opportunity to be briefed by Jon Toor and thought I’d share the highlights of the announcement here. I’ve had the pleasure of talking to Cloudian at a few Storage Field Day events. If you’re unfamiliar with the HyperStore 4000, you can read my coverage of it here. In short, it’s 840TB of object storage in 4RU with really, really, comprehensive S3 compliance, amongst other things.

 

HyperFile You Say?

HyperFile is the new file front-end controller for the HyperStore appliance. It supports the following features:

  • SMB3 and NFS3;
  • High Availability with active / passive controllers;
  • Non-disruptive failover;
  • POSIX compliance;
  • Active Directory / LDAP authentication;
  • Write Once Read Many (WORM); and
  • Snapshots.

It wouldn’t be a product announcement without a bezel shot. I can’t say whether this is actually what it looks like, but if it does, it’s kind of cool.

[image courtesy of Cloudian]

The appliance itself is 2RU with dual controllers and a shared backplane. The cool thing is that it can be deployed as VMs, making it appealing for service providers looking to set up multiple environments for customers. Supported hypervisors include vSphere 5.1 (or later) and KVM. Replication is handled at the HyperStore level.

Multi-tenancy is supported with dedicated controllers.

[image courtesy of Cloudian]

There’s a global namespace between file and object and it also supports a shared namespace across multiple NAS controllers, meaning you can up your number of controllers to increase bandwidth or replication performance. From a scalability perspective, it supports up to 64 namespaces per controller. One of my favourite features is what Cloudian call “converged access” between file and object, meaning you could use S3 for storing files. It also supports Microsoft Azure, Google Cloud Platform and Amazon S3 formats, opening up some interesting possibilities for file consumption on-premises and in the cloud.

There are two editions available. The Basic HyperFile NAS Controller includes:

  • Full protocol support;
  • High-availability;
  • Converged data access; and
  • Data migration.

The Enterprise HyperFile NAS Controller adds:

  • Snapshot;
  • WORM; and
  • Geo-distribution with file versioning/locking.

 

Thoughts

I’ve been a fan of Cloudian’s products for some time, and this addition to the HyperStore platform makes them a compelling option for file and object storage in the data centre. With this approach they’re looking to push further into Media Asset Management (MAM) and video surveillance solutions. The title of the post is misleading. Object is already pretty cool, and a very suitable solution for a number of workloads. So why would an object vendor need to add file to work in these industries? Isn’t object ideally suited to these kinds of workloads? Yes, but sometimes the leading software vendors and people in charge of workflows are focused on other things, like only supporting file. So Cloudian have adapted to take a bigger piece of the pie. In much the same way that some data protection solutions are still file oriented, the HyperFile allows Cloudian to play in areas where it’s traditionally been excluded.

I’m also a fan of the appliance as VM approach and I like the breadth of protocol support and cloud integration available. If you’re going to put cloud in the name of your company the expectation will be there that you know what you’re doing. Cloudian haven’t disappointed thus far. If you’re in the market for a solid object (and now file) solution, you could do worse than talking to the folks at Cloudian.

Storage Field Day 7 – Day 3 – Exablox

Disclaimer: I recently attended Storage Field Day 7.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

For each of the presentations I attended at SFD7, there are a few things I want to include in the post. Firstly, you can see video footage of the Exablox presentation here. You can also download my raw notes from the presentation here. Finally, here’s a link to the Exablox website that covers some of what they presented.

Brief Overview

Exablox was founded in 2010 and launched publicly in April 2013. There are two key elements to their solution:

  • OneBlox – scale-out storage for the enterprise, offering converged storage for primary and backup / archival data; and
  • OneSystem – manage on-premises storage exclusively from anywhere, providing visibility, control, and security without cost / complexity of traditional management

Here’s a photo of Tad Hunt (CTO and Co-founder) showing us the internals of the Exablox appliance.

[image]

 

Architecture

Exablox started the presentation by talking about what we want from storage re-imagined (my words, not theirs):

  • Scale out;
  • Deduplication;
  • Snapshots;
  • Replication;
  • Be simple yet powerful; and
  • Be managed from everywhere.

The Exablox approach is not your father’s standard storage presentation play. Rather than serving SMB / NFS from block storage, or exposing object storage only via APIs, it presents file protocols at the front-end and services these with object storage on the back-end.

[image: Exablox architecture diagram]

Technology Vision

Exablox’s approach revolves around software-defined storage (SDS) and storage management, with the following goals:

  • Manage the policy, not the technology;
  • SDS “wrapped in tin” for the mid market;
  • Eliminate complexity;
  • Plug-and-play; and
  • Next generation features.

They deliver NAS features atop object storage:

  • Without metadata servers;
  • Without bolt-on NAS gateways;
  • Without separate data and metadata servers; and
  • To scale capacity, performance, or resilience: just add a node.
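
I don’t know the internals of the Exablox object store, so treat the following as a generic sketch of one way file semantics can sit on top of an object back-end with deduplication thrown in. It assumes fixed-size chunks keyed by their SHA-256 digest (content addressing), which is my assumption rather than anything Exablox presented; `DedupFileStore` is a made-up name.

```python
import hashlib


class DedupFileStore:
    """Toy file layer over a flat, content-addressed object store."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.objects = {}    # SHA-256 hex digest -> chunk bytes
        self.manifests = {}  # file path -> ordered list of chunk digests

    def write(self, path, data):
        digests = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.objects.setdefault(digest, chunk)  # identical chunks are stored once
            digests.append(digest)
        self.manifests[path] = digests

    def read(self, path):
        return b"".join(self.objects[d] for d in self.manifests[path])

    def stored_bytes(self):
        """Bytes physically held in the object store after deduplication."""
        return sum(len(chunk) for chunk in self.objects.values())
```

Two files that share chunks only consume the space of the unique chunks, and each file is simply a list of references into the object store.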

 

Technology Benefits

Exablox say they can create scale-out NAS and object clusters atop mixed media – HDD, SSD, Shingled drives. This approach delivers the benefits of object storage technology to traditional applications:

  • By using standard file protocols; and
  • Eliminating forklift upgrades – single namespace across the scale of the cluster.

They also use “RAID-free” data protection:

  • Self-healing from multiple drive and node failures;
  • Rebalancing time proportional to the quantity of objects on the failed drive;
  • Mix and match drive types, capacities, technologies; and
  • Introduce next generation drives without long validation cycles.

This provides the ability to scale capacity from TB to PB easily, whilst also offering:

  • Zero configuration expansion; and
  • Manage from anywhere capability.

Exablox say they are able to support all NAS workloads well. Whereas other object stores are designed primarily for large files, a OneBlox 3308 can handle 1B objects. All nodes perform all functions: storage, control, NAS interface, with a node being a single failure domain.

 

Hardware Notes and Thoughts

For the purposes of this post, I wanted to focus on the OneBlox appliance. While the OneSystem architecture is super neat, I still get a bit of a nerd tingle when I see some nice hardware. (BTW if Exablox want me to test one long-term I’d be happy to oblige).

Exablox claims to be the sole provider of the following features in a single storage solution:

  • Scale-out deduplication;
  • Scale-out, continuous snapshots;
  • Scale-out, RAID-less capacity;
  • Scale-out, site-to-site disaster recovery; and
  • Bring any drive – one at a time at retail pricing.

They also support auto-clustering, with each node adding:

  • Capacity;
  • Performance; and
  • Resiliency.

The Exablox 3308 appliance:

  • Is seriously bloody quiet;
  • Uses 100W under peak load;
  • Has 8 * 3.5” drive bays, supporting up to 48 raw TB; and
  • Can use a mix of SATA & SAS drives.

Here is a picture of some appliances on a rack.

[image]

Further Reading

I was impressed with the strategy presented to me by Exablox, and the apparent ease of deployment and overall design of the appliance seemed great on the surface. I’d like to be clear that I haven’t used these in the wild, nor have I had any view of any benchmark data, so I can’t comment as to the effective performance of these devices. Like most things in storage, your mileage might vary. But I will say they seem quite inexpensive for what they do, and I recommend taking a more detailed look at them.

I also recommend you check out Keith’s preview post on Exablox.  For a different perspective on the hardware, have a look at Storage Review’s take on things as well.