ClearSky Data Are Here To Help

Disclaimer: I recently attended VMworld 2016 – US.  My flights were paid for by myself, VMware provided me with a free pass to the conference and various bits of swag, and Tech Field Day picked up my hotel costs. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.



ClearSky Data presented recently at Tech Field Day Extra VMworld US 2016. You can see video from the presentation here. My rough notes on the session are here.



Lazarus Vekiarides, CTO and Co-founder, took us through an overview. “ClearSky’s Global Storage Network delivers enterprise storage, spanning the entire data lifecycle, as a fully-managed service”. Sounds good. I like when people talk about lifecycles, and fully managed. These things are hard to do though.

ClearSky are aiming to provide “the performance and availability of on-premises storage with the economics and scale of the cloud”. They do this by focusing on:

  • economics
  • scalability
  • reliability
  • security
  • performance

According to ClearSky, we’ve previously used a “Fragmented Hybrid” model when it comes to cloud storage.


I must have been watching too much Better Off Ted with my eldest daughter, but when I heard of the Global Storage Network, it sounded a lot like something from a Veridian Dynamics advertisement. It’s not though, it’s cooler than that. With the Global Storage Network, ClearSky brings it all together.


You can read a whitepaper from ClearSky here, and there’s a data sheet here.


These Pictures are Compelling, But What Is It?

ClearSky say they are changing how enterprises access data:

  • eliminate storage silos
  • pay only for what you use – up to 100% useable storage only
  • guaranteed 100% uptime
  • multi-site data access without replication
  • maximum of 30-minute response time for Sev 1 and 2 tickets


This is all delivered via a consumption-based model. The idea behind this is that you get charged for only the capacity you use, but your applications have all the performance they need. Like all good consumption models, if you delete data, you give the space back to ClearSky and are no longer billed for any of it.
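To make the billing idea concrete, here's a rough sketch of how I picture a consumption meter working. The price per GB, the class name, and the volume names are all made up for illustration; ClearSky didn't share pricing in the session and this isn't their implementation.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical price: ClearSky's actual pricing was not covered in the session.
PRICE_PER_GB_MONTH = 0.10


@dataclass
class ConsumptionMeter:
    """Tracks the capacity each volume is actually using, in GB."""
    used_gb: Dict[str, float] = field(default_factory=dict)

    def write(self, volume: str, gb: float) -> None:
        # Writing data increases the billable footprint of the volume.
        self.used_gb[volume] = self.used_gb.get(volume, 0.0) + gb

    def delete(self, volume: str, gb: float) -> None:
        # Deleting data hands the space back, so it stops being billed.
        remaining = self.used_gb.get(volume, 0.0) - gb
        self.used_gb[volume] = max(remaining, 0.0)

    def monthly_charge(self) -> float:
        # The charge depends only on capacity in use, not on provisioned size.
        return round(sum(self.used_gb.values()) * PRICE_PER_GB_MONTH, 2)


meter = ConsumptionMeter()
meter.write("splunk-index", 500)
meter.write("vm-datastore", 1500)
charge_before = meter.monthly_charge()

meter.delete("vm-datastore", 1000)
charge_after = meter.monthly_charge()
```

The point of the sketch is just that the charge follows used capacity down as well as up, which is the bit that traditional "buy the array up front" models don't give you.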

“Customers simply plug into the ClearSky service to get the storage they need, when and where they need it, with the security, scalability and resilience that a business depends on.”


I’m Still Not Sure

That’s because I’m bad at explaining things. The service uses an edge appliance (2RU, 24 slots, about 6TB of flash cache). Cache is available (on resilient storage), but not copied. ClearSky POPs then offer distributed and optimised storage, with multiple copies to the cloud. Maybe a picture will explain it a bit better.


With this architecture, ClearSky manages the entire data lifecycle. Active data lives either next to your applications, or in the metro area near your applications. Any cold data, backup and DR stuff is stored as multiple copies of data geographically dispersed in the network.
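If the picture doesn't do it for you, here's how I think about the read path in code. This is a minimal sketch under my own assumptions: the tier names, the `Tier` class, and the "promote on hit" behaviour are mine, not something ClearSky described in detail.

```python
from typing import Dict, List, Optional, Tuple


class Tier:
    """One layer of storage: edge cache, metro POP, or cloud."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.blocks: Dict[int, bytes] = {}

    def get(self, block_id: int) -> Optional[bytes]:
        return self.blocks.get(block_id)

    def put(self, block_id: int, data: bytes) -> None:
        self.blocks[block_id] = data


def read_block(block_id: int, tiers: List[Tier]) -> Tuple[bytes, str]:
    """Search tiers from fastest to slowest; return the data and the tier that served it."""
    for depth, tier in enumerate(tiers):
        data = tier.get(block_id)
        if data is not None:
            # Assumption: a hit in a slower tier warms every faster tier.
            for faster in tiers[:depth]:
                faster.put(block_id, data)
            return data, tier.name
    raise KeyError(block_id)


edge, metro, cloud = Tier("edge"), Tier("metro"), Tier("cloud")
tiers = [edge, metro, cloud]

cloud.put(7, b"cold data")               # cold data starts out in the cloud
_, first_source = read_block(7, tiers)   # first read has to go all the way back
_, second_source = read_block(7, tiers)  # second read is served next to the app
```

In this toy version the first read comes from the cloud tier and the second from the edge, which is roughly the "active data lives next to your applications" idea.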

There’s support for iSCSI or FC today. The write-back cache is processed every 10 minutes and pushed to the metro cache or cloud.
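For anyone who hasn't dealt with write-back caching before, here's a small sketch of the general technique with a 10-minute flush interval. The class, the dict standing in for the metro cache or cloud, and the explicit `now` clock are my own simplifications; I don't know how ClearSky actually implements it.

```python
from typing import Dict, Set

# From the session: the write-back cache is processed every 10 minutes.
FLUSH_INTERVAL_SECONDS = 10 * 60


class WriteBackCache:
    """Acknowledges writes locally and pushes dirty blocks upstream on a timer."""

    def __init__(self, backend: Dict[int, bytes], now: float = 0.0) -> None:
        self.backend = backend          # hypothetical stand-in for metro cache or cloud
        self.cache: Dict[int, bytes] = {}
        self.dirty: Set[int] = set()
        self.last_flush = now

    def write(self, block_id: int, data: bytes) -> None:
        # The write lands in the local cache and is marked dirty.
        self.cache[block_id] = data
        self.dirty.add(block_id)

    def tick(self, now: float) -> int:
        """Flush dirty blocks if the interval has elapsed; return how many were flushed."""
        if now - self.last_flush < FLUSH_INTERVAL_SECONDS:
            return 0
        flushed = len(self.dirty)
        for block_id in self.dirty:
            self.backend[block_id] = self.cache[block_id]
        self.dirty.clear()
        self.last_flush = now
        return flushed


backend: Dict[int, bytes] = {}
cache = WriteBackCache(backend)
cache.write(1, b"a")
cache.write(2, b"b")

early = cache.tick(now=300)   # 5 minutes in: nothing is pushed yet
late = cache.tick(now=600)    # 10 minutes in: both dirty blocks go upstream
```

Passing the time in explicitly keeps the example deterministic; a real system would obviously use a scheduler and care a great deal more about what happens to dirty blocks if the appliance fails between flushes.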


What Do I Use It For?

Data in the ClearSky network can be accessed from multiple locations without replication, offering mobility and availability.

Multi-site availability

  • Load balancing and disaster recovery

Workload mobility

  • In-metro and cross-metro
  • Application data can be accessed from other metros

And you can use it in all the ways you think you would, including DR, DC migration, and load balancing.


Make it Splunky

You probably know that companies use Splunk to analyse machine data. I’ve used it at home to munge squid logs when trying to track my daughter’s internet use. Splunk captures, indexes and correlates machine data in a searchable repository from which it can generate graphs, reports, alerts, and visualisations. Splunk demands high performance and agile storage, and ClearSky have some experience with this. There’s also a Splunk Reference Architecture. ClearSky say they’re a good fit for Splunk Enterprise. The indexers simply write to the ClearSky Edge Cache and ClearSky manages index migration through cache and storage layers – greatly simplifying the solution. They also offer “[h]ighly consistent ingest performance, cloud capacity, and integrated backup using ClearSky snapshot technology”.



This was the first time I’d encountered ClearSky Data, and I liked the sound of a lot of what I heard. They make some big claims on performance, but the architecture seems to support these, at least on the face of it. I’m a fan of people who are into fully-managed data lifecycles. I hope to have the opportunity to dig further into this technology at some stage to see if they’re the real deal. People use caching solutions because they have the ability to greatly improve the perceived (and actual) performance of infrastructure. And managed services are certainly popular with enterprises looking at alternatives to their current, asset-heavy, models of storage consumption. If ClearSky can do everything it says it can, they are worth looking into further.