Storage Field Day 15 – Wrap-up and Link-o-rama

Disclaimer: I recently attended Storage Field Day 15.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

This is a quick post to say thanks once again to Stephen and Ben, and the presenters at Storage Field Day 15. I had a super fun and educational time. For easy reference, here’s a list of the posts I did covering the events (they may not match the order of the presentations).

Storage Field Day – I’ll Be At Storage Field Day 15

Storage Field Day 15 – Day 0

Storage Field Day 15 – (Fairly) Full Disclosure

IBM Spectrum Protect Plus Has A Nice Focus On Modern Data Protection

Dropbox – It’s Scale Jim, But Not As We Know It

StarWind VTL? What? Yes, And It’s Great!

WekaIO – Not The Matrix You’re Thinking Of

Cohesity Understands The Value Of What Lies Beneath

Western Digital – The A Is For Active, The S Is For Scale

Come And Splash Around In NetApp’s Data Lake

Huawei – Probably Not What You Expected

Datrium Cloud DVX – Not Your Father’s Cloud Data Protection Solution

Hedvig’s Evolution

 

Also, here’s a number of links to posts by my fellow delegates (in no particular order). They’re all very smart people, and you should check out their stuff, particularly if you haven’t before. I’ll attempt to keep this updated as more posts are published. But if it gets stale, the Storage Field Day 15 landing page will have updated links.

 

Josh De Jong (@EuroBrew)

The Challenge Of Scale

Convergence Without Compromise

 

Glenn Dekhayser (@GDekhayser)

#SFD15: Datrium impresses

 

Chan Ekanayake (@ChanEk81)

Storage Field Day 15 – Introduction

Dropbox’s Magic Pocket: Power Of Software Defined Storage

A Look At The Hedvig Distributed Hybrid Cloud Storage Solution

Cohesity: A Secondary Storage Solution For The Hybrid Cloud?

NetApp’s & Next Generation Storage Technologies

 

Chin-Fah Heoh (@StorageGaga)

Always serendipitous Storage Field Days

Storage dinosaurs evolving too

Magic happening

Cohesity SpanFS – a foundational shift

NetApp and IBM gotta take risks

Own the Data Pipeline

Huawei Dorado – All about Speed

 

Mariusz Kaczorek (@Settlersoman)

 

Ray Lucchesi (@RayLucchesi)

Western Digital at SFD15: ActiveScale object storage

Huawei presents OceanStor architecture at SFD15

 

Dukagjin Maloku (@DugiDM)

Storage Field Day 15 … #SFD15

 

Michael Stanclift (@VMStan)

 

Lino Telera (@LinoTelera)

Back to Silicon Valley for Storage Field Day 15

Storage Field Day 15: Dropbox the high availability in a pocket

Storage Field Day 15: Cohesity the solution for secondary data

Storage Field Day 15: Weka.io

Storage Field Day 15: The open convergence by Datrium

 

Arjan Timmerman (@ArjanTim)

Starwind software: SFD15 preview

 

Dr Rachel Traylor (@Mathpocalypse)

Commentary: White Papers Dont Impress Me Much

Dialogue: What Do We Mean By Predictive Analytics?

Little’s Law: For Estimation Only

 

Vendor Posts

Datrium @ Storage TechFieldDay

Storage Field Day Wrap-up: How Cohesity is Disrupting Legacy Backup

 

Thanks.

Hedvig’s Evolution

Disclaimer: I recently attended Storage Field Day 15.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Hedvig recently presented at Storage Field Day 15. You can see videos of their presentation here, and download my rough notes from here.

 

More Hybrid Than Ever

It’s been a little while since I’ve spoken to Hedvig. Since that time they’ve built on a platform that was already pretty robust and feature-rich.

[image courtesy of Hedvig]

 

Features

If you’re unfamiliar with Hedvig, this post by Ray Lucchesi provides a nice overview of the offering. There are a number of nice features, including the fact that it’s hypervisor agnostic. You can also run the proxy on bare metal, deployed as a KVM instance. Each host requires a proxy, and there are two per host (active / passive) for HA. It provides protocol consolidation on a single platform and can do deduplication, compression and encryption at a virtual disk level. Workloads map to a virtual disk, and the deduplication is global (and can be toggled on / off at a virtual disk level). Deduplication is performed at block level with 4K granularity.

The default replication policy is “Agnostic” (let the system decide where to put the data), but you can also tell it that you need it to be “Rack Aware” or even “DC Aware”. The cool thing is that the same policies apply whatever protocol you use.
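If you’re wondering what that looks like in practice, here’s a rough sketch (in Python, and entirely my own illustration rather than anything from Hedvig’s code or API) of how a placement engine might pick three replica locations under “Agnostic”, “Rack Aware” and “DC Aware” policies.

```python
import random
from collections import namedtuple

# Hypothetical cluster layout - names and fields are assumptions for illustration.
Node = namedtuple("Node", ["name", "rack", "dc"])

CLUSTER = [
    Node("n1", "rack-a", "dc-1"), Node("n2", "rack-a", "dc-1"),
    Node("n3", "rack-b", "dc-1"), Node("n4", "rack-c", "dc-2"),
    Node("n5", "rack-d", "dc-2"), Node("n6", "rack-e", "dc-3"),
]

def place_replicas(nodes, copies=3, policy="agnostic"):
    """Pick `copies` nodes for a virtual disk's replicas.

    agnostic   - any distinct nodes (the system decides)
    rack_aware - every copy in a different rack
    dc_aware   - every copy in a different data centre
    """
    failure_domain = {
        "agnostic": lambda n: n.name,
        "rack_aware": lambda n: n.rack,
        "dc_aware": lambda n: n.dc,
    }[policy]
    chosen, used = [], set()
    for node in random.sample(nodes, len(nodes)):
        if failure_domain(node) not in used:
            chosen.append(node)
            used.add(failure_domain(node))
        if len(chosen) == copies:
            return chosen
    raise ValueError("cluster cannot satisfy the requested policy")

print([n.name for n in place_replicas(CLUSTER, policy="dc_aware")])
```

The point is simply that the same placement rules can sit underneath whichever protocol the virtual disk is presented over.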

Hedvig uses a concept called Containers (no, not those containers, or those containers). These are assigned to storage pools, and striped across 3 disks.

There is demarcation between metadata and data.

Data Process:

  • Local data persistence
  • Replication

Metadata Process:

  • Global knowledge of everything happening in the cluster

The solution can integrate with external KMS infrastructure if you’re into that sort of thing, and there’s a real focus on “correctness” of data in the system.

 

Hedvig’s Evolution

Hedvig already had a good story to tell in terms of scalable, software-defined storage by the time I saw them in 2016. Their recent presentation demonstrated not just some significant re-branding, but also increased maturity around the interface and data protection features on offer with the platform. Most of the demonstration time was spent in the Hedvig GUI, in stark contrast to the last time I saw them when there was an almost constant requirement to drop in to the CLI to do a variety of tasks. At the time this made sense as the platform was relatively new in the market. Don’t misunderstand me, I’m as much a fan as anyone of the CLI, but it feels like you’re in with a better chance of broad adoption if you can also present a useable GUI for people to leverage.

Of course, whether or not you have a snazzy HTML 5 UI means nothing if you don’t have a useful product sitting behind that interface. It was clear from Hedvig’s presentation that they certainly do have something worthy of further consideration, particularly given its focus on data protection, geo-resilience and storage efficiency. The fact that it runs on pretty much anything you can think of is also a bonus. I don’t think too many people would dispute that SDS has a lot of advantages over traditional storage deployments. It’s often a lot more accessible and provides an easier, cheaper entry point for deployment. It can often be easier to get changes and improvements made to the platform that aren’t necessarily tied to particular hardware architectures, and, depending on the software in play, it can often run on just about any bit of x86 compute you want it to. The real value of solutions like Hedvig’s is the additional data protection and efficiency features that provide performance, scalability and resilience beyond the standard 2-node, 1000 disk midrange offerings.

Hedvig seem to be listening to their current (and potential) customers and are making usability and reliability a key part of their offering. I look forward to seeing how this develops over the next 12 months.

Datrium Cloud DVX – Not Your Father’s Cloud Data Protection Solution

Disclaimer: I recently attended Storage Field Day 15.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Datrium recently presented at Storage Field Day 15. You can see videos of their presentation here, and download my rough notes from here. Datrium presented on both DVX (Distributed Virtual x) and Cloud DVX. In this article I’m going to focus on Cloud DVX.

 

Cloud DVX

Cloud DVX is “a cloud-native instance of Datrium DVX that offers recovery services for VMs running in DVX on-premises“. They say that Cloud DVX is “Cloud Backup Done Right”. They say it’s super simple to use, highly efficient, and delivers really fast recovery capabilities. They say a lot, so what are some of the use cases?

Consolidated Backup

  • Consolidated backups.
  • Faster access vs. off-site tapes.
  • Long retention – no media mgmt.

 

Recover to On-premises

  • Off-site backups.
  • Cloud as second or third site.
  • Retrieve on-prem for recovery.

 

Recover to Cloud [Future feature]

  • Cloud as the DR site.
  • On-demand DR infrastructure.
  • SAAS-based DR orchestration.

 

Thoughts and Further Reading

Datrium are positioning Cloud DVX as a far more cost-effective solution for cloud backup than simply storing your data on S3. For existing Datrium customers this solution makes a lot of sense. There are great efficiencies to be had through Datrium’s global deduplication and, based on the demo I saw, it looks to be a simple solution to get up and running. Get yourself an AWS account, add your key to your on-premises DVX environment, and set your protection policy. Then let Datrium take care of the rest. Datrium are really keen to make this an “iPhone to iCloud-like” experience.
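To give you a feel for how few moving parts are involved, here’s a purely hypothetical sketch of those steps. The policy fields are invented for illustration and are not Datrium’s actual API; the only real call here is the boto3 one that confirms the AWS credential works.

```python
import boto3

# The only real API call here: confirm the AWS credential you'd hand over actually works.
account = boto3.client("sts").get_caller_identity()["Account"]
print(f"Backing up to AWS account {account}")

# Hypothetical protection policy - field names invented for illustration,
# not taken from Datrium's product or API.
protection_policy = {
    "name": "tier1-vms-to-cloud",
    "schedule": "every 4 hours",
    "local_retention_days": 30,
    "cloud_retention_days": 365,
    "cloud_target": {"provider": "aws", "bucket": "example-dvx-backups"},
}
print(protection_policy)
```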

I’ve been working with data protection solutions for some time now. I’m not a greybeard by any stretch but there’s certainly some silver in those sideburns. In my opinion, data protection solutions have been notoriously complicated to deploy in a cost-effective and reliable manner. It often seems like the simple act of protecting critical data (yes, I’m overstating things a bit) has to be made difficult in order for system administrators to feel a sense of accomplishment. The advent of cloud has moved the goalposts again, with a number of solutions being positioned as “cloud-ready” that really just add to the complexity of the solution rather than giving enterprises what they need: a simple and easy way to protect and recover data in a cost-effective fashion. Data protection shouldn’t be the pain in the rear that it is today. And shoving traditional data protection solutions onto platforms such as AWS or Azure and calling them “cloud ready” is disingenuous at best and, at worst, really quite annoying. That’s why something like Cloud DVX, coupled with Datrium’s on-premises solution, strikes me as an elegant solution that could really change the way people protect their traditional applications.

Datrium have plans for the future that involve some on-demand disaster recovery services and other orchestration pieces that will be built around Cloud DVX. Their particular approach to “open” convergence has certainly made some waves in the market and generated a lot of interest. The architecture is not what we’ve come to see in “traditional” converged and hyper-converged systems (can we say traditional for this stuff yet?) and delivers a number of efficiencies in terms of cost and performance that make for a compelling solution. The company was founded by a lot of pretty smart folks who know a thing or two about data efficiency on storage platforms (amongst other things), so they might just have a chance at making this whole “open convergence” thing work (I need to lay off the air quotes too).

You can grab the datasheet here, a copy of the DVX solution brief here, an article from El Reg here and a post by Storage Review here. Glenn Dekhayser also did a great article on Datrium that you can read here.

Huawei – Probably Not What You Expected

Disclaimer: I recently attended Storage Field Day 15.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Huawei recently presented at Storage Field Day 15. You can see videos of their presentation here, and download my rough notes from here.

 

Back In The Day

Huawei have been working with Flash technologies since 2005.

[image courtesy of Huawei]

 

Their first AFA came to market in 2011 (the Dorado 2100 & 5100, offering 6Gbit/s SAS SSDs) and in 2017 they released their first NVMe AFA. This was a refresh of the Dorado line and offers 0.5ms latency and 3:1 data reduction (more on that later).

Interestingly, they also make their own SSDs, along with their own storage controller chips.

 

It’s A Box

Not the kind of box that the good lads at Pied Piper tried to sell, but there is hardware involved. And software for that matter. It supports a lot of the features you’d look for in a modern storage platform, including:

  • Thin provisioning;
  • In-line global deduplication;
  • Support for VMware VVOLs;
  • Intelligent QoS;
  • In-line compression;
  • Synchronous and asynchronous replication;
  • Heterogeneous storage virtualisation;
  • Writable snapshots;
  • Metro clustering via synchronous replication;
  • LUN migration and cloning; and
  • Internal encryption key management.

From a hardware perspective, there are a variety of speeds and feeds available supporting iSCSI, FC and InfiniBand.

 

Let’s Do Something Nice With The Data

Data protection capabilities, such as snapshots and clones, have been around for some time, and Huawei provide the ability to do both. HyperSnap provides support for “instant, writable snapshots with no performance penalty”, while HyperClone provides the ability to create clones quickly and easily. There’s also data efficiency technology in the form of SmartDedupe and SmartCompression. Huawei seem quite keen to put their money where their mouth is (can a company do that?), and offer a 3:1 Dedupe guarantee. This doesn’t apply to every bit of data (i.e. not your movie collection) but it does apply to database and virtual machine data (or other data depending on the output of their data reduction evaluation tool). The great news is it’s flexible (meaning you can turn deduplication and compression on or off individually) as well as being global and inline.
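To put the 3:1 number in context, here’s some quick back-of-the-envelope arithmetic (mine, not Huawei’s) on what a guaranteed reduction ratio does to effective capacity and cost per usable TB. The dollar figure is an assumption for illustration only.

```python
raw_tb = 100.0            # usable flash capacity before data reduction
reduction_ratio = 3.0     # the guaranteed ratio for qualifying data sets
price_per_raw_tb = 2000   # assumed $/TB, purely illustrative

effective_tb = raw_tb * reduction_ratio
price_per_effective_tb = price_per_raw_tb / reduction_ratio

print(f"{raw_tb:.0f} TB raw -> {effective_tb:.0f} TB effective at {reduction_ratio:.0f}:1")
print(f"${price_per_raw_tb}/raw TB -> ${price_per_effective_tb:.0f}/effective TB")
# 100 TB raw -> 300 TB effective at 3:1
# $2000/raw TB -> $667/effective TB
```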

They also provide “Evolvable Data Protection Schemes” which basically means you can start with a presence in one data centre and evolve your replication / remote site protection strategy as you go with limited re-configuration necessary. The fan-in ratio goes up to 64:1 too, which is a nice high number.

[image courtesy of Huawei]

 

Thoughts And Further Reading

I’m the first to admit I was pretty ignorant of Huawei’s storage portfolio prior to this presentation. I’ve seen Huawei logos in data centres before, but they were mostly on the front of comms gear, not storage devices. Clearly, they’ve been in the game for a while and are doing some pretty neat stuff with their platform. On the surface, it looks like they’re doing a lot of the things that the likes of EMC and IBM have done before. But once you dig into the architecture a little, it becomes clear that there’s some really cool tech being deployed, and that a lot of thought and research has gone into the product.

Whilst a number of the features available in the platform might seem like table stakes to some, it strikes me that Huawei have gone to some effort to ensure that the capabilities people really rely on in a storage platform, such as data integrity and protection options, are delivered in a robust and reliable fashion. I’m always a fan of companies that back themselves when it comes to data efficiency guarantees, and I appreciate that Huawei call out the fact that you should run everything through their tool before they guarantee the guarantee. I’m also a big fan of flexible replication options, with the ability to replicate to multiple DCs being a nice touch. Sure, it might have been a while since you found yourself in that position, but there are plenty of enterprises out there that will benefit from this workhorse approach to performance storage. The focus on hardware and software integration is not as common as it once was, but Huawei are doing a decent job of controlling their own destiny by sticking with that approach.

Huawei probably aren’t at the top of your shopping list when it comes to enterprise storage arrays, but after watching their presentation I think they’re worthy of a second look.

Come And Splash Around In NetApp’s Data Lake

Disclaimer: I recently attended Storage Field Day 15.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

NetApp recently presented at Storage Field Day 15. You can see videos of their presentation here, and download my rough notes from here.

 

You Say Day-ta, I Say Dar-ta

Santosh Rao (Senior Technical Director, Workloads and Ecosystems) took us through some of the early big data platform challenges NetApp are looking to address.

 

Early Generation Big Data Analytics Platform

These were designed to deliver initial analytics solutions and were:

  • Implemented as a proof of concept; and
  • Focused on solving a point project need.

The primary considerations of these solutions were usually cost and agility. The focus was to:

  • Limit up front costs and get the system operational quickly; and
  • Treat scalability, availability, and governance as afterthoughts.

A typical approach to this was to use cloud or commodity infrastructure. This ended up becoming the final architecture. The problem with this approach, according to NetApp, is that it led to unpredictable behaviour as copies manifested. You’d end up with 3-5 replicas of data copied across lines of business and various functions. Not a great situation.

 

Early Generation Analytics Platform Challenges

Other challenges with this architecture included:

  • Unpredictable performance;
  • Inefficient storage utilisation;
  • Media and node failures;
  • Total cost of ownership;
  • Not enterprise ready; and
  • Storage and compute tied (creating imbalance).

 

Next Generation Data Pipeline

So what do we really need from a data pipeline? According to NetApp, the key is “Unified Insights across LoBs and Functions”. By this they mean:

  • A unified enterprise data lake;
  • Federated data sources across the 2nd and 3rd platforms;
  • In-place access to the data pipeline (copy avoidance);
  • Spanned across edge, core and cloud; and
  • Future proofed to allow shifts in architecture.

Another key consideration is the deployment. The first proof of concept is performed by the business unit, but it needs to scale for production use.

  • Scale edge, core and cloud as a single pipeline
  • Predictable availability
  • Governance, data protection, security on data pipeline

This provides for a lower TCO over the life of the solution.

 

Data Pipeline Requirements

We’re not just playing in the core any more, or exclusively in the cloud. This stuff is everywhere. And everywhere you look the requirements differ as well.

Edge

  • Massive data (few TB/device/day)
  • Real-time Edge Analytics / AI
  • Ultra Low Latency
  • Network Bandwidth
  • Smart Data Movement

Core

  • Ultra high IO bandwidth (20 – 200+ GBps)
  • Ultra-low latency (micro – nanosecond)
  • Linear scale (1 – 128 node AI)
  • Overall TCO for 1-100+ PB

Cloud

  • Cloud analytics, AI/DL/ML
  • Consume and not operate
  • Cloud vendor vs on-premises stack
  • Cost-effective archive
  • Need to avoid cloud lock-in
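Those requirements are easier to reason about with some rough numbers attached. The conversions below are my own back-of-the-envelope arithmetic based on the bullet points above, not NetApp sizing guidance.

```python
# Edge: "a few TB per device per day" expressed as sustained bandwidth (assume 3 TB).
tb_per_device_per_day = 3
mb_per_second = tb_per_device_per_day * 1e6 / 86_400   # decimal units: 1 TB = 1e6 MB
print(f"{tb_per_device_per_day} TB/day/device ~ {mb_per_second:.0f} MB/s sustained per device")

# Core: time for a full pass over 1 PB at the quoted 20-200+ GB/s of aggregate bandwidth.
petabyte_in_gb = 1e6
for gb_per_second in (20, 200):
    hours = petabyte_in_gb / gb_per_second / 3600
    print(f"Full pass over 1 PB at {gb_per_second} GB/s ~ {hours:.1f} hours")
# 3 TB/day/device ~ 35 MB/s sustained per device
# Full pass over 1 PB at 20 GB/s ~ 13.9 hours
# Full pass over 1 PB at 200 GB/s ~ 1.4 hours
```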

Here’s a picture of what the data pipeline looks like for NetApp.

[Image courtesy of NetApp]

 

NetApp provided the following overview of what the data pipeline looks like for AI / Deep Learning environments. You can read more about that here.

[Image courtesy of NetApp]

 

What Does It All Mean?

NetApp have a lot of tools at their disposal, and a comprehensive vision for meeting the requirements of big data, AI and deep learning workloads from a number of different angles. It’s not just about performance, it’s about understanding where the data needs to be to be considered useful to the business. I think there’s a good story to tell here with NetApp’s Data Fabric, but it felt a little like there remains some integration work to do. Big data, AI and deep learning mean different things to different people, and there’s sometimes a reluctance to change the way people do things for the sake of adopting a new product. NetApp’s biggest challenge will be demonstrating the additional value they bring to the table, and the other ways in which they can help enterprises succeed.

NetApp, like some of the other Tier 1 storage vendors, has a broad portfolio of products at its disposal. The Data Fabric play is a big bet on being able to tie this all together in a way that their competitors haven’t managed to do yet. Ultimately, the success of this strategy will rely on NetApp’s ability to listen to customers and continue to meet their needs. As a few companies have found out the hard way, it doesn’t matter how cool you think your idea is, or how technically innovative it is, if you’re not delivering results for the business you’re going to struggle to gain traction in the market. At this stage I think NetApp are in a good place, and hopefully they can stay there by continuing to listen to their existing (and potentially new) customers.

For an alternative perspective, I recommend reading Chin-Fah’s thoughts from Storage Field Day 15 here.

Western Digital – The A Is For Active, The S Is For Scale

Disclaimer: I recently attended Storage Field Day 15.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

   

Western Digital recently presented at Storage Field Day 15. You might recall there are a few different brands under the WD umbrella, including Tegile and HGST, and folks from both presented during Storage Field Day 15. I’d like to talk about the ActiveScale session, however, mainly because I’m interested in object solutions. I’ve written about Tegile previously, although obviously a fair bit has changed for them too. You can see their videos from Storage Field Day 15 here, and download a PDF copy of my rough notes from here.

 

ActiveScale, Probably Not What You Thought It Was

ActiveScale isn’t some kind of weight measurement tool for exercise fanatics, but rather the brand of scalable object system that HGST sells. It comes in two flavours: the P100 and X100. Apparently the letters in product names sometimes do mean things, with the “P” standing for Petabyte, and the “X” for Exabyte (possibly in the same way that X stands for Excellent). From a speeds and feeds perspective, the typical specs are as follows:

  • P100 – starts as low as 720TB, goes to 18PB. 17x 9s data durability, 4.6KVA typical power consumption; and
  • X100 – 5.4PB in a rack, 840TB – 52PB, 17x 9s data durability, 6.5KVA typical power consumption.
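Seventeen nines is an easy number to quote and a hard one to picture, so here’s a quick sanity check (my arithmetic, not HGST’s, and assuming durability is quoted on an annual, per-object basis) on what it implies for expected object loss.

```python
nines = 17
annual_loss_probability = 10 ** -nines      # per object, per year (assumption: annual figure)
objects_stored = 10 ** 12                   # assume a trillion objects in the namespace

expected_losses_per_year = objects_stored * annual_loss_probability
years_per_lost_object = 1 / expected_losses_per_year

print(f"Expected losses per year: {expected_losses_per_year:.0e} objects")
print(f"Roughly one lost object every {years_per_lost_object:,.0f} years")
# Expected losses per year: 1e-05 objects
# Roughly one lost object every 100,000 years
```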

You can scale out to 9 expansion racks, with 52PB of scale out object storage goodness per namespace. Some of the key capabilities of the ActiveScale platform include:

  • Archive and Backup;
  • Active Data for Analytics;
  • Data Forever Architecture;
  • Versioning;
  • Encryption;
  • Replication;
  • Single Pane Management;
  • S3 Compatible APIs;
  • Multi-Geo Availability Zones; and
  • Scale Up and Scale Out.

They use “BitSpread” for dynamic data placement and you can read a little about their erasure coding mechanism here. “BitDynamics” assures continuous data integrity, offering the following features:

  • Background – verification process always running
  • Performance – not impacted by verification or repair
  • Automatic – all repairs happen with no intervention

There’s also a feature called “GeoSpread” for geographical availability.

  • Single – Distributed erasure coded copy;
  • Available – Can sustain the loss of an entire site; and
  • Efficient – Better than 2 or 3 copy replication.

 

What Do I Use It For Again?

Like a number of other object storage systems in the market, ActiveScale is being positioned as a very suitable platform for:

  • Media & Entertainment
    • Media Archive
    • Tape replacement and augmentation
    • Transcoding
    • Playout
  • Life Sciences
    • Bio imaging
    • Genomic Sequencing
  • Analytics

 

Thoughts And Further Reading

Unlike a lot of people, I find technical sessions discussing object storage at extremely large scale to be really interesting. It’s weird, I know, but there’s something that I really like about the idea of petabytes of storage servicing media and entertainment workloads. Maybe it’s because I don’t frequently come across these types of platforms in my day job. If I’m lucky I get to talk to folks about using object as a scalable archive platform. Occasionally I’ll bump into someone doing life sciences work in a higher education setting, but they’ve invariably built something that’s a little more home-brew than HGST’s offering. Every now and then I’m lucky enough to spend some time with media types who regale me with tales of things that go terribly wrong when the wrong bit of storage infrastructure is put in the path of a particular editing workflow or transcode process. Oh how we laugh. I can certainly see these types of scalable platforms being a good fit for archive and tape replacement. I’m not entirely convinced they make for a great transcode or playout platform, but I’m relatively naive when it comes to those kinds of workloads. If there are folks reading this who are familiar with that kind of stuff, I’d love to have a chat.

But enough with my fascination with the media and entertainment industry’s infrastructure requirements. From what I’ve seen of ActiveScale, it looks to be a solid platform with a lot of very useful features. Coupled with the cloud management feature it seems like they’re worth a look. Western Digital aren’t just making hard drives for your NAS (and other devices), they’re doing a whole lot more, and a lot of it is really cool. You can read El Reg’s article on the X100 here.

Cohesity Understands The Value Of What Lies Beneath

Disclaimer: I recently attended Storage Field Day 15.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Cohesity recently presented at Storage Field Day 15. It’s not the first time I’ve spoken about them, and you can read a few of my articles on them here and here. You can see their videos from Storage Field Day 15 here, and download a PDF copy of my rough notes from here.

 

The Data Centre Is Boring

Well, not boring exactly. Okay, it’s a little boring. Cohesity talk a lot about the concept of secondary storage and, in their view, most of the storage occupying the DC is made up of secondary storage. Think of your primary storage tier as your applications, and your secondary storage as being comprised of:

  • Backups;
  • Archival data;
  • Analytics;
  • Test/Dev workloads; and
  • File shares.

In other words, it’s a whole lot of unstructured data. Cohesity like to talk about the “storage iceberg”, and it’s a pretty reasonable analogy for what’s happening.

[Image courtesy of Cohesity]

 

Cohesity don’t see all this secondary data as simply a steaming pile of unmanaged chaos and pain. Instead, they see it as a potential opportunity for modernisation. The secondary storage market has delivered, in Cohesity’s view, an opportunity to “[c]lean up the mess left by enterprise backup products”. The idea is that you can use an “Apple-like UI”, operating at “Google-like scale”, to consolidate workloads on the Cohesity DataPlatform and then take advantage of copy data management to really extract value from that data.

 

The Cohesity Difference

So what differentiates Cohesity from other players in the secondary storage space?

Mohit Aron (pictured above) took us through a number of features in the Cohesity DataPlatform that are making secondary storage both useful and interesting. These include:

  • Global Space Efficiency
    • Variable length dedupe
    • Erasure coding
  • QoS
    • Multi workload isolation
    • Noisy neighbour prevention
  • Instant Mass Restore
    • Any point in time
    • Highly available
  • Data Resiliency
    • Strict consistency
    • Ensures data integrity
  • Cloud/Apps Integration
    • Multiprotocol
    • Universal access

I’ve been fortunate enough to have some hands-on experience with the Cohesity solution and can attest that these features (particularly things like storage efficiency and resiliency) aren’t just marketing. There are some other neat features, such as public cloud support for AWS and Azure, that are also worthy of further investigation.
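The variable-length dedupe piece is worth a closer look, because it’s what lets a platform keep finding duplicates even after data shifts by a few bytes. Here’s a minimal content-defined chunking sketch in Python. It illustrates the general technique only, not Cohesity’s implementation – their chunking, hashing and fingerprinting choices are their own.

```python
import hashlib
import os

def chunk_boundaries(data: bytes, window=16, mask=0x3FF, min_size=1024, max_size=8192):
    """Yield variable-length chunks; boundaries depend on content, not fixed offsets.

    A production system would use a proper rolling hash (e.g. Rabin fingerprints)
    rather than rehashing a small window at every byte - this just shows the idea.
    """
    start = 0
    for i in range(len(data)):
        size = i - start + 1
        if size < min_size:
            continue
        window_hash = int.from_bytes(
            hashlib.blake2b(data[i - window + 1:i + 1], digest_size=4).digest(), "big")
        if (window_hash & mask) == 0 or size >= max_size:
            yield data[start:i + 1]
            start = i + 1
    if start < len(data):
        yield data[start:]

def dedupe_ratio(data: bytes) -> float:
    """Logical bytes divided by unique (post-dedupe) bytes."""
    unique = {}
    for chunk in chunk_boundaries(data):
        unique.setdefault(hashlib.sha256(chunk).hexdigest(), len(chunk))
    return len(data) / sum(unique.values())

# Store the same 200 KB twice with a small edit in the middle: chunk boundaries
# resynchronise after the edit, so most of the repeated chunks dedupe away.
base = os.urandom(200_000)
print(f"dedupe ratio ~ {dedupe_ratio(base + b'a few inserted bytes' + base):.1f}:1")
```

The takeaway is that boundaries follow the content, so an insert near the start of a file doesn’t invalidate every chunk that follows it.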

 

Thoughts And Further Reading

There’s a lot to like about Cohesity’s approach to leveraging secondary storage in the data centre. For a very long time, the value of secondary storage hasn’t been at the forefront of enterprise analytics activities. Or, more bluntly put, copy data management has been something of an ongoing fiasco, with a number of different tools and groups within organisations being required to draw value from the data that’s just sitting there. Cohesity don’t like to position themselves simply as a storage target for data protection, because the DataPlatform is certainly capable of doing a lot more than that. While the messaging has occasionally been confusing, the drive of the company to deliver a comprehensive data management solution that extends beyond traditional solutions shouldn’t be underestimated. Coupled with a relentless focus on ease of use and scalability, the Cohesity offering looks to be a great way of digging in to the “dark data” in your organisation to make sense of what’s going on.

There are still situations where Cohesity may not be the right fit (at the moment), particularly if you have requirements around non-x86 workloads or particularly finicky (read: legacy) enterprise applications. That said, Cohesity are working tirelessly to add new features to the solution at a rapid pace, and are looking to close the gap between themselves and some of the more established players in the market. The value here, however, isn’t just in the extensive data protection capability, but also in the analytics that can be leveraged to provide further insight into your organisation’s protected data. It’s sometimes not immediately obvious why you need to be mining your unstructured data for information. But get yourself the right tools and the right people and you can discover a whole lot of very useful (and sometimes scary) information about your organisation that you wouldn’t otherwise know. And it’s that stuff that lies beneath the surface that can have a real impact on your organisation’s success. Even if it is a little boring.

WekaIO – Not The Matrix You’re Thinking Of

Disclaimer: I recently attended Storage Field Day 15.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

WekaIO recently presented at Storage Field Day 15. It’s not the first time I’ve heard from them, and you can read my initial thoughts on them here. You can see their videos from Storage Field Day 15 here, and download a PDF copy of my rough notes from here.

 

Enter The Matrix

Fine, I just rewatched The Matrix on the plane home. But any company with Matrix in the product name is going to get a few extra points from me. So what is it, Neo?

  • Fully coherent POSIX file system that delivers local file system performance;
  • Distributed Coding – more resilient at scale, fast rebuilds, end-to-end data protection;
  • Instantaneous snapshots, clones, tiering to S3, partial file rehydration;
  • InfiniBand or Ethernet, Hyper-converged or Dedicated Storage Server; and
  • Bare-metal, containerised, or running in a VM.

There’s an on-premises version and one built for public cloud use.

Liran Zvibel (Co-founder and CEO) took us through some of the key features of the architecture.

Software based for dynamic scalability

  • Software scales to thousands of nodes and trillions of records;
  • Significantly more scalable than any appliance offering; and
  • Metadata scales to thousands of servers.

Patented erasure coding technology

  • Allows us to use 66% less NVMe compared to triple replication;
  • Fully distributed data and metadata for best parallelism / performance; and
  • Snapshots for “free” with no performance impact.

Integrated tiering in a single namespace

  • Allows for unlimited namespace critical for deep learning; and
  • Enables backup and cloud bursting to public cloud.
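That “66% less NVMe” claim is easy to sanity-check. Assuming a hypothetical 16+2 data-to-parity split (the real stripe geometry is WekaIO’s, not mine), the comparison with triple replication looks like this.

```python
def raw_capacity_needed(usable_tb, data_shards, parity_shards):
    """Raw flash required when data is split into data + parity shards."""
    return usable_tb * (data_shards + parity_shards) / data_shards

usable = 100  # TB of capacity we want to present

triple_replication = usable * 3
erasure_coded = raw_capacity_needed(usable, data_shards=16, parity_shards=2)
saving = 1 - erasure_coded / triple_replication

print(f"3x replication: {triple_replication:.0f} TB raw")
print(f"16+2 erasure coding: {erasure_coded:.1f} TB raw")
print(f"NVMe saved vs triple replication: {saving:.0%}")
# 3x replication: 300 TB raw
# 16+2 erasure coding: 112.5 TB raw
# NVMe saved vs triple replication: 62%
```

A 16+2 split lands at around 62-63% less raw flash; widen the stripe and you close in on the quoted figure.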

 

I Know Kung Fu

[Look, I’m just going to torture the Matrix analogy for a little longer, so bear with me]. So what do I do with all of this performance in a storage subsystem? Well, the key focus areas for WekaIO include:

  • Machine learning / AI;
  • Digital Radiology / Pathology;
  • Algorithmic Trading; and
  • Genomic Sequencing and Analytics.

Most of these workloads deal with millions of files and very large capacities, and are very sensitive to poor latency. There’s also a cool use case for media and entertainment environments that’s worth checking out if you’re into that sort of thing.

 

Thoughts

WekaIO are aiming to do about 30% of their sales directly, meaning they lean heavily on the channel. Both HPE and Penguin Computing are OEM providers, and obviously there’s also a software-only play with the AWS version. They’re talking about delivering some very big numbers when it comes to performance, but my favourite thing about them is the focus on being able to access the same data through all interfaces, and quickly.

WekaIO make some strong claims about their ability to deliver a fast and scalable file system solution, but they certainly have the pedigree to deliver a solution that meets a number of those claims. There are some nice features, such as the ability to add servers with different profiles to the cluster, and running nodes in hyper-converged mode. When it comes down to it, performance is defined by the number of cores available. If you add more compute, you get more performance.

In my mind, the solution isn’t for everyone right now, but if you have a requirement for a performance-focused, massively parallel, scale-out storage solution with the ability to combine NVMe and S3, you could do worse than to check out what WekaIO can do.

StarWind VTL? What? Yes, And It’s Great!

Disclaimer: I recently attended Storage Field Day 15.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

StarWind recently presented at Storage Field Day 15. You can see videos of their presentation here, and download my rough notes from here.

 

VTL? Say What Now?

Max and Anton from StarWind are my favourites. If I was a professional analyst I wouldn’t have favourites, but I do. Anyone who calls their presentation “From Dusk Till Dawn” is alright in my books. Here’s a shot of Max presenting.

 

In The Beginning

The concept of sending recovery data to tape is not a new one. After all, tape was often referred to as “backup’s best friend”. Capacity-wise it’s always been cheap compared to disk, and it’s been a (relatively) reliable medium to work with. This was certainly the case in the late 90s when I got my start in IT. Since then, though, disks have come a long way in terms of capacity (and reduced cost). StorageTek introduced Virtual Tape Libraries (VTLs) in the late 90s and a lot of people moved to using disk storage for their backups. Tape still played a big part in this workflow, with a lot of people being excited about disk to disk to tape (D2D2T) architectures in the early 2000s. It was cool because it was a fast way to do backups (when it worked). StarWind call this the “dusk” of the VTL era.

 

Disks? Object? The Cloud? Heard Of Them?

According to StarWind though (and I have anecdotal evidence to support this), backup applications (early on) struggled to speak sensibly to disk. Since then, object storage has become much more popular. StarWind also suggested that it’s hard to do file or block to object effectively.

Tape (or a tape-like mechanism) for cold data is still a great option.  No matter how you slice it, tape is still a lot cheaper than disk. At least in terms of raw $/GB. It also offers:

  • Longevity;
  • The ability to be stored offline; and
  • Reasonably high streaming bandwidth.

Object storage is a key cloud technology. And object storage can deliver similar behaviour to tape, in that it is:

  • Non-blocking;
  • Capable of big IO; and
  • Not reliant on random writes.

From StarWind’s perspective, the “dawn” of VTL is back. The combination of cheap disk, mature object storage technology and newer backup software means that VTL can be a compelling option for businesses that still need a tape-like workflow. They offer a turnkey appliance, based on NL-SAS. It has 16 drives per appliance (in a 3.5” form factor), delivering roughly 120TB of capacity before deduplication. You can read more about it here.
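Since the pitch is essentially “tape-like workflow in front, object storage behind”, here’s a minimal sketch of shipping a sealed virtual tape image to an S3 bucket using boto3’s multipart-capable upload helper. It illustrates the general pattern only – it’s not StarWind’s implementation, and the bucket name and file path are placeholders.

```python
import boto3
from boto3.s3.transfer import TransferConfig

# Virtual tapes are large, sequential files - a good match for multipart uploads.
config = TransferConfig(
    multipart_threshold=64 * 1024 * 1024,   # switch to multipart above 64 MB
    multipart_chunksize=64 * 1024 * 1024,   # upload in 64 MB parts
    max_concurrency=8,
)

s3 = boto3.client("s3")
s3.upload_file(
    Filename="/vtl/tapes/TAPE0001.vtape",        # placeholder path to a sealed tape image
    Bucket="example-vtl-offload",                # placeholder bucket name
    Key="tapes/TAPE0001.vtape",
    ExtraArgs={"StorageClass": "DEEP_ARCHIVE"},  # cold data can land straight on an archive tier
    Config=config,
)
print("tape image offloaded to object storage")
```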

 

Thoughts And Conclusion

StarWind never fail to deliver an interesting presentation at Tech Field Day events. I confess I didn’t expect to be having a conversation with someone about their VTL offering. But I must also confess that I do come across customers in my day job who still need to leverage VTL technologies to ensure their data protection workflow continues to work. Why don’t they re-tool their data protection architecture to get with the times? I wish it were that simple. Sometimes the easiest part of modernising your data protection environment is simply replacing the hardware.

StarWind are not aiming to compete in enterprise environments, focusing more on the SMB market. There are some nice integration points with their existing product offerings. And the ability to get the VTL data to a public cloud offering will keep CxOs playing the “cloud at all cost” game happy as well.

[Image courtesy of StarWind]

 

There are a lot of reasons to get your data protected in as many locations as possible. StarWind has a good story here with the on-premises part of the equation. According to StarWind, VTL will remain around “until backup applications (all of them) learn all cloud and on-premises object storage APIs … or until all object storage settles on a single, unified “standard” API”. This looks like it might still be some time away. A lot of environments are still using technology from last decade to perform business-critical functions inside their companies. There’s no shame in delivering products that can satisfy that market segment. It would be nice if everyone would refactor their applications for cloud, but it’s simply not the case right now. StarWind understand this, and understand that VTL performs a useful function right now, particularly in environments where the advent of virtualisation might still be a recent event. I know people still using VTL in crusty mainframe environments and flashy, cloud-friendly, media and entertainment shops. Tape might be dead, but it feels like there are a lot of folks still using it, or its virtual counterpart.

Dropbox – It’s Scale Jim, But Not As We Know It

Disclaimer: I recently attended Storage Field Day 15.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Dropbox recently presented at Storage Field Day 15. You can see videos of their presentation here, and download my rough notes from here.

 

What’s That In Your Pocket?

James Cowling spent some time talking to us about Dropbox’s “Magic Pocket” system architecture. Despite the naff name, it’s a pretty cool bit of tech. Here’s a shot of James answering a question.

 

Magic Pocket

Dropbox uses Magic Pocket to store users’ file content:

  • 1+ EB of user file data currently stored
  • Growing at over 10PB per month

Customising the stack end-to-end allowed them to:

  • Improve performance and reliability for our unique use case
  • Improve economics

 

Inside the Magic Pocket

Brief history of development

  • Prototype and development
  • Production validation
    • Ran in dark phase to find any unknown bugs
    • Deleted first byte of data from third party cloud provider in February 2015
  • Scale out and cut over
    • 600,000+ disks
    • 3 regions in USA, expanding to EU
  • Migrated more than 500PB of user data from third party cloud provider into MP in 6 months

It’s worth watching the video to get a feel for the scale of the operation. You can also read more on the Magic Pocket here and here. Chan also did a nice write-up that you can access here.
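The headline numbers are easier to appreciate with a little arithmetic. This is my own back-of-the-envelope maths with an assumed drive size, not anything Dropbox presented.

```python
growth_pb_per_month = 10        # "growing at over 10PB per month"
assumed_drive_tb = 10           # assumption for illustration; no drive size was quoted

new_drives_per_month = growth_pb_per_month * 1000 / assumed_drive_tb
annual_growth_eb = growth_pb_per_month * 12 / 1000

print(f"{growth_pb_per_month} PB/month ~ {new_drives_per_month:,.0f} x {assumed_drive_tb} TB "
      f"drives a month, before any replication or erasure-coding overhead")
print(f"Annual growth ~ {annual_growth_eb:.2f} EB of user data")
# 10 PB/month ~ 1,000 x 10 TB drives a month, before any replication or erasure-coding overhead
# Annual growth ~ 0.12 EB of user data
```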

 

Beyond Public Cloud

A bit’s been made of Dropbox’s move from public cloud back to its own infrastructure, but Dropbox were careful to point out that they used third parties where it made sense for them, and still leveraged various public cloud and SaaS offerings as part of their daily operations. The key for them was understanding whether building their own solution made sense or not. To that end, they asked themselves three questions:

  • At what scale is investment in infrastructure cost effective?
  • Will this scale enable innovation by building custom services and integrating hardware / software more tightly?
  • Can that innovation add value for users?

From a scale perspective, it was fairly simple, with Dropbox being one of the oldest, largest and most used collaboration platforms around. From an integration perspective, they needed a lot of network and storage horsepower, which set them apart from some of the other web-scale services out there. They were able to add value to users through an optimised stack, increased reliability and better security.

 

It Makes Sense, But It’s Not For Everyone

That all sounds pretty good, but one of the key things to remember is that they haven’t just cobbled together a bunch of tin and a software stack and become web-scale overnight. While the time to production was short, all things considered, there was still investment (in terms of people, infrastructure and so forth) in making the platform work. When you commit to going your own way, you need to be mindful that there are a lot of ramifications involved, including the requirement to invest in people who know what they’re doing, the capacity to do what you need to do from a hardware perspective, and the right minds to come up with the platform to make it all work together. The last point is probably hardest for people to understand. I’ve ranted before about companies not being anywhere near the scale of Facebook, Google or the other hyperscalers and expecting that they can deliver similar services, for a similar price, with minimal investment.

Scale at this level is a hard thing to do well, and it takes investment in terms of time and resources to get it right. And to make that investment it has to make sense for your business. If your company’s main focus is putting nuts and bolts together on an assembly line, then maybe this kind of approach to IT infrastructure isn’t really warranted. I’m not suggesting that we can’t all learn something from the likes of Dropbox in terms of how to do cool infrastructure at scale. But I think the key takeaways should be that Dropbox have:

  • Been around for a while;
  • Put a lot of resources into solving the problems they faced; and
  • Spent a lot of time deciding what did and did not make sense to do themselves.

I must confess I was ignorant of the scale at which Dropbox is operating, possibly because I saw them as a collaboration piece and didn’t really think of them as an infrastructure platform company. The great thing, however, is they’re not just a platform company. In the same way that Netflix does a lot of really cool stuff with their tech, Dropbox understands that users value performance, reliability and security, and have focused their efforts on ensuring that the end user experience meets those requirements. The Dropbox backend infrastructure makes for a fascinating story, because the scale of their operations is simply not something we come across every day. But I think the real success for Dropbox is their relentless focus on making the end user experience a positive one.