Random Short Take #12

Here are a few links to some random news items and other content that I found interesting. You might find them interesting too. Maybe.

  • I’ve been a fan of Backblaze for some time now, and I find their blog posts useful. This one, entitled “A Workflow Playbook for Migrating Your Media Assets to a MAM”, was of particular interest to me.
  • Speaking of Backblaze, this article on SSDs and reliability should prove useful, particularly if you’re new to the technology. And the salty comments from various readers are great too.
  • Zerto just announced the myZerto Labs Program as a way for “IT professionals to test, understand and experiment with the IT Resilience Platform using virtual infrastructure”. You can sign up here.
  • If you’re in the area, I’m speaking at the Sydney VMUG UserCon on Tuesday 19th March. I’ll be covering how to “Build Your Personal Brand by Starting and Maintaining a Blog”. It’s more about blogging than branding, but I’m hoping there’s enough to keep the punters engaged. Details here. If you can’t get along to the event, I’ll likely publish the deck on this site in the near future.
  • The nice people at Axellio had some success at the US Air Force Pitch Day recently. You can read more about that here.
  • UltraViolet is going away. This kind of thing is disheartening (and a big reason why I persist in buying physical copies of things still).
  • I’m heading to Dell Technologies World this year. Michael was on the TV recently, talking about the journey and looking ahead. You can see more here.

StorPool And The Death of Hardware-Defined Storage

Disclaimer: I recently attended Storage Field Day 18.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

StorPool recently presented at Storage Field Day 18. You can see their videos from Storage Field Day 18 here, and download a PDF copy of my rough notes from here.



StorPool delivers block storage software. Fundamentally, it “pools the attached storage (hard disks or SSDs) of standard servers to create a single pool of shared block storage. The StorPool software is installed on each server in the cluster and combines the performance and capacity of all drives attached to the servers into one global namespace”. There’s a useful technical overview that you can read here.

[image courtesy of StorPool]

StorPool position themselves as a software company delivering scale-out, block storage software. They say they’ve been doing this since before SDS / SDN / SDDC and “marketing-defined storage” were popular terms. The idea is that it’s always delivered as a working storage solution on the customer’s hardware. There are a few ways that the solution can be used, including:

  1. As fully-managed software with 24/7/365 support, SLAs, and so on;
  2. On HCL-compatible hardware; or
  3. As a pre-integrated solution.

Data Integrity

The kind of data management features you’d expect from modern storage systems are present here as well, including:

  • Thin provisioning / reclaim;
  • Copy on Write snapshots, clones; and
  • Changed block tracking, incremental recovery, and transfer.
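The changed block tracking piece is conceptually simple: hash each fixed-size block of two snapshots and transfer only the blocks whose hashes differ. Here’s a minimal Python sketch of that idea — a generic illustration with a hypothetical 4KB block size, not StorPool’s actual protocol or on-disk format:

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, for illustration only

def block_hashes(volume):
    """Hash every fixed-size block of a volume image."""
    return [hashlib.sha256(volume[i:i + BLOCK_SIZE]).hexdigest()
            for i in range(0, len(volume), BLOCK_SIZE)]

def changed_blocks(old, new):
    """Indices of blocks that differ between two snapshots -- only these
    need to be sent for an incremental transfer."""
    old_h, new_h = block_hashes(old), block_hashes(new)
    return [i for i, h in enumerate(new_h)
            if i >= len(old_h) or old_h[i] != h]

# Two snapshots of a tiny "volume": only the second block changes.
snap1 = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE
snap2 = b"A" * BLOCK_SIZE + b"C" * BLOCK_SIZE
print(changed_blocks(snap1, snap2))  # [1]
```

In a real system the hashes for the previous snapshot would be persisted, so an incremental transfer never needs to re-read the remote copy.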

There’s also support for multi-site deployments:

  • Connect 2 or more StorPool clusters over public Internet; and
  • Send snapshots between clusters for backup and DR.

Developed from Scratch

One of the cool things about StorPool is that the whole thing has been developed from scratch. They use their own on-disk format, protocol, quorum, client, and so on. They’ve had systems running in production for 6+ years, as well as:

  • Numerous 1PB+ flash systems;
  • 17 major releases; and
  • Global customers.

Who Uses It?

So who uses StorPool? Their target customers are companies building private and public clouds, including:

  • Service Providers and folk operating public clouds; and
  • Enterprises and various private cloud implementations.

That’s obviously a fairly broad spectrum of potential customers, but I think that speaks somewhat to the potential versatility of software-defined solutions.


Thoughts and Further Reading

“Software-defined” storage solutions have become more and more popular in the last few years. Customers seem to be getting more comfortable with using and supporting their own hardware (up to a point), and vendors seem to be more willing to position these kinds of solutions as viable, production-ready platforms. It helps tremendously, in my opinion, that a lot of the heavy lifting previously done with dedicated silicon on traditional storage systems can now be done by a core on an x86 or ARM-based CPU. And there seem to be a lot more cores going around, giving vendors the option to do a lot more with these software-defined systems too.

There are a number of benefits to adopting software-defined solutions, including the ability to move from one hardware supplier to another without the need to dramatically change the operating environment. There’s a good story to be had in terms of updates too, and it’s no secret that people like that they aren’t tied to the vendor’s professional services arm to get installations done in quite the same way they perhaps were with dedicated storage arrays. It’s important to remember, though, that software isn’t magic. If you throw cruddy hardware at a solution like StorPool, it’s not going to somehow exceed the limitations of that hardware. You still need to give it some grunt to get some good performance in return. That said, there are plenty of examples where software-defined solutions can be improved dramatically through code optimisations, without changing hardware at all.

The point of all this is that, whilst I don’t really think hardware-defined storage solutions are going anywhere for the moment, companies like StorPool are certainly delivering compelling solutions in code that mean you don’t need to be constrained by what the big box storage vendors are selling you. StorPool have put some careful consideration into the features they offer with their platform, and have also focused heavily on the possible performance that could be achieved with the solution. There’s a good resilience story there, and it seems to be very service provider-friendly. Of course, everyone’s situation is different, and not everyone will get what they need from something like StorPool. But if you’re in the market for a distributed block storage system, and have a particular hankering to run it on your own, preferred, flavour of hardware, something like StorPool is certainly worthy of further investigation. If you want to dig in a little more, I recommend checking out the resources section on the StorPool website – it’s packed with useful information. And have a look at Ray’s article as well.

VMware – vExpert 2019


I’m very happy to have been listed as a vExpert for 2019. This is the seventh time that they’ve forgotten to delete my name from the list (if you think I’ll ever give up on that joke you are sadly mistaken). Read about it here, and more news about this year’s programme is coming shortly. Thanks again to Corey Romero and the rest of the VMware Social Media & Community Team for making this kind of thing happen. And thanks also to the vExpert community for being such a great community to be part of.

NetApp And The Space In Between

Disclaimer: I recently attended Storage Field Day 18.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

NetApp recently presented at Storage Field Day 18. You can see their videos from Storage Field Day 18 here, and download a PDF copy of my rough notes from here.


Bye, Dave

We were lucky enough to have Dave Hitz (now “Founder Emeritus” at NetApp) spend time with us on his last day in the office. I’ve only met him a few times but I’ve always enjoyed listening to his perspectives on what’s happening in the industry.

Cloud First?

In a previous life I worked in a government department architecting storage and virtualisation solutions for a variety of infrastructure scenarios. The idea, generally speaking, was that those solutions would solve particular business problems, or at least help to improve the processes to resolve those problems. At some point, probably late 2008 or early 2009, we started to talk about developing a “Cloud First” architecture policy, with the idea being that we would resolve to adopt cloud technologies where we could, and reduce our reliance on on-premises solutions as time passed. The beauty of working in enterprise environments is that things can take an awfully long time to happen, so that policy didn’t really come into effect until some years later.

So what does cloud first really mean? It’s possibly not as straightforward as having a “virtualisation first” policy. With the virtualisation first approach, there was a simple qualification process we undertook to determine whether a particular workload was suited to run on our virtualisation platform. This involved all the standard stuff, like funding requirements, security constraints, anticipated performance needs, and licensing concerns. We then pushed the workload one of two ways. With cloud though, there are a few more ways you can skin the cat, and it’s becoming more obvious to me that cloud means different things to different people. Some people want to push workloads to the cloud because they have a requirement to reduce their capital expenditure. Some people have to move to cloud because the CIO has determined that there needs to be a reduction in the workforce managing infrastructure activities. Some people go to cloud because they saw a cool demo at a technology conference. Some people go to cloud because their peers in another government department told them it would be easy to do. The common thread is that “people’s paths to the cloud can be so different”.

Can your workload even run in the cloud? Hitz gave us a great example of some stuff that just can’t (a printing press). The printing press needs to pump out jobs at a certain time of the day every day. It’s not going to necessarily benefit from elastic scalability for its compute workload. The workloads driving the presses would likely run a static workload.

Should it run in the cloud?

It’s a good question to ask. Most of the time, I’d say the answer is yes. This isn’t just because I work for a telco selling cloud products. There are a tonne of benefits to be had in running various, generic workloads in the cloud. Hitz suggests, though, that the “should it” question is a corporate strategy question, and I think he’s spot on. When you embed “cloud first” in your infrastructure architecture, you’re potentially impacting a bunch of stuff outside of infrastructure architecture, including financial models, workforce management, and corporate security postures. It doesn’t have to be a big deal, but it’s something that people sometimes don’t think about. And just because you start with that as your mantra, doesn’t mean you need to end up in cloud.

Does It Feel Cloudy?

Cloudy? It’s my opinion that NetApp’s cloud story is underrated. But, as Hitz noted, they’ve had the occasional misstep. When they first introduced Cloud ONTAP, Anthony Lye said it “didn’t smell like cloud”; instead, it felt “like a product for storage administrators”. Cloudy people don’t want that, and they don’t want to talk to storage administrators. Some cloudy people were formerly storage folks, and some have never had the misfortune of managing over-provisioned midrange arrays at scale. Cloud comes in all different flavours, but it’s clear that just shoving a traditional on-premises product on a public cloud provider’s infrastructure isn’t really as cloudy as we’d like to think.


Bridging The Gap

NetApp are focused now on “finding the space between the old and the new, and understanding that you’ll have both for a long time”. They’re not just working on cloud-only solutions, and they have no plans to ditch their on-premises offerings. Indeed, as Hitz noted in his presentation, “having good cloudy solutions will help them gain share in on-premises footprint”. It’s a good strategy, as the on-premises market will be around for some time to come (do you like how vague that is?). It’s been my belief for some time that companies, like NetApp, that can participate effectively in both the on-premises and cloud markets will be successful.


Thoughts and Further Reading

So why did I clumsily paraphrase a How To Destroy Angels song title and ramble on about the good old days of my career in this article instead of waxing lyrical about Charlotte Brooks’s presentation on NetApp Data Availability Services? I’m not exactly sure. I do recommend checking out Charlotte’s demo and presentation, because she’s really quite good at getting the message across, and NDAS looks pretty interesting.

Perhaps I spent the time focusing on the “cloud first” conversation because it was Dave Hitz, and it’s likely the last time I’ll see him presenting in this kind of forum. But whether it was Dave or not, conversations like this one are important, in my opinion. It often feels like we’re putting the technology ahead of the why. I’m a big fan of cloud first, but I’m an even bigger fan of people understanding the impact that their technology decisions can have on the business they’re working for. It’s nice to see a vendor who can comfortably operate on both sides of the equation having this kind of conversation, and I think it’s one that more businesses need to be having with their vendors and their internal staff.

Cohesity Is (Data)Locked In

Disclaimer: I recently attended Storage Field Day 18.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Cohesity recently presented at Storage Field Day 18. You can see their videos from Storage Field Day 18 here, and download a PDF copy of my rough notes from here.


The Cohesity Difference?

Cohesity covered a number of different topics in its presentation, and I thought I’d outline some of the Cohesity features before I jump into the meat and potatoes of my article. Some of the key things you get with Cohesity are:

  • Global space efficiency;
  • Data mobility;
  • Data resiliency & compliance;
  • Instant mass restore; and
  • Apps integration.

I’m going to cover 3 of the 5 here, and you can check the videos for details of the Cohesity MarketPlace and the Instant Mass Restore demonstration.

Global Space Efficiency

One of the big selling points for the Cohesity data platform is the ability to deliver data reduction and small file optimisation.

  • Global deduplication
    • Modes: inline, post-process
  • Archive to cloud is also deduplicated
  • Compression
    • Zstandard algorithm (read more about that here)
  • Small file optimisation
    • Better performance for reads and writes
    • Benefits from deduplication and compression
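To make the global deduplication idea concrete, here’s a toy content-addressable chunk store in Python. It’s purely illustrative — fixed-size chunks and an in-memory dict — and shouldn’t be read as Cohesity’s actual chunking strategy or hash choices, which aren’t public in this detail:

```python
import hashlib

class DedupStore:
    """Toy content-addressable chunk store: each unique chunk is kept once,
    and files are just 'recipes' of chunk hashes."""
    def __init__(self):
        self.chunks = {}  # sha256 digest -> chunk bytes

    def write(self, data, chunk_size=8):
        """Split data into chunks, store only unique ones, return the recipe."""
        recipe = []
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # dedup: keep first copy only
            recipe.append(digest)
        return recipe

    def read(self, recipe):
        """Reassemble a file from its chunk recipe."""
        return b"".join(self.chunks[d] for d in recipe)

store = DedupStore()
r1 = store.write(b"AAAAAAAABBBBBBBB")  # chunks: AAAAAAAA, BBBBBBBB
r2 = store.write(b"AAAAAAAACCCCCCCC")  # AAAAAAAA is already stored
print(len(store.chunks))               # 3 unique chunks, not 4
print(store.read(r2))                  # b'AAAAAAAACCCCCCCC'
```

The same mechanism is why archives to cloud stay deduplicated: the recipe travels with the data, and chunks already present at the target don’t need to be sent again.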

Data Mobility

There’s also an excellent story when it comes to data mobility, with the platform delivering the following data mobility features:

  • Data portability across clouds
  • Multi-cloud replication and archival (1:many)
  • Integrated indexing and search across locations

You also get simultaneous, multi-protocol access and a comprehensive set of file permissions to work with.


But What About Archives And Stuff?

Okay, so all of that stuff is really cool, and I could stop there and you’d probably be happy enough that Cohesity delivers the goods when it comes to a secondary storage platform that delivers a variety of features. In my opinion, though, it gets a lot more interesting when you have a look at some of the archival features that are built into the platform.

Flexible Archive Solutions

  • Archive either on-premises or to cloud;
  • Policy-driven archival schedules for long-term data retention;
  • Data can be retrieved to the same or a different Cohesity cluster; and
  • Archived data is subject to further deduplication.

Data Resiliency and Compliance

The platform ensures data integrity via:

  • Erasure coding;
  • Highly available; and
  • DataLock and legal hold.

Achieving Compliance with File-level DataLock

In my opinion, DataLock is where it gets interesting in terms of archive compliance.

  • DataLock enables WORM functionality at a file level;
  • DataLock adheres to regulatory acts;
  • Can automatically lock a file after a period of inactivity;
  • Files can be locked manually by setting file attributes;
  • Minimum and maximum retention times can be set; and
  • Cohesity provides a unique RBAC role for Data Security administration.
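The “lock a file by setting file attributes” idea can be sketched with standard POSIX permission bits. This is purely illustrative: a real platform-level WORM lock is enforced by the storage system itself (so it can’t simply be chmod-ed away), and the retention sidecar file here is a made-up device, not anything Cohesity actually does:

```python
import os
import stat
import tempfile
import time

def worm_lock(path, retain_seconds):
    """Strip write permissions and record a retention deadline.
    Illustrative only: a real WORM lock can't be undone by chmod."""
    os.chmod(path, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)  # r--r--r--
    with open(path + ".retain", "w") as f:
        f.write(str(time.time() + retain_seconds))

def is_locked(path):
    """A file stays 'locked' until its retention deadline passes."""
    with open(path + ".retain") as f:
        return time.time() < float(f.read())

fd, path = tempfile.mkstemp()
os.close(fd)
worm_lock(path, retain_seconds=3600)
print(oct(stat.S_IMODE(os.stat(path).st_mode)))  # 0o444
print(is_locked(path))                           # True
```

The minimum/maximum retention settings mentioned above map onto exactly this kind of deadline check, just enforced below the filesystem API rather than beside it.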

DataLock on Backups

  • DataLock enables WORM functionality;
  • Prevent changes by locking Snapshots;
  • Applied via backup policy; and
  • Operations performed by Data Security administrators.


Ransomware Detection

Cohesity also recently announced the ability to look within Helios for ransomware. The approach taken is as follows: Prevent. Detect. Respond.


There’s some good stuff built into the platform to help prevent ransomware in the first place, including:

  • Immutable file system;
  • DataLock (WORM); and
  • Multi-factor authentication.

Detection is machine-driven:

  • Machine-driven anomaly detection (backup data, unstructured data); and
  • Automated alerts.

And the response side leans on the platform’s scale:

  • Scalable file system to store years’ worth of backup copies;
  • Google-like global actionable search; and
  • Instant mass restore.
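The anomaly detection piece is easy to picture: baseline the daily changed-data rate and flag sharp deviations, since mass encryption by ransomware tends to make almost every block look changed. This Python sketch (with made-up numbers) is a crude statistical stand-in for whatever model Helios actually uses:

```python
from statistics import mean, stdev

def is_anomalous(history, today, n_sigma=3.0):
    """Flag today's changed-data volume (GB) if it deviates sharply
    from the historical daily baseline."""
    mu, sigma = mean(history), stdev(history)
    return today > mu + n_sigma * sigma

baseline = [9.5, 10.2, 10.0, 9.8, 10.5, 10.1, 9.9]  # typical daily GB changed
print(is_anomalous(baseline, 10.4))   # False -- normal variation
print(is_anomalous(baseline, 250.0))  # True -- possible mass encryption
```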


Thoughts and Further Reading

The conversation with Cohesity got a little spirited in places at Storage Field Day 18. This isn’t unusual, as Cohesity has had some problems in the past with various folks not getting what they’re on about. Is it data protection? Is it scale-out NAS? Is it an analytics platform? There’s a lot going on here, and plenty of people (both inside and outside Cohesity) have had a chop at articulating the real value of the solution. I’m not here to tell you what it is or isn’t. I do know that a lot of the cool stuff with Cohesity wasn’t readily apparent to me until I actually had some stick time with the platform and had a chance to see some of its key features in action.

The DataLock / Security and Compliance piece is interesting to me though. I’m continually asking vendors what they’re doing in terms of archive platforms. A lot of them look at me like I’m high. Why wouldn’t you just use software to dump your old files up to the cloud or onto some cheap and deep storage in your data centre? After all, aren’t we all using software-defined data centres now? That’s certainly an option, but what happens when that data gets zapped? What if the storage platform you’re using, or the software you’re using to store the archive data, goes bad and deletes the data you’re managing with it? Features such as DataLock can help with protecting you from some really bad things happening.

I don’t believe that data protection data should be treated as an “archive” as such, although I think that data protection platform vendors such as Cohesity are well placed to deliver “archive-like” solutions for enterprises that need to retain protection data for long periods of time. I still think that pushing archive data to another, dedicated, tier is a better option than simply calling old protection data “archival”. Given Cohesity’s NAS capabilities, it makes sense that they’d be an attractive storage target for dedicated archive software solutions.

I like what Cohesity have delivered to date in terms of a platform that can be used to deliver data insights to derive value for the business. I think sometimes the message is a little muddled, but in my opinion some of that is because everyone’s looking for something different from these kinds of platforms. And these kinds of platforms can do an awful lot of things nowadays, thanks in part to some pretty smart software and some grunty hardware. You can read some more about Cohesity’s Security and Compliance story here,  and there’s a fascinating (if a little dated) report from Cohasset Associates on Cohesity’s compliance capabilities that you can access here. My good friend Keith Townsend also provided some thoughts on Cohesity that you can read here.

Storage Field Day 18 – (Fairly) Full Disclosure

Disclaimer: I recently attended Storage Field Day 18.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Here are my notes on gifts, etc, that I received as a delegate at Storage Field Day 18. I’d like to point out that I’m not trying to play companies off against each other. I don’t have feelings one way or another about receiving gifts at these events (although I generally prefer small things I can fit in my suitcase). Rather, I’m just trying to make it clear what I received during this event to ensure that we’re all on the same page as far as what I’m being influenced by. Some presenters didn’t provide any gifts as part of their session – which is totally fine. I’m going to do this in chronological order, as that was the easiest way for me to take notes during the week. Whilst every delegate’s situation is different, my employer paid for 4 days and I took 1 day of unpaid leave to be at this event.



My wife kindly drove me to the domestic airport. I have some status with Qantas so I get a little bit of special treatment and lounge access. In Sydney I had some cheese, sausage and crackers and a Coopers Original Pale Ale in the lounge. I flew BNE – SYD – SFO in economy class, and this was covered by Gestalt IT.



Pavilion Data were kind enough to host me at the Warriors – Rockets game. We started at the Westin SFO with a few Lagunitas Pils beers before the game. We then caught a ride-sharing service to Oakland Arena for the game. At the game we had a few Firestone 805 beers. After the game we took a car back to the Westin for a few more beers. I also had a Kobe beef burger (with American Kobe beef, caramelized onion, mushroom, cheese, and ginger gastrique sauce) at the hotel bar. I then shared a car back to my accommodation in Menlo Park. This was all covered by Pavilion Data. It was a great night, notwithstanding the Warriors losing the game.



On Monday morning I met with Pavilion Data at their offices. They kindly fed me some lunch from Chipotle. I had a taco and some rice and beans. I also had a chocolate croissant and a cup of Keurig pod coffee.



I had lunch with Georgiana Comsa of Silicon Valley PR and two of my friends at Refuge in Menlo Park. I had the Western Cheesesteak sandwich (with bacon, fried onions, provolone cheese, BBQ sauce, cilantro, and ranch dressing), and a litre of Radeberger Pils. I paid for Georgiana’s and my meals, and my friends paid for theirs. Georgiana covered the tip.

The Field Day delegates all met at our hotel on Tuesday afternoon. I had one Sam Adams Lager at the bar before dinner. I had two more Sam Adams at dinner, as well as a variety of dishes served “family style”, including:

  • Brussels with shaved brussels sprouts, pecorino, lemon, almonds, and cracked pepper;
  • Agnolotti with english peas, heirloom fingerlings, and fennel sofrito;
  • Bavette with 100% grassfed beef from marin sun farms, unfiltered olive oil, and fleur de sel;
  • Chicken with pasture raised marin sun farms chicken, green garlic, kamut, pickled mushrooms; and
  • Gelato – asian pear sorbet, parsnip gelato.

The food was great, and the parsnip gelato tasted surprisingly good. I received a lovely Kuala Lumpur panoramic photo book as part of the Yankee Gift Swap from Chin-Fah Heoh. Stephen also gave us a 3D-printed SFD18 souvenir.

I then had two Sam Adams beers at the bar after dinner.



Breakfast at the hotel was scrambled eggs, super crispy bacon, potato, and coffee. I picked up an ExploreVM sticker, a (now collectible) Greybeards On Storage sticker, and some TechUnplugged Podcast stickers at our pre-event meeting. We were all given a Gestalt IT clear bag with a few bits and pieces in it.

WekaIO gave each delegate a WekaIO-branded Qi charger, WekaIO-branded notepad with built-in 4GB USB drive, and a WekaIO sticker.

For lunch we ate at the hotel. I had a burger with beef, tomato, lettuce, cheese, mayo, pickles, sesame seed bun, and some coffee-flavoured mousse for dessert.

VAST Data gave me a sticker and t-shirt. StorPool gave me a t-shirt, sticker, notebook, screwdriver multi-tool, and carry bag. After the day’s sessions ended, we had a mixer (including dinner) at Olla Cocina in San Jose. I had a Carne Asada slider, some chips and guacamole, empanadas, and prawn skewers. I also had a few tacos (al pastor, carne asada, and chicken) as well as a few Modelo Especial beers. Back at the hotel I had a few more Sam Adams beers.



We had breakfast at Western Digital. I had sausage, scrambled eggs, bacon, fruit, juice, and coffee. WD also kindly provided each delegate with a 1TB WD Black SN750 drive. We had lunch at NetApp with Dave Hitz. I had a San Pellegrino sparkling water, herb baked salmon, braised beef short rib stew, creamy mashed potatoes, and salad (Lolla Rossa lettuce & cucumber, balsamic vinaigrette). It was delicious.

We went to dinner at Faultline Brewing. I had a few Kolsch beers, along with:

  • Chips ‘N’ Onion Dip – House made kettle chips caramelized onion-stout dip;
  • Crispy Calamari – Fried green beans / fried onion strings / chipotle aioli; and
  • Smoked Sausage Sandwich – Pale ale beer brat / Swiss cheese / sauerkraut / beer mustard / pretzel-hoagie roll / kettle chips.

Faultline never disappoints. I had a Sam Adams at the hotel bar afterwards and retired early to try and get some decent sleep.



We had breakfast at the hotel. This was scrambled eggs, super crispy bacon, potato, and coffee. I should have had the fruit, but there you go. Cohesity gave us a nice warm jacket with custom embroidery, and some chocolate.

I had a few coffees during the Cohesity session. We also had lunch at Cohesity. This was a variety of Mediterranean food from Dish and Dash.

After the last session ended on Friday, we all went to Georgiana Comsa’s house for a garden party. I had a few Firestone 805 beers and got beaten at H-O-R-S-E. I then took a car to SFO (paid for by Gestalt IT) and flew back to BNE. It was a great week. Thanks again to Tech Field Day for having me, thanks to the other delegates for being super nice and smart, and thanks to the presenters for some really educational and engaging sessions.

Storage Field Day 18 – Day 0

Disclaimer: I recently attended Storage Field Day 18.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

This is just a quick post to share some thoughts on day zero at Storage Field Day 18. Thanks again to Stephen and the team at Gestalt IT for having me back, for making sure everything is running according to plan, and for just being really very decent people. I’ve really enjoyed catching up with the people I’ve met before and meeting the new delegates. Look out for some posts related to the Tech Field Day sessions in the next few weeks. And if you’re in a useful timezone, check out the live streams from the event here, or the recordings afterwards.

Here’s the rough schedule for the next three days (all times are ‘Merican Pacific).

*Updated schedule below – Datera have moved from 8am Pacific to 1:30pm on Friday 1st.*

Wednesday, Feb 27 09:30-11:30 WekaIO Presents at Storage Field Day 18
Wednesday, Feb 27 13:00-15:00 VAST Data Presents at Storage Field Day 18
Wednesday, Feb 27 15:30-17:30 StorPool Presents at Storage Field Day 18
Thursday, Feb 28 09:00-12:00 Western Digital Presents at Storage Field Day 18
Thursday, Feb 28 13:00-15:00 NetApp Presents at Storage Field Day 18
Thursday, Feb 28 16:30-18:00 IBM Storage Presents at Storage Field Day 18
Friday, Mar 1 10:00-12:00 Cohesity Presents at Storage Field Day 18
Friday, Mar 1 13:30-14:30 Datera Presents at Storage Field Day 18

You could also follow along with the livestream here.

Brisbane VMUG – March 2019


The March 2019 edition of the Brisbane VMUG meeting will be held on Tuesday 26th March at Fishburners from 4pm – 6pm. It’s sponsored by Dell and promises to be a great afternoon.

Here’s the agenda:

  • VMUG Intro (by me)
  • VMware / Dell Presentation: Dell Factory Provisioning for Windows 10 with Workspace ONE (Pete Lindley)
  • VMware Presentation: VMware Education update and roadmap discussion (Shamus Hayes)
  • Q&A
  • Refreshments and drinks.

Dell have gone to great lengths to make sure this will be a fun and informative session and I’m really looking forward to hearing about some of the cool stuff you can do with Workspace ONE. You can find out more information and register for the event here. I hope to see you there. Also, if you’re interested in sponsoring one of these events, please get in touch with me and I can help make it happen.

Veeam Vanguard 2019

I was very pleased to get an email from Rick Vanover yesterday letting me know I was accepted as part of the Veeam Vanguard Program for 2019. This is my first time as part of this program, but I’m really looking forward to participating in it. Big shout out to Dilupa Ranatunga and Anthony Spiteri for nominating me in the first place, and for Rick and the team for having me as part of the program. Also, (and I’m getting a bit parochial here) special mention of the three other Queenslanders in the program (Rhys Hammond, Nathan Oldfield, and Chris Gecks). There’s going to be a lot of cool stuff happening with Veeam and in data protection generally this year and I can’t wait to get started. More soon.

Komprise Continues To Gain Momentum

I first encountered Komprise at Storage Field Day 17, and was impressed by the offering. I recently had the opportunity to take a briefing with Krishna Subramanian, President and COO at Komprise, and thought I’d share some of my notes here.




The primary reason for our call was to discuss Komprise’s Series C funding round of US $24 million. You can read the press release here. Some noteworthy achievements include:

  • Revenue more than doubled every single quarter, with existing customers steadily growing how much they manage with Komprise; and
  • Some customers now managing hundreds of PB with Komprise.


Key Verticals

Komprise are currently operating in the following key verticals:

  • Genomics and health care, with rapidly growing footprints;
  • Financial and Insurance sectors (5 out of 10 of the largest insurance companies in the world apparently use Komprise);
  • A lot of universities (research-heavy environments); and
  • Media and entertainment.


What’s It Do Again?

Komprise manages unstructured data over three key protocols (NFS, SMB, S3). You can read more about the product itself here, but some of the key features include the ability to “Transparently archive data”, as well as being able to put a copy of your data in another location (the cloud, for example).
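The “transparently archive data” idea can be illustrated with a plain symlink: move a cold file to a cheaper tier and leave a link at the original path, so applications are none the wiser. Komprise uses its own dynamic link mechanism rather than this; the Python sketch below is just the general shape of the technique:

```python
import os
import shutil
import tempfile

def archive_with_link(path, archive_dir):
    """Move a cold file to an archive tier and leave a symlink at the
    original path, so applications keep reading it transparently."""
    os.makedirs(archive_dir, exist_ok=True)
    target = os.path.join(archive_dir, os.path.basename(path))
    shutil.move(path, target)
    os.symlink(target, path)  # the original path still resolves
    return target

work = tempfile.mkdtemp()
src = os.path.join(work, "report.csv")
with open(src, "w") as f:
    f.write("a,b\n1,2\n")
archive_with_link(src, os.path.join(work, "archive"))
print(os.path.islink(src))  # True
print(open(src).read())     # original content, read through the link
```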


So What’s New?

One of Komprise’s recent announcements was NAS-to-NAS migration. If, for example, you’d like to migrate your data from an Isilon environment to FlashBlade, all you have to do is set one as the source and the other as the target. ACLs are fully preserved across all scenarios, and Komprise does all the heavy lifting in the background.

They’re also working on what they call “Deep Analytics”. Komprise already aggregates file analytics data very efficiently. They’re now working on indexing metadata on files and exposing that index. This will give you “a Google-like search on all your data, no matter where it sits”. The idea is that you can find data using any combination of metadata. The feature is in beta right now, and part of the new funding is being used to expand and grow this capability.
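A “find data using any combination of metadata” query is essentially an inverted index over file attributes. Here’s a toy Python version — my illustration, not Komprise’s engine — where each metadata key/value pair maps to the set of files carrying it, and a search intersects those sets:

```python
from collections import defaultdict

class MetadataIndex:
    """Toy inverted index over file metadata."""
    def __init__(self):
        self.index = defaultdict(set)  # (key, value) -> set of file paths

    def add(self, path, **meta):
        for key, value in meta.items():
            self.index[(key, value)].add(path)

    def search(self, **criteria):
        """Return paths matching ALL of the given attributes."""
        sets = [self.index[(k, v)] for k, v in criteria.items()]
        return set.intersection(*sets) if sets else set()

idx = MetadataIndex()
idx.add("/data/genome1.bam", owner="lab-a", kind="bam", year=2018)
idx.add("/data/genome2.bam", owner="lab-b", kind="bam", year=2018)
print(idx.search(kind="bam", year=2018))      # both files
print(idx.search(kind="bam", owner="lab-a"))  # just genome1
```

Scaling that index across billions of files and multiple locations is the hard part, which is presumably where the new funding comes in.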


Other Things?

Komprise can be driven entirely from an API, making it potentially interesting for service providers and VARs wanting to add support for unstructured data and associated offerings to their solutions. You can also use Komprise to “confine” data. The idea is that data can be quarantined if you’re not sure whether any applications are still using it. Using this feature you can perform staged deletions of data once you understand what applications are using what data (and when).



I don’t often write articles about companies getting additional funding. I’m always very happy when they do, as someone thinks they’re on the right track, and it means that people will continue to stay employed. I thought this was interesting enough news to cover though, given that unstructured data, and its growth and management challenges, is an area I’m interested in.

When I first wrote about Komprise I joked that I needed something like this for my garage. I think it’s still a valid assertion in a way. The enterprise, at least in the unstructured file space, is a mess, based on what I’ve seen in the wild. Users and administrators continue to struggle with the sheer volume and size of the data they have under their management. Tools such as this can provide valuable insights into what data is being used in your organisation, and, perhaps more importantly, who is using it. My favourite part is that you can actually do something with this knowledge, using Komprise to copy, migrate, or archive old (and new) data to other locations to potentially reduce the load on your primary storage.

I bang on all the time about the importance of archiving solutions in the enterprise, particularly when companies have petabytes of data under their purview. Yet, for reasons that I can’t fully comprehend, a number of enterprises continue to ignore the problem they have with data hoarding, instead opting to fill their DCs and cloud storage with old data that they don’t use (and very likely don’t need to store). Some of this is due to the fact that some of the traditional archive solution vendors have moved on to other focus areas. And some of it is likely due to the fact that archiving can be complicated if you can’t get the business to agree to stick to their own policies for document management. In just the same way as you can safely delete certain financial information after a set amount of time has elapsed, so too can you do this with your corporate data. Or, at the very least, you can choose to store it on infrastructure that doesn’t cost a premium to maintain. I’m not saying “Go to work and delete old stuff”. But, you know, think about what you’re doing with all of that stuff. And if there’s no value in keeping the “kitchen cleaning roster May 2012.xls” file any more, think about deleting it. Or consider a solution like Komprise to help you make some of those tough decisions.