Random Short Take #8

Here are a few links to some news items and other content that might be useful. Maybe.

NetApp Announces NetApp ONTAP AI

As a member of NetApp United, I had the opportunity to sit in on a briefing from Mike McNamara about NetApp's recently announced AI offering, the snappily named "NetApp ONTAP AI". I thought I'd provide a brief overview here and share some thoughts.


The Announcement

So what is NetApp ONTAP AI? It's a "proven" architecture delivered via NetApp's channel partners, comprising compute, storage and networking, with storage delivered over NFS. The idea is that you can start small and scale out as required.



  • NVIDIA GPU Cloud Deep Learning Stack
  • NetApp ONTAP 9
  • Trident, a dynamic storage provisioner


  • Single point of contact support
  • Proven support model


[image courtesy of NetApp]
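One piece worth a closer look is Trident, which dynamically provisions ONTAP-backed NFS volumes for containerised workloads like the NVIDIA GPU Cloud stack. As a rough sketch only (the provisioner string, backend type and volume size below are my assumptions, not taken from the reference architecture), the Kubernetes objects involved might look like this, built in Python for illustration:

```python
import json

# Hypothetical manifests showing how Trident-style dynamic provisioning works:
# a StorageClass names the provisioner, and a PVC requests a volume from it.
# The provisioner string and parameters are illustrative assumptions.
storage_class = {
    "apiVersion": "storage.k8s.io/v1",
    "kind": "StorageClass",
    "metadata": {"name": "ontap-ai-nfs"},
    "provisioner": "csi.trident.netapp.io",   # Trident's CSI provisioner name (assumed)
    "parameters": {"backendType": "ontap-nas"},
}

pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "training-dataset"},
    "spec": {
        "accessModes": ["ReadWriteMany"],      # NFS lets many GPU nodes share the data
        "resources": {"requests": {"storage": "10Ti"}},
        "storageClassName": "ontap-ai-nfs",
    },
}

print(json.dumps([storage_class, pvc], indent=2))
```

Applied via kubectl, a claim like this would have Trident carve out an NFS volume on the ONTAP cluster without a storage admin in the loop, which is a big part of the "start small and scale out" story.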


Thoughts and Further Reading

I've written about NetApp's Edge to Core to Cloud story before, and this offering certainly builds on the work they've done with big data and machine learning solutions. Artificial Intelligence (AI) and Machine Learning (ML) solutions are like big data from five years ago, or public cloud. You can't go to any industry event, or take a briefing from an infrastructure vendor, without hearing all about how they're delivering solutions focused on AI. What you do with the gear once you've bought one of these spectacularly ugly boxes is up to you, obviously, and I don't want to get into whether some of these solutions are really "AI" or not (hint: they're usually not). While the vendors are gushing breathlessly about how AI will conquer the world, if you tone down the hyperbole a bit, there are still some fascinating problems being solved with these kinds of solutions.

I don’t think that every business, right now, will benefit from an AI strategy. As much as the vendors would like to have you buy one of everything, these kinds of solutions are very good at doing particular tasks, most of which are probably not in your core remit. That’s not to say that you won’t benefit in the very near future from some of the research and development being done in this area. And it’s for this reason that I think architectures like this one, and those from NetApp’s competitors, are contributing something significant to the ongoing advancement of these fields.

I also like that this is delivered via channel partners. It indicates, at least at first glance, that AI-focused solutions aren't simply something you can slap a SKU on and sell hundreds of. Partners generally have a better breadth of experience across the various hardware, software and services elements and their respective constraints, and will often be in a better position to spend time understanding the problem at hand rather than treating everything as the same problem with one solution. There's also less chance that the partner's sales people will have performance accelerators tied to selling one particular line of products. This can be useful when trying to solve problems that are spread across multiple disciplines and business units.

The folks at NVIDIA have made a lot of noise in the AI / ML marketplace lately, and with good reason. They know how to put together blazingly fast systems. I'll be interested to see how this architecture goes in the marketplace, and whether customers are primarily from the NetApp side of the fence, from the NVIDIA side, or perhaps both. You can grab a copy of the solution brief here, and there's an AI white paper you can download from here. The real meat and potatoes, though, is the reference architecture document itself, which you can find here.

Come And Splash Around In NetApp’s Data Lake

Disclaimer: I recently attended Storage Field Day 15.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

NetApp recently presented at Storage Field Day 15. You can see videos of their presentation here, and download my rough notes from here.


You Say Day-ta, I Say Dar-ta

Santosh Rao (Senior Technical Director, Workloads and Ecosystems) took us through some of the early big data platform challenges NetApp are looking to address.


Early Generation Big Data Analytics Platform

These were designed to deliver initial analytics solutions and were:

  • Implemented as a proof of concept; and
  • Solved a point project need.

The primary considerations of these solutions were usually cost and agility: the focus was on limiting up-front costs and getting the system operational quickly, while scalability, availability, and governance were afterthoughts.

A typical approach to this was to use cloud or commodity infrastructure. This ended up becoming the final architecture. The problem with this approach, according to NetApp, is that it led to unpredictable behaviour as copies manifested. You'd end up with 3-5 replicas of data copied across lines of business and various functions. Not a great situation.


Early Generation Analytics Platform Challenges

Other challenges with this architecture included:

  • Unpredictable performance;
  • Inefficient storage utilisation;
  • Media and node failures;
  • Total cost of ownership;
  • Not enterprise ready; and
  • Storage and compute tied (creating imbalance).


Next Generation Data Pipeline

So what do we really need from a data pipeline? According to NetApp, the key is “Unified Insights across LoBs and Functions”. By this they mean:

  • A unified enterprise data lake;
  • Federated data sources across the 2nd and 3rd platforms;
  • In-place access to the data pipeline (copy avoidance);
  • Spanned across edge, core and cloud; and
  • Future proofed to allow shifts in architecture.

Another key consideration is the deployment. The first proof of concept is performed by the business unit, but it needs to scale for production use:

  • Scale edge, core and cloud as a single pipeline
  • Predictable availability
  • Governance, data protection, security on data pipeline

This provides for a lower TCO over the life of the solution.


Data Pipeline Requirements

We’re not just playing in the core any more, or exclusively in the cloud. This stuff is everywhere. And everywhere you look the requirements differ as well.


Edge:

  • Massive data (few TB/device/day)
  • Real-time Edge Analytics / AI
  • Ultra Low Latency
  • Network Bandwidth
  • Smart Data Movement

Core:

  • Ultra high IO bandwidth (20 – 200+ GBps)
  • Ultra-low latency (micro – nanosecond)
  • Linear scale (1 – 128 node AI)
  • Overall TCO for 1-100+ PB

Cloud:

  • Cloud analytics, AI/DL/ML
  • Consume and not operate
  • Cloud vendor vs on-premises stack
  • Cost-effective archive
  • Need to avoid cloud lock-in

Here's a picture of what the data pipeline looks like for NetApp.

[Image courtesy of NetApp]


NetApp provided the following overview of what the data pipeline looks like for AI / Deep Learning environments. You can read more about that here.

[Image courtesy of NetApp]


What Does It All Mean?

NetApp have a lot of tools at their disposal, and a comprehensive vision for meeting the requirements of big data, AI and deep learning workloads from a number of different angles. It's not just about performance, it's about understanding where the data needs to be to be considered useful to the business. I think there's a good story to tell here with NetApp's Data Fabric, but it felt a little like there remains some integration work to do. Big data, AI and deep learning mean different things to different people, and there's sometimes a reluctance to change the way people do things for the sake of adopting a new product. NetApp's biggest challenge will be demonstrating the additional value they bring to the table, and the other ways in which they can help enterprises succeed.

NetApp, like some of the other Tier 1 storage vendors, has a broad portfolio of products at its disposal. The Data Fabric play is a big bet on being able to tie this all together in a way that their competitors haven’t managed to do yet. Ultimately, the success of this strategy will rely on NetApp’s ability to listen to customers and continue to meet their needs. As a few companies have found out the hard way, it doesn’t matter how cool you think your idea is, or how technically innovative it is, if you’re not delivering results for the business you’re going to struggle to gain traction in the market. At this stage I think NetApp are in a good place, and hopefully they can stay there by continuing to listen to their existing (and potentially new) customers.

For an alternative perspective, I recommend reading Chin-Fah’s thoughts from Storage Field Day 15 here.

NetApp United – Happy To Be Part Of The Team


NetApp recently announced the 2018 list of NetApp United members and I'm happy to see I'm on the list. If you're not familiar with the NetApp United program, it's "NetApp's global influencer program. Founded in 2017, NetApp United is a community of 140+ individuals united by their passion for great technology and the desire to share their expertise with the world". One of the nice things about it is the focus on inclusiveness, community and knowledge sharing. I'm doing a lot more with NetApp than I have in the past and I'm really looking forward to the year ahead from both the NetApp United perspective and the broader NetApp view. You can read the announcement here and view the list of members. Chan also did a nice little write-up you can read here. And while United isn't a football team, I will leave you with this great quote from the inimitable Eric Cantona – "When the seagulls follow the trawler, it is because they think sardines will be thrown into the sea".

Brisbane VMUG – November 2017



The November edition of the Brisbane VMUG meeting will be held on Thursday 30th November at the Toobirds Bistro and Bar (127 Creek Street, Brisbane) from 4 – 6:30pm. It’s sponsored by HyTrust and NetApp and promises to be a great afternoon.

Here’s the jam-packed agenda:

  • Refreshments and drinks
  • VMUG Intro (by me)
  • VMware Presentation: vRealize Lifecycle Manager (Michael Francis, VCDX #42)
  • HyTrust Presentation: Regain Control of the Cloud (Kevin Middleton)
  • NetApp Presentation: What’s new @ NetApp? SolidFire, HCI & an irrational love of VVols
  • Skills and Career Progression (Claire O’Dwyer, Sydney VMUG Leader and Recruitment Specialist with FTS Resourcing)
  • Q&A
  • Refreshments and drinks

HyTrust and NetApp have gone to great lengths to make sure this will be a fun and informative session. I’m really looking forward to hearing about HyTrust’s take on protecting virtualised cloud infrastructure and virtual workloads. I’m also interested to hear more about NetApp’s HCI offering, what’s happening with SolidFire and their VVols integration. You can find out more information and register for the event here. I hope to see you there. Also, if you’re interested in sponsoring one of these events, please get in touch with me and I can help make it happen.

The Thing About NetApp HCI Is …

Disclaimer: I recently attended VMworld 2017 – US.  My flights were paid for by ActualTech Media, VMware provided me with a free pass to the conference and various bits of swag, and Tech Field Day picked up my hotel costs. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

You can view the video of NetApp's presentation here, and download a copy of my rough notes from here.


What’s In A Name?

There’s been some amount of debate about whether NetApp’s HCI offering is really HCI or CI. I’m not going to pick sides in this argument. I appreciate that words mean things and definitions are important, but I’d like to focus more on what NetApp’s offering delivers, rather than whether someone in Tech Marketing made the right decision to call this HCI. Let’s just say they’re closer to HCI than WD is to cloud.


Ye Olde Architectures (The HCI Tax)

NetApp spent some time talking about the “HCI Tax” – the overhead of providing various data services with first generation HCI appliances. Gabe touched on the impact of running various iterations of controller VMs, along with the increased memory requirements for services such as deduplication, erasure coding, compression, and encryption. The model for first generation HCI is simple – grow your storage and compute in lockstep as your performance requirements increase. The great thing with this approach is that you can start small and grow your environment as required. The problem with this approach is that you may only need to grow your storage, or you may only need to grow your compute requirement, but not necessarily both. Granted, a number of HCI vendors now offer storage-only nodes to accommodate this requirement, but NetApp don’t think the approach is as polished as it could be. The requirement to add compute as you add storage can also have a financial impact in terms of the money you’ll spend in licensing for CPUs. Whilst one size fits all has its benefits for linear workloads, this approach still has some problems.


The New Style?

NetApp suggest that their solution offers the ability to "scale on your terms". With this you can:

  • Optimise and protect existing investments;
  • Scale storage and compute together or independently; and
  • Eliminate the “HCI Tax”.

Note that only the storage nodes have disks; the compute nodes get blanks. The disks are on the front of the unit and the nodes are stateless. You can't have different tiers of storage nodes as it's all one cluster. It's also BYO switch for connectivity, supporting 10/25Gbps. In terms of scalability, from a storage perspective you can scale as much as SolidFire can nowadays (around 100 nodes), and your compute nodes are limited by vSphere's maximum configuration.

There are “T-shirt sizes” for implementation, and you can start small with as little as two blocks (2 compute nodes and 4 storage nodes). I don’t believe you mix t-shirt sizes in the same cluster. Makes sense if you think about it for more than a second.



Converged and hyper-converged are different things, and I think this post from Nick Howell (in the context of Cohesity as HCI) sums up the differences nicely. However, what was interesting for me during this presentation wasn't whether or not this qualifies as HCI. Rather, it was about NetApp building on the strengths of SolidFire's storage offering (guaranteed performance with QoS and good scale) coupled with storage / compute independence to provide customers with a solution that seems to tick a lot of boxes for the discerning punter.

Unless you've been living under a rock for the last few years, you'll know that NetApp are quite a different beast to the company founded 25 years ago. The great thing about them (and the other major vendors) entering the already crowded HCI market is that they offer choices that extend beyond the HCI play. For the next few years at least, there are going to be workloads that just may not go so well with HCI. If you're already a fan of NetApp, chances are they'll have an alternative solution that will allow you to leverage their capability and still get the outcome you need. Gabe made the excellent point that "[y]ou can't go from traditional to cloud overnight, you need to evaluate your apps to see where they fit". This is exactly the same with HCI. I'm looking forward to seeing how they go against the more established HCI vendors in the marketplace, and whether the market responds positively to some of the approaches they've taken with the solution.

NetApp Doesn’t Want You To Be Special (This Is A Good Thing)

Disclaimer: I recently attended Storage Field Day 13.  My flights, accommodation and other expenses were paid for by Tech Field Day and Pure Storage. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.


I had the good fortune of seeing Andy Banta present at Storage Field Day 13. He spoke about a number of topics, including the death of the specialised admin, and VMware Virtual Volumes integration with SolidFire. You can find a copy of my rough notes from NetApp's presentation here. You can also find videos from his presentation here.

Changing Times

People have been thinking about the changing role of the specialised IT admin for a while now. The discussion has been traditionally focused on the server administrator’s place in a cloudy world, but the storage administrator’s role is coming under scrutiny in much the same fashion. The reasons for the changing landscape are mostly identical to those that impacted the server administrator’s role:

  • Architectures are becoming easier to manage
  • Tools are designed for rapid deployment, not constant adjustment
  • The hardware is becoming commoditised
  • Software is defining the features and the admin duties

Storage requirements are more dynamic than before, with transient workloads being seen as more commonplace than the static loads once prevalent in the enterprise. The pace of change is also increasing.

According to NetApp, the key focus areas for operational staff have changed as expectations have evolved.

  • Traditional IT has focused on things being "Available and Reliable";
  • The virtualisation age gave us the opportunity to do more with less;
  • The cloud age is causing things to happen faster; and
  • As-a-Service is driving the application evolution.

These new focus areas bring with them a new set of challenges though. As we move from the “legacy” DC to the new now, there are other things we have to consider.

Legacy Data Centre → Next Generation Data Centre:

  • Single Tenant → Multi-tenant
  • Isolated Workloads → Mixed Workloads
  • Dedicated Infrastructure → Shared Infrastructure
  • Scale Up → Scale Out
  • Pre-provisioned Capacity → Capacity on Demand
  • Hardware Defined → Software Defined
  • Project Based → Self Service
  • Manual Administration → Automation


In case you hadn’t realised it, we’re in a bit of a bad way in a lot of enterprises when it comes to IT operations. NetApp neatly identified what’s going wrong in terms of both business and operational limitations.

Business Limitations

  • Unpredictable application performance
  • Slow response to changing business needs
  • Under utilisation of expensive resources

Operational Limitations

  • Storage policies tied to static capabilities
  • All virtual disks treated the same
  • Minimal visibility and control on array
  • Very hands on


The idea is to embrace the “New Evolution” which will improve the situation from both a business and operational perspective.

Business Benefits

  • Guarantee per-application performance
  • Immediately respond to changing needs
  • Scale to match utilisation requirements

Operational benefits

  • Dynamically match storage to application
  • Align virtual disk performance to workload
  • Fully automate control of storage resources
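The "guarantee per-application performance" point is worth unpacking: on SolidFire this is done with per-volume QoS settings (minimum, maximum and burst IOPS). Here's a hedged sketch of building a JSON-RPC call for the Element API's ModifyVolume method; the volume ID and IOPS figures are placeholders, and you'd want to verify the exact parameter names against the Element API documentation for your release:

```python
import json

def build_qos_request(volume_id, min_iops, max_iops, burst_iops):
    """Build a SolidFire Element API JSON-RPC payload to set per-volume QoS.

    The method name and parameter shape follow the Element API's ModifyVolume
    call, but treat them as illustrative rather than authoritative.
    """
    return {
        "method": "ModifyVolume",
        "params": {
            "volumeID": volume_id,
            "qos": {
                "minIOPS": min_iops,     # guaranteed floor for the application
                "maxIOPS": max_iops,     # sustained ceiling
                "burstIOPS": burst_iops, # short-term burst allowance
            },
        },
        "id": 1,
    }

# e.g. guarantee a database volume 1,000 IOPS, cap it at 5,000, allow bursts to 8,000
payload = build_qos_request(37, 1000, 5000, 8000)
print(json.dumps(payload))
# The payload would be POSTed to the cluster's JSON-RPC endpoint.
```

Because the floor and ceiling travel with the volume rather than the node, the noisy-neighbour problem in mixed workloads is handled by policy instead of by hand, which is exactly the operational shift being described above.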


No One is Exempt

Operations is hard. No one misses being focused on server administration; virtualisation administration offered higher value work. NetApp argue that there are higher value activities for the storage discipline as well. Andy summed it up nicely when he said that "[e]nabling through integrations is the goal".

People like tuning in to events like Storage Field Day because the presenters and delegates often get deep into the technology to highlight exactly how widget X works and why product Y is super terrific. But there’s a lot of value to be had in understanding the context within which these products exist too. We run technology to serve applications that help businesses do business things. It doesn’t matter how fast the latest NVMe/F product is if the application it dishes up is platformed on Windows 2003 and SQL 2005. Sometimes it’s nice to talk about things that aren’t directly focused on technology to understand why a lot of us are actually here.

Ultimately, the cloud (in its many incantations) is having a big impact on the day job of a lot of people, as are rapid developments in key cloud technologies, such as storage, compute, virtualisation and software defined everything. It's not only operations staff, but also architects, sales people, coffee shop owners, and all kinds of IT folks within the organisation that are coming to grips with the changing IT landscape. I don't necessarily buy into the "everything is DevOps now and you should learn to code or die" argument, but I also don't think the way we did things 10 years ago is sustainable anywhere but in the largest and crustiest of enterprise IT shops.

NetApp have positioned this viewpoint because they want us to think that what they’re selling is going to help us transition from rock breakers to automation rock stars. And they’re not the first to think that they can help make it happen. Plenty of companies have come along and told us (for years it seems) that they can improve our lot and make all of our problems go away with some smart automation and a good dose of common sense. Unfortunately, people are still running businesses, and people are still making decisions on how technology is being deployed in the businesses. Which is a shame, because I’d much rather let scripts handle the bulk of the operational work and get on with cool stuff like optimising workloads to run faster and smarter and give more value back to the business. I’m also not saying that what NetApp is selling doesn’t work as they say it will. I’m just throwing in the people element as a potential stumbling block.

Is the role of the specialised storage administrator dead? I think it may be a little premature to declare it dead at this stage. But if you’re spending all of your time carving out LUNs by hand and manually zoning fabrics you should be considering the next step in your evolution. You’re not exempt. You’re not doing things that are necessarily special or unique. A lot of this stuff can be automated. And should be. This stuff is science, not wizardry. Let a program do the heavy lifting. You can focus on providing the right inputs. I’m not by any stretch saying that this is an easy transition. Nor do I think that a lot of people have the answers when confronted with this kind of change. But I think it is coming. While vendors like NetApp have been promising to make administration and management of their products easy for years, it feels like we’re a lot closer to that operational nirvana than we were a few years ago. Which I’m really pretty happy about, and you should be too. So don’t be special, at least not in an operational way.

[image courtesy of Stephen Foskett]

NetApp Aren’t Just a Pretty FAS

Disclaimer: I recently attended Storage Field Day 12.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.


Here are some notes from NetApp's presentation at Storage Field Day 12. You can view the video here and download my rough notes here. I made a joke during the presentation about Dave Hitz being lucky enough to sit next to me, but he's the smart guy in this equation.

While I’ve not had an awful lot to do with NetApp previously, it’s not often I get to meet guys like Dave in real life. As such I found the NetApp presentation to be a tremendous experience. But enough about stars in my eyes. Arthur Lent spent some time covering off two technologies that I found intriguing: SnapCenter and Cloud Control for Microsoft Office 365.

[image courtesy of Tech Field Day]


SnapCenter Overview

SnapCenter is a key part of NetApp’s data protection strategy. You can read about this here. Here’s an overview on what was delivered with version 1.0.

End-to-end Data Protection

  • A simple, scalable, single interface to protect enterprise data (physical and virtualised) across the data fabric;
  • Meets SLAs easily by leveraging NTAP technologies;
  • Replaces traditional tape infrastructure with backup to the cloud; and
  • Extensible using user-created custom plug-ins.


Efficient In-place Copy Data Management

  • Leverages your existing NTAP storage infrastructure;
  • Provides visibility of copies across the data fabric; and
  • Enables reuse of copies for test/dev, DR, and analytics.


Accelerated application development

  • Transforms traditional IT to be more agile
  • Empowers application and database admins to self-serve
  • Enables DevOps and data lifecycle management for faster time to market

Sounds pretty good? There’s more though …


New with SnapCenter Version 2.0

  • End-to-end data protection for NAS file services from flash to disk to cloud (public or private);
  • Flexible, cost-effective tape replacement solution;
  • Integrated file catalog for simplified file search and recovery across the hybrid cloud; and
  • Automated protection relationship management and pre-canned backup policies reduce management overhead.

SnapCenter 2.0 also introduces support for custom plug-ins, with two community plug-ins available at release. Why use plug-ins?

  • Some mission critical applications or DBs are difficult to back up;
  • Custom plug-ins offer a way to consistently back up almost anything;
  • Write the plugin once and distribute it to multiple hosts through SnapCenter;
  • Get all the SnapCenter benefits; and
  • A plugin only has the capabilities written into it.
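The plug-in idea boils down to a familiar pattern: quiesce the application, take the snapshot, then release the application. The sketch below is a generic Python illustration of that shape; the class and hook names are invented for the example, as SnapCenter's real plug-in framework defines its own interface:

```python
class CustomAppPlugin:
    """Illustrative application-consistent backup hooks.

    Hypothetical interface: shows the quiesce/snapshot/unquiesce shape
    rather than SnapCenter's actual plug-in contract.
    """

    def __init__(self, app_name):
        self.app_name = app_name
        self.quiesced = False

    def quiesce(self):
        # Flush buffers / suspend writes so the snapshot is consistent.
        self.quiesced = True

    def unquiesce(self):
        # Resume normal application IO.
        self.quiesced = False

def run_backup(plugin, take_snapshot):
    """Run the quiesce -> snapshot -> unquiesce sequence safely."""
    plugin.quiesce()
    try:
        return take_snapshot(plugin.app_name)
    finally:
        plugin.unquiesce()  # always resume IO, even if the snapshot fails

p = CustomAppPlugin("my-weird-database")
snap = run_backup(p, lambda app: f"snap-{app}")
print(snap)  # snap-my-weird-database
```

The value of the framework is that the orchestration (scheduling, distribution to hosts, retention) is handled for you; the plug-in author only supplies the application-specific quiesce logic.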


Cloud Control for Microsoft Office 365

NetApp advised that this product would be “Available Soon”. I don’t know when that is, but you can read more about it here. NetApp says it offers a “[h]ighly scalable, multi-tenant SaaS offering for data protection, security, and compliance”. In short, it:

  • Is a SaaS offering to provide backup for Office 365 data: Exchange Online, SharePoint Online, OneDrive for Business;
  • Is an automated and simplified way to back up copies of customers' critical data;
  • Provides flexibility – select your deployment model, archiving length, backup window;
  • Delivers search-and-browse features as well as granular recovery capabilities to find and restore lost data; and
  • Provides off-boarding capability to migrate users (mailboxes, files, folders) and site collections to on-premises.


Use Cases

  • Retain control of sensitive data as you move users, folders, mailboxes to O365;
  • Enable business continuity with fault-tolerant data protection;
  • Store data securely on NetApp at non-MS locations; and
  • Meet regulatory compliance with cloud-ready services.


Conclusion and Further Reading

In my opinion, the improvements in SnapCenter 2.0 demonstrate NetApp's focus on improving some key elements of the offering, with the ability to use custom plugins being an awesome feature. I'm even more excited by Cloud Control for Office 365, simply because I've lost count of the number of enterprises that have shoved their email services up there ("low-hanging fruit" for cloud migration) and haven't even considered how the hell they're going to protect or retain the data in a useful way ("Doesn't Microsoft do that for me?"). The number of times people have simply overlooked some of the regulatory requirements on corporate email services is troubling, to say the least. If you're an existing or potential NetApp customer this kind of product is something you should be investigating post haste.

Of course, I’ve barely begun to skim the surface of NetApp’s Data Fabric offering. As a relative newcomer, I’m looking forward to diving into this further in the near future. If you’re thinking of doing the same, I recommend you check out this white paper on NetApp Data Fabric Architecture Fundamentals for a great overview of what NetApp are doing in this space.

Storage Field Day – I’ll Be At Storage Field Day 12

In what can only be considered excellent news, I’ll be heading to the US in early March for another Storage Field Day event. If you haven’t heard of the very excellent Tech Field Day events, you should check them out. I’m looking forward to time travel and spending time with some really smart people for a few days. It’s also worth checking back on the Storage Field Day 12 website during the event (March 8 – 10) as there’ll be video streaming and updated links to additional content. You can also see the list of delegates and event-related articles that have been published.

I think it’s a great line-up of presenting companies this time around. There are a few I’m very familiar with and some I’ve not seen in action before.


It’s not quite a total greybeard convention this time around, but I think that’s only because of Jon‘s relentless focus on personal grooming. I won’t do the delegate rundown, but having met a number of these people I can assure the videos will be worth watching.

Here’s the rough schedule (all times are ‘Merican Pacific and may change).

  • Wednesday, March 8, 10:00 – 12:00 – StarWind Presents at Storage Field Day 12
  • Wednesday, March 8, 13:00 – 15:00 – Elastifile Presents at Storage Field Day 12
  • Wednesday, March 8, 16:00 – 18:00 – Excelero Presents at Storage Field Day 12
  • Thursday, March 9, 08:00 – 10:00 – Nimble Storage Presents at Storage Field Day 12
  • Thursday, March 9, 11:00 – 13:00 – NetApp Presents at Storage Field Day 12
  • Thursday, March 9, 14:00 – 16:00 – Datera Presents at Storage Field Day 12
  • Friday, March 10, 09:00 – 10:00 – SNIA Presents at Storage Field Day 12
  • Friday, March 10, 10:30 – 12:30 – Intel Presents at Storage Field Day 12

I’d like to publicly thank in advance the nice folks from Tech Field Day who’ve seen fit to have me back, as well as my employer for giving me time to attend these events. Also big thanks to the companies presenting. It’s going to be a lot of fun. Seriously.

Storage Field Day 7 – Day 1 – Catalogic Software

Disclaimer: I recently attended Storage Field Day 7.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

For each of the presentations I attended at SFD7, there are a few things I want to include in the post. Firstly, you can see video footage of the Catalogic Software presentation here. You can also download my raw notes from the presentation here. Finally, here’s a link to the Catalogic Software website that covers some of what they presented.



According to their website, “ECX is an intelligent copy data management [IDM] platform that allows you to manage, orchestrate and analyze your copy data lifecycle across your enterprise and cloud”. If you’ve ever delivered storage in an enterprise environment before, you’ll understand that copy data management (CDM) is something that can have a significant impact on your infrastructure, and it’s not always something people do well, or even understand.

Ed Walls, CEO of Catalogic, talked a bit about current challenges – growth, manageability, business agility. We're drowning in a deluge of copy data, with most of these copies sitting completely idle. This observation certainly aligns with my experience in a number of environments.

Catalogic’s IDM is a combination of your storage (currently only NetApp) and a CDM platform (provided via an agentless, downloadable VM). You can use this platform to provide “copy data leverage”, enabling orchestration and automation of your copy data. Catalogic also state that this enables you to:

  • Simplify business processes with ‘copy data’ / ‘use data’ workflows;
  • Extract more value from your copy data services;
  • Provide protection compliance / snapshots; and
  • File analytics / Search, Report and Analyse.

In addition to this, Catalogic spoke about ECX’s ability to provide:

  • Next-generation Data protection, with instant recovery and disaster recovery leveraging snap data;
  • Killer App for Hybrid Cloud, enabling business to leverage cloud “scale and economics”; and
  • Copy Data Analytics with snapshots, file analytics, protection compliance. This gives you the ability to search, report and analyse.

ECX isn't in-line; rather, it uses public APIs to orchestrate. In this scenario, tape's not dead, it's just not used for operational recovery. You can use it for archive instead.



The basic architecture is as follows:

  • Layer 0 – OS Services (Linux)
  • Layer 1 – Core Services – NoSQL (MongoDB), scheduler, reporting, directory, licence management, index search, web, Java / REST, DBMS (PostgreSQL), messaging
  • Layer 2 – Management Services – account, policy, job, catalog, report, resource, event, alert, provision, search
  • Layer 3 – Policy-based Services – NTAP catalog, VMware catalog, NTAP CDM, VMware CDM
  • Layer 4 – Presentation Services

Here’s a picture that takes those dot points, and adds visualisation.




Catalogic went through a live demo with us, and it *looks* reasonably straightforward. A few things to note:

  • Configure – uses a provider model (one-time registration process for the NTAP controller or VMware)
  • ECX is an abstraction layer – workflow, notification, submit
  • Uses a site-based model
  • You can have a VMs and Templates or Datastore view




  • VM snapshots are quiesced sequentially
  • Creating trees of snapshots via workflow
  • Everything is driven via REST API
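That last point is the interesting one for automation folks: if everything is driven via REST, then workflows can be scripted. The snippet below only builds a request rather than sending it, and the endpoint path and session header are my guesses at the general shape, not Catalogic's documented API:

```python
import json
import urllib.request

def build_job_request(base_url, session_id, job_id):
    """Build (but don't send) a request to start a hypothetical ECX copy-data job.

    The URL path and auth header are illustrative placeholders; the real
    API's routes and headers should be taken from Catalogic's documentation.
    """
    return urllib.request.Request(
        url=f"{base_url}/api/job/{job_id}?action=start",   # illustrative path
        data=json.dumps({}).encode(),
        headers={
            "X-Session-Id": session_id,                    # illustrative auth header
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_job_request("https://ecx.example.com", "abc123", 42)
print(req.get_method(), req.full_url)
```

The point of the sketch is the shape, not the specifics: once snapshot workflows are just authenticated POSTs, they can be wired into whatever scheduler or orchestration tooling the business already runs.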

Is it a replacement for backup? No. But businesses are struggling with traditional backup and recovery methods, and the combination of snapshots and tapes is appealing for some people. As Catalogic put it, it "doesn't replace it, but reduces the dependency on backups".

In my opinion, searching the catalogue is pretty cool. They don’t crack open the VMDK to catalogue yet, but it’s been requested by a lot of people and is on their radar.


Final Thoughts and Further Reading

There’s a lot to like about ECX in my opinion, although a number of delegates (myself included) were mildly disappointed that this is currently tied to NetApp. Catalogic, in their defence, are well aware of this as a limitation and are working really hard to broaden the storage platform support.

The cataloguing capability of the product looked great in the demo I saw, and I know I have a few customers who could benefit from a different approach to CDM. Or, more accurately, it would be better if they had any approach at all.

Keith had some interesting thoughts on CDM as a potential precursor to data virtualisation here, as well as a preview post here – both of which are worth checking out.