Imanis Data and MDL autoMation Case Study

Background

I’ve covered Imanis Data in the past, but am the first to admit that their focus area is not something I’m involved with on a daily basis. They recently posted a press release covering a customer success story with MDL autoMation. I had the opportunity to speak with both Peter Smails from Imanis Data and Eric Gutmann from MDL autoMation. Whilst I enjoy speaking to vendors about their successes in the market, I’m even more intrigued by customer champions and what they have to say about their experience with a vendor’s offering. It’s one thing to talk about what you’ve come up with as a product, and how you think it might work well in the real world. It’s entirely another thing to have a customer take the time to speak to people on your behalf and talk about how your product works for them. Ultimately, these are usually interesting conversations, and it’s always useful for me to hear about how various technologies are applied in the real world. Note that I spoke to them separately, so Gutmann wasn’t being pushed in a certain direction by Imanis Data – he’s just really enthusiastic about the solution.

 

The Case Study

The Customer

Founded in 2006, MDL autoMation (MDL) is “one of the automotive industry’s leaders in the application of IoT and SaaS-based technologies for process improvement, automated customer recognition, vehicle tracking and monitoring, personalised customer service and sales, and inventory management”. Gutmann explained to me that for them, “every single customer is a VIP”. There’s a lot of stuff happening on the back-end to make sure that the customer’s experience is an extremely smooth one. MongoDB provides the foundation for the solution. When they first deployed the environment, they used MongoDB Cloud Manager to protect the environment, but struggled to get it to deliver the results they required.

 

Key Challenges

MDL moved to another provider, and spent approximately six months getting it running. It worked well at the time, and met their requirements, saving them money and delivering quick backup on-premises and quick restores. There were a few issues though, including the:

  • Cost and complexity of backup and recovery for a 15-node, sharded MongoDB deployment across three data centres;
  • Time and complexity associated with the daily refresh of a non-sharded QA test cluster (it would take 2 days to refresh QA); and
  • Inability to use Active Directory for user access control.

 

Why Imanis Data?

So what got Gutmann and MDL excited about Imanis Data? There were a few reasons that Eric outlined for me, including:

  • 10x backup storage efficiency;
  • 26x faster QA refresh time – incremental restore;
  • 95% reduction in the number of policies to manage – with the enterprise policy engine, the number of policies to manage was reduced from 40 to 2; and
  • Native integration with Active Directory.

It was cheaper again than the previous provider, and, as Gutmann puts it, “[i]t took literally hours to implement the Imanis product”. MDL are currently protecting 1.6TB of data, and it takes 7 minutes every hour to back up any changes.
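The headline numbers above are easy to sanity-check. The following is a quick sketch only: it assumes “2 days” means 48 wall-clock hours and that the 26x figure applies directly to that duration, neither of which was confirmed in the conversation. The function and variable names are purely illustrative.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the percentage drop when going from `before` to `after`."""
    return 100 * (before - after) / before


# 40 backup policies reduced to 2 (figures quoted above).
policy_reduction = percent_reduction(40, 2)

# Assumption: a 2-day QA refresh is 48 hours, sped up by a factor of 26.
qa_refresh_hours_after = (2 * 24) / 26

# 7 minutes of backup activity in every 60-minute window.
backup_duty_cycle_percent = 100 * 7 / 60

print(f"Policy reduction: {policy_reduction:.0f}%")
print(f"QA refresh after speed-up: about {qa_refresh_hours_after:.1f} hours")
print(f"Share of each hour spent backing up: about {backup_duty_cycle_percent:.0f}%")
```

Under those assumptions the 95% figure follows directly from the 40-to-2 policy count, and a two-day refresh shrinks to a little under two hours.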

 

Conclusion and Further Reading

Data protection is a problem that everyone needs to deal with at some level. Whether you have “traditional” infrastructure delivering your applications, or one of those fancy new NoSQL environments, you still need to protect your stuff. There are a lot of built-in features with MongoDB to ensure it’s resilient, but keeping the data safe is another matter. Coupled with that is the fact that developers have relied on data recovery activities to get data into quality assurance environments for years now. Add all that together and you start to see why customers like MDL are so excited when they come across a solution that does what they need it to do.

Working in IT infrastructure (particularly operations) can be a grind at times. Something always seems to be broken or about to break. Something always seems to be going a little bit wrong. The best you can hope for at times is that you can buy products that do what you need them to do to ensure that you can produce value for the business. I think Imanis Data have a good story to tell in terms of the features they offer to protect these kinds of environments. It’s also refreshing to see a customer that is as enthusiastic as MDL is about the functionality and performance of the product, and the engagement as a whole. And as Gutmann pointed out to me, his CEO is always excited about the opportunity to save money. There’s no shame in being honest about that requirement – it’s something we all have to deal with one way or another.

Note that neither of us wanted to focus on the previous / displaced solution, as it serves no real purpose to talk about another vendor in a negative light. Just because that product didn’t do what MDL wanted it to do, doesn’t mean that that product wouldn’t suit other customers and their particular use cases. Like everything in life, you need to understand what your needs and wants are, prioritise them, and then look to find solutions that can fulfil those requirements.

Imanis Data Overview and 4.0 Announcement

I recently had the opportunity to speak with Peter Smails and Jay Desai from Imanis Data. They provided me with an overview of what the company does and a view of their latest product announcement. I thought I’d share some of it here as I found it pretty interesting.

 

Overview

Imanis Data provides enterprise data management for Hadoop and NoSQL running on-premises or in the public cloud.

Data Management

A big part of the Imanis Data story revolves around the “three pillars” of data management, namely:

  • Protection – providing redundancy in case of a disaster;
  • Orchestration – moving data around for different use cases (eg. test and dev, cloud migration, archival); and
  • Automation – using machine learning to automate the data management functions, eg. detecting anomalies (ThreatSense) and SmartPolicies for backups based on RPO/RTO.

The software itself is hardware-agnostic, and can run on any virtual, physical, or container-based platform. It can also run on any cloud, and hence on any storage. You start with 3 nodes, and scale out from there. Imanis Data tell me that everything runs in parallel, and it’s agentless, using native APIs for the platforms. This is a big plus when it comes to protecting these kinds of workloads, as there’s usually a large number of hosts involved, and managing agents everywhere is a real pain.

It also delivers storage optimisation services, and supports erasure coding, compression, and content-aware deduplication. There’s a nice paper on the architecture that you can grab from here.

 

What’s New?

So what’s new with 4.0?

Any Point-in-time Recovery

Imanis Data now provides any point-in-time recovery (APITR) for Couchbase, MongoDB, and Cassandra:

  • APITR can be enabled at bucket level for Couchbase;
  • APITR can be enabled at repository level for Cassandra and MongoDB;
  • Transaction information is aggressively collected from the primary database; and
  • At the time of recovery, the user can pick a date & time.
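To make the idea concrete, here is a minimal sketch of how point-in-time recovery generally works: start from a base snapshot, then replay logged changes up to the chosen date and time. This is a generic illustration, not Imanis Data’s implementation. The `LogEntry` type, the `restore_to_point_in_time` function, and the sample keys are all hypothetical, and the sketch assumes the log is sorted by timestamp and only holds changes made after the snapshot.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class LogEntry:
    """One captured change (hypothetical format). A value of None marks a delete."""
    timestamp: datetime
    key: str
    value: Optional[Any]


def restore_to_point_in_time(
    snapshot: Dict[str, Any],
    log: List[LogEntry],
    target: datetime,
) -> Dict[str, Any]:
    """Rebuild the data as it looked at `target`.

    Assumes `log` is sorted by timestamp and contains only changes
    made after `snapshot` was taken.
    """
    state = dict(snapshot)  # never mutate the stored snapshot
    timestamps = [entry.timestamp for entry in log]
    cutoff = bisect_right(timestamps, target)  # entries at or before target
    for entry in log[:cutoff]:
        if entry.value is None:
            state.pop(entry.key, None)
        else:
            state[entry.key] = entry.value
    return state


# Illustrative data only.
snapshot = {"vehicle:1": "in service"}
log = [
    LogEntry(datetime(2018, 6, 1, 9, 0), "vehicle:2", "arrived"),
    LogEntry(datetime(2018, 6, 1, 9, 30), "vehicle:1", "ready"),
    LogEntry(datetime(2018, 6, 1, 10, 0), "vehicle:2", None),
]
restored = restore_to_point_in_time(snapshot, log, datetime(2018, 6, 1, 9, 45))
print(restored)  # {'vehicle:1': 'ready', 'vehicle:2': 'arrived'}
```

The more frequently changes are captured from the primary database, the finer the choice of recovery points the user has.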

ThreatSense

ThreatSense “learns” from human input and updates the anomaly model. It’s a smart way of doing malware and ransomware detection.

SmartPolicies

What?

  • Autonomous RPO-based backup powered by machine learning;
  • Machine learning model built based on cluster workloads and utilisation;
  • Model determines backup frequency & resource prioritisation;
  • Continuously adapts to meet required RPO; and
  • Provides guidance on required resources to achieve desired RPOs.
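The relationship between RPO and backup frequency can be sketched with some simple arithmetic. To be clear, the example below swaps the machine learning model for a plain heuristic (the slowest recent backup plus a safety margin), so it says nothing about how SmartPolicies works internally. The function name, parameters, and sample numbers are all hypothetical.

```python
from typing import Optional, Sequence


def plan_backup_interval(
    rpo_minutes: float,
    recent_durations_minutes: Sequence[float],
    safety_factor: float = 1.2,
) -> Optional[float]:
    """Suggest the longest start-to-start backup interval that meets the RPO.

    Worst case, a failure hits just before a backup finishes, so the data
    lost spans one full interval plus one backup duration. The interval
    therefore has to satisfy: interval + duration <= RPO.

    Returns None when a single backup already takes longer than the RPO
    allows, which means more resources are needed.
    """
    if not recent_durations_minutes:
        raise ValueError("at least one observed backup duration is required")
    # Heuristic stand-in for a learned model: worst recent run plus a margin.
    expected_duration = max(recent_durations_minutes) * safety_factor
    interval = rpo_minutes - expected_duration
    return interval if interval > 0 else None


# Illustrative numbers: a 60-minute RPO with backups taking 6 to 7 minutes.
interval = plan_backup_interval(60, [6, 7, 7])
print(f"Run a backup at least every {interval:.1f} minutes")

# A 5-minute RPO cannot be met when a backup takes 7 minutes.
print(plan_backup_interval(5, [7]))  # None
```

A `None` result here corresponds to the last bullet above: the system cannot hit the desired RPO with what it has, and has to tell you what extra resources it would need.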

 

Thoughts

I do a lot with a number of data protection vendors in various on-premises and cloud incarnations, but I’m the first to admit that my experience with protection mechanisms for things like NoSQL is non-existent. It seems like that’s not an uncommon problem, and Imanis Data has spent the last 5 or so years working on fixing that for folks.

I’m intrigued by the idea that policies could be applied to objects based on criteria beyond a standard RPO requirement. In the enterprise I frequently run into situations where the RPO is at odds with the capabilities of the protection system, or clashes with some critical processing activity that happens at a certain time each night. Getting the balance right can be challenging at the best of times. Like most things related to automation, if the system can do what I need it to do in the time I need it to happen, I’m going to be happy. Particularly if I don’t need to do anything after I’ve set it to run.

Imanis Data seems to be offering up a pretty cool solution that scales well and does a lot of things that are important for protecting critical workloads. Imanis Data tell me they’re not interested in the relational side of things, and are continuing to focus on their core competency for the moment. It looks like pretty neat stuff and I’m looking forward to seeing what they come up with in the future.