Komprise Continues To Gain Momentum

I first encountered Komprise at Storage Field Day 17, and was impressed by the offering. I recently had the opportunity to take a briefing with Krishna Subramanian, President and COO at Komprise, and thought I’d share some of my notes here.

 

Momentum

Funding

The primary reason for our call was to discuss Komprise’s Series C funding round of US $24 million. You can read the press release here. Some noteworthy achievements include:

  • Revenue more than doubled every single quarter, with existing customers steadily growing how much they manage with Komprise; and
  • Some customers now managing hundreds of PB with Komprise.

 

Key Verticals

Komprise are currently operating in the following key verticals:

  • Genomics and health care, with rapidly growing footprints;
  • Financial and Insurance sectors (5 out of 10 of the largest insurance companies in the world apparently use Komprise);
  • A lot of universities (research-heavy environments); and
  • Media and entertainment.

 

What’s It Do Again?

Komprise manages unstructured data over three key protocols (NFS, SMB, S3). You can read more about the product itself here, but some of the key features include the ability to “Transparently archive data”, as well as being able to put a copy of your data in another location (the cloud, for example).

 

So What’s New?

One of Komprise’s recent announcements was NAS to NAS migration. Say, for example, you’d like to migrate your data from an Isilon environment to FlashBlade. All you have to do is set one as the source and the other as the target. The ACLs are fully preserved across all scenarios, and Komprise does all the heavy lifting in the background.

They’re also working on what they call “Deep Analytics”. Komprise already aggregates file analytics data very efficiently. They’re now working on indexing metadata on files and exposing that index. This will give you “a Google-like search on all your data, no matter where it sits”. The idea is that you can find data using any combination of metadata. The feature is in beta right now, and part of the new funding is being used to expand and grow this capability.
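
To make the "any combination of metadata" idea a little more concrete, here is a minimal sketch in Python. It is not Komprise's implementation or API: the `FileRecord` fields, the sample paths, and the `search` helper are all hypothetical names chosen for illustration, and a real index would sit in a proper search backend rather than a list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical metadata record for one file; fields are illustrative."""
    path: str
    owner: str
    extension: str
    size_bytes: int
    days_since_access: int


def search(index, **criteria):
    """Return records matching every supplied criterion.

    A criterion value is either a literal (exact match) or a callable
    predicate applied to the field value.
    """
    results = []
    for record in index:
        matched = True
        for field_name, wanted in criteria.items():
            value = getattr(record, field_name)
            ok = wanted(value) if callable(wanted) else value == wanted
            if not ok:
                matched = False
                break
        if matched:
            results.append(record)
    return results


# Made-up records spread across two NAS shares and an object store.
index = [
    FileRecord("/nas1/genomics/run42.bam", "alice", ".bam", 50_000_000_000, 400),
    FileRecord("/nas2/finance/q3.xlsx", "bob", ".xlsx", 2_000_000, 12),
    FileRecord("s3://archive/run07.bam", "alice", ".bam", 40_000_000_000, 900),
]

# Alice's .bam files that nobody has touched in over a year, wherever they sit.
cold = search(index, owner="alice", extension=".bam",
              days_since_access=lambda d: d > 365)
print([r.path for r in cold])
```

The point of the sketch is that the query does not care which silo a file lives in; location is just another field in the record.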

 

Other Things?

Komprise can be driven entirely from an API, making it potentially interesting for service providers and VARs wanting to add support for unstructured data and associated offerings to their solutions. You can also use Komprise to “confine” data. The idea behind this is that data can be quarantined (if you’re not sure it’s being used by any applications). Using this feature you can perform staged deletions of data once you understand what applications are using what data (and when).
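
The staged deletion idea can be sketched as a two-step process: quarantine first, delete only after a grace period in which nothing asked for the data. The toy model below is an assumption-laden illustration, not Komprise's data model or API; the `Catalog` class, its method names, and the day-number timestamps are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Toy model of staged deletion; names and structure are hypothetical."""
    last_access: dict                                  # path -> day of last access
    confined_on: dict = field(default_factory=dict)    # path -> day confined
    deleted: set = field(default_factory=set)

    def confine_idle(self, today, idle_days):
        """Stage 1: quarantine files nobody has touched for idle_days."""
        for path, accessed in self.last_access.items():
            if today - accessed >= idle_days and path not in self.confined_on:
                self.confined_on[path] = today

    def access(self, path, today):
        """An application touched the file: record it and release it."""
        self.last_access[path] = today
        self.confined_on.pop(path, None)

    def purge(self, today, grace_days):
        """Stage 2: delete files that stayed confined for the grace period."""
        for path, since in list(self.confined_on.items()):
            if today - since >= grace_days:
                del self.confined_on[path]
                del self.last_access[path]
                self.deleted.add(path)


catalog = Catalog(last_access={"roster-2012.xls": 0, "ledger.db": 0, "report.doc": 95})
catalog.confine_idle(today=100, idle_days=90)   # roster and ledger are confined
catalog.access("ledger.db", today=110)          # an application still needs the ledger
catalog.purge(today=130, grace_days=30)         # only the roster is deleted
print(sorted(catalog.deleted))
```

The grace period is what makes the deletion "staged": the ledger is released the moment something reads it, so only data that stayed untouched while confined is removed.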

 

Thoughts

I don’t often write articles about companies getting additional funding. I’m always very happy when they do, as someone thinks they’re on the right track, and it means that people will continue to stay employed. I thought this was interesting enough news to cover though, given that unstructured data, and its growth and management challenges, is an area I’m interested in.

When I first wrote about Komprise I joked that I needed something like this for my garage. I think it’s still a valid assertion in a way. The enterprise, at least in the unstructured file space, is a mess based on what I’ve seen in the wild. Users and administrators continue to struggle with the sheer volume and size of the data they have under their management. Tools such as this can provide valuable insights into what data is being used in your organisation, and, perhaps more importantly, who is using it. My favourite part is that you can actually do something with this knowledge, using Komprise to copy, migrate, or archive old (and new) data to other locations to potentially reduce the load on your primary storage.

I bang on all the time about the importance of archiving solutions in the enterprise, particularly when companies have petabytes of data under their purview. Yet, for reasons that I can’t fully comprehend, a number of enterprises continue to ignore the problem they have with data hoarding, instead opting to fill their DCs and cloud storage with old data that they don’t use (and very likely don’t need to store). Some of this is due to the fact that some of the traditional archive solution vendors have moved on to other focus areas. And some of it is likely due to the fact that archiving can be complicated if you can’t get the business to agree to stick to their own policies for document management. In just the same way as you can safely delete certain financial information after an amount of time has elapsed, so too can you do this with your corporate data. Or, at the very least, you can choose to store it on infrastructure that doesn’t cost a premium to maintain. I’m not saying “Go to work and delete old stuff”. But, you know, think about what you’re doing with all of that stuff. And if there’s no value in keeping the “kitchen cleaning roster May 2012.xls” file any more, think about deleting it? Or, consider a solution like Komprise to help you make some of those tough decisions.

Imanis Data and MDL autoMation Case Study

Background

I’ve covered Imanis Data in the past, but am the first to admit that their focus area is not something I’m involved with on a daily basis. They recently posted a press release covering a customer success story with MDL autoMation. I had the opportunity to speak with both Peter Smails from Imanis Data, as well as Eric Gutmann from MDL autoMation. Whilst I enjoy speaking to vendors about their successes in the market, I’m even more intrigued by customer champions and what they have to say about their experience with a vendor’s offering. It’s one thing to talk about what you’ve come up with as a product, and how you think it might work well in the real world. It’s entirely another thing to have a customer take the time to speak to people on your behalf and talk about how your product works for them. Ultimately, these are usually interesting conversations, and it’s always useful for me to hear about how various technologies are applied in the real world. Note that I spoke to them separately, so Gutmann wasn’t being pushed in a certain direction by Imanis Data – he’s just really enthusiastic about the solution.

 

The Case Study

The Customer

Founded in 2006, MDL autoMation (MDL) is “one of the automotive industry’s leaders in the application of IoT and SaaS-based technologies for process improvement, automated customer recognition, vehicle tracking and monitoring, personalised customer service and sales, and inventory management”. Gutmann explained to me that for them, “every single customer is a VIP”. There’s a lot of stuff happening on the back-end to make sure that the customer’s experience is an extremely smooth one. MongoDB provides the foundation for the solution. When they first deployed the environment, they used MongoDB Cloud Manager to protect the environment, but struggled to get it to deliver the results they required.

 

Key Challenges

MDL moved to another provider, and spent approximately six months getting it running. It worked well at the time, and met their requirements, saving them money and delivering quick backup on-premises and quick restores. There were a few issues though, including the:

  • Cost and complexity of backup and recovery for a 15-node, sharded, MongoDB deployment across three data centres;
  • Time and complexity associated with daily refresh to non-sharded QA test cluster (it would take 2 days to refresh QA); and
  • Inability to use Active Directory for user access control.

 

Why Imanis Data?

So what got Gutmann and MDL excited about Imanis Data? There were a few reasons that Eric outlined for me, including:

  • 10x backup storage efficiency;
  • 26x faster QA refresh time – incremental restore;
  • 95% reduction in number of policies to manage – enterprise policy engine, the number of policies to manage was reduced from 40 to 2; and
  • Native integration with Active Directory.

It was cheaper again than the previous provider, and, as Gutmann puts it “[i]t took literally hours to implement the Imanis product”. MDL are currently protecting 1.6TB of data, and it takes 7 minutes every hour to back up any changes.
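
A quick back-of-the-envelope check of the figures quoted above, in Python. Two things here are assumptions rather than statements from MDL or Imanis Data: that the 26x improvement applies to the 2-day QA refresh mentioned earlier, and that the hourly backup occupies the full 7 minutes every hour.

```python
# Policy reduction: 40 policies down to 2.
policies_before, policies_after = 40, 2
policy_reduction = (policies_before - policies_after) / policies_before
print(f"Policy reduction: {policy_reduction:.0%}")            # 95%

# QA refresh: 2 days previously, assumed to be the baseline for "26x faster".
refresh_before_hours = 2 * 24
refresh_after_hours = refresh_before_hours / 26
print(f"Implied QA refresh: {refresh_after_hours:.1f} hours")  # 1.8 hours

# Hourly incremental backup: 7 minutes out of every 60.
backup_duty_cycle = 7 / 60
print(f"Backup window share: {backup_duty_cycle:.0%} of each hour")  # 12%
```

The policy figure matches the quoted 95% exactly; the refresh and backup numbers are only as good as the assumptions behind them.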

 

Conclusion and Further Reading

Data protection is a problem that everyone needs to deal with at some level. Whether you have “traditional” infrastructure delivering your applications, or one of those fancy new NoSQL environments, you still need to protect your stuff. There are a lot of built-in features with MongoDB to ensure it’s resilient, but keeping the data safe is another matter. Coupled with that is the fact that developers have relied on data recovery activities to get data into quality assurance environments for years now. Add all that together and you start to see why customers like MDL are so excited when they come across a solution that does what they need it to do.

Working in IT infrastructure (particularly operations) can be a grind at times. Something always seems to be broken or about to break. Something always seems to be going a little bit wrong. The best you can hope for at times is that you can buy products that do what you need them to do to ensure that you can produce value for the business. I think Imanis Data have a good story to tell in terms of the features they offer to protect these kinds of environments. It’s also refreshing to see a customer that is as enthusiastic as MDL is about the functionality and performance of the product, and the engagement as a whole. And as Gutmann pointed out to me, his CEO is always excited about the opportunity to save money. There’s no shame in being honest about that requirement – it’s something we all have to deal with one way or another.

Note that neither of us wanted to focus on the previous / displaced solution, as it serves no real purpose to talk about another vendor in a negative light. Just because that product didn’t do what MDL wanted it to do, doesn’t mean that that product wouldn’t suit other customers and their particular use cases. Like everything in life, you need to understand what your needs and wants are, prioritise them, and then look to find solutions that can fulfil those requirements.

Elastifile Announces Cloud File Service

Elastifile recently announced a partnership with Google to deliver a fully-managed file service delivered via the Google Cloud Platform. I had the opportunity to speak with Jerome McFarland and Dr Allon Cohen about the announcement and thought I’d share some thoughts here.

 

What Is It?

Elastifile Cloud File Service delivers a self-service SaaS experience, providing the ability to consume scalable file storage that’s deeply integrated with Google infrastructure. You could think of it as similar to Amazon’s EFS.

[image courtesy of Elastifile]

 

Benefits

Easy to Use

Why would you want to use this service? It:

  • Eliminates manual infrastructure management;
  • Provisions turnkey file storage capacity in minutes; and
  • Can be delivered in any zone, and any region.

 

Elastic

It’s also cloudy in a lot of the right ways you want things to be cloudy, including:

  • Pay-as-you-go, consumption-based pricing;
  • Flexible pricing tiers to match workflow requirements; and
  • The ability to start small and scale out or in as needed and on-demand.

 

Google Native

One of the real benefits of this kind of solution though, is the deep integration with Google’s Cloud Platform.

  • The UI, deployment, monitoring, and billing are fully integrated;
  • You get a single bill from Google; and
  • The solution has been co-engineered to be GCP-native.

[image courtesy of Elastifile]

 

What About Cloud Filestore?

With Google’s recently announced Cloud Filestore, you get:

  • A single storage tier selection, being Standard or SSD;
  • Availability in-cloud only; and
  • The ability to grow capacity or performance up to a tier capacity.

With Elastifile’s Cloud File Service, you get access to the following features:

  • Aggregates performance & capacity of many VMs
  • Elastically scale-out or -in; on-demand
  • Multiple service tiers for cost flexibility
  • Hybrid cloud, multi-zone / region and cross-cloud support

You can also use ClearTier to perform tiering between file and object without any application modification.

 

Thoughts

I’ve been a fan of Elastifile for a little while now, and I thought their 3.0 release had a fair bit going for it. As you can see from the list of features above, Elastifile are really quite good at leveraging all of the cool things about cloud – it’s software only (someone else’s infrastructure), reasonably priced, flexible, and scalable. It’s a nice change from some vendors who have focussed on being in the cloud without necessarily delivering the flexibility that cloud solutions have promised for so long. Coupled with a robust managed service and some preferential treatment from Google and you’ve got a compelling solution.

Not everyone will want or need a managed service to go with their file storage requirements, but if you’re an existing GCP and / or Elastifile customer, this will make some sense from a technical assurance perspective. The ability to take advantage of features such as ClearTier, combined with the simplicity of keeping it all under the Google umbrella, has a lot of appeal. Elastifile are in the box seat now as far as these kinds of offerings are concerned, and I’m keen to see how the market responds to the solution. If you’re interested in this kind of thing, the Early Access Program opens December 11th with general availability in Q1 2019. In the meantime, if you’d like to try out ECFS on GCP – you can sign up here.

Big Switch Announces AWS Public Cloud Monitoring

Big Switch Networks recently announced Big Mon for AWS. I had the opportunity to speak with Prashant Gandhi (Chief Product Officer) about the announcement and thought I’d share some thoughts here.

The Announcement

Big Switch describe Big Monitoring Fabric Public Cloud (its real product name) as “a seamless deep packet monitoring solution that enables workload monitoring within customer specified Virtual Private Clouds (VPCs). All components of the solution are virtual, with elastic scale-out capability based on traffic volumes.”

[image courtesy of Big Switch]

There are some real benefits to be had, including:

  • Complete AWS Visibility;
  • Multi-VPC support;
  • Elastic scaling; and
  • Consistent with the On-Prem offering.

Capabilities

  • Centralised packet and flow-based monitoring of all VPCs of a user account
  • Visibility-related traffic is kept local for security purposes and cost savings
  • Monitoring and security tools are centralised and tagged within the dedicated VPC for ease of configuration
  • Role-based access control enables multiple teams to operate Big Mon 
  • Supports centralised AWS VPC tool farm to reduce monitoring cost
  • Integrated with Big Switch’s Multi-Cloud Director for centralised hybrid cloud management

Thoughts and Further Reading

It might seem a little odd that I’m covering news from a network platform vendor on this blog, given the heavy focus I’ve had over the years on storage and virtualisation technologies. But the world is changing. I work for a Telco now and cloud is dominating every infrastructure and technology conversation I’m having. Whether it’s private or public or hybrid, cloud is everywhere, and networks are a big part of that cloud conversation (much as they have been in the data centre), as is visibility into those networks.

Big Switch have been around for under 10 years, but they’ve already made some decent headway with their switching platform and east-west monitoring tools. They understand cloud networking, and particularly the challenges facing organisations leveraging complicated cloud networking topologies. 

I’m the first guy to admit that my network chops aren’t as sharp as they could be (if you watched me set up some Google WiFi devices over the weekend, you’d understand). But I also appreciate that visibility is key to having control over what can sometimes be an overly elastic / dynamic infrastructure. It’s been hard to see traffic between availability zones, between instances, and contained in VPNs. I also like that they’ve focussed on a consistent experience between the on-premises offering and the public cloud offering.

If you’re interested in learning more about Big Switch Networks, I also recommend checking out their labs.

Pure Storage Goes All In On Hybrid … Cloud

I recently had the opportunity to hear from Chadd Kenney about Pure Storage’s Cloud Data Services announcement and thought it worthwhile covering here. But before I get into that, Pure have done a little re-branding recently. You’ll now hear them referring to Cloud Data Infrastructure (their on-premises instances of FlashArray, FlashBlade, FlashStack) and Cloud Data Management (being their Pure1 instances).

 

The Announcement

So what is “Cloud Data Services”? It’s comprised of Cloud Block Store, CloudSnap, and StorReduce, each of which is covered in its own section below.

According to Kenney, “[t]he right strategy is and not or, but the enterprise is not very cloudy, and the cloud is not very enterprise-y”. If you’ve spent time in any IT organisation, you’ll see that there is, indeed, a “Cloud divide” in play. What we’ve seen in the last 5 – 10 years is a marked difference in application architectures, consumption and management, and even storage offerings.

[image courtesy of Pure Storage]

 

Cloud Block Store

The first part of the puzzle is probably the most interesting for those of us struggling to move traditional application stacks to a public cloud solution.

[image courtesy of Pure Storage]

According to Pure, Cloud Block Store offers:

  • High reliability, efficiency, and performance;
  • Hybrid mobility and protection; and
  • Seamless APIs on-premises and cloud.

Kenney likens building a Purity solution on AWS to the approach Pure took in the early days of their existence, when they took off the shelf components and used optimised software to make them enterprise-ready. Now they’re doing the same thing with AWS, and addressing a number of the shortcomings of the underlying infrastructure through the application of the Purity architecture.

Features

So why would you want to run virtual Pure controllers on AWS? The idea is that Cloud Block Store:

  • Aggregates performance and reliability across many cloud stores;
  • Can be deployed HA across two availability zones (using ActiveCluster);
  • Is always thin, deduplicated, and compressed;
  • Delivers instant space-saving snapshots; and
  • Is always encrypted.

Management and Orchestration

If you have previous experience with Purity, you’ll appreciate the management and orchestration experience remains the same.

  • Same management, with Pure1 managing on-premises instances and instances in the cloud
  • Consistent APIs on-premises and in cloud
  • Plugins to AWS and VMware automation
  • Open, full-stack orchestration

Use Cases

Pure say that you can use this kind of solution in a number of different scenarios, including DR, backup, and migration in and between clouds. If you want to use ActiveCluster between AWS regions, you might have some trouble with latency, but in those cases other replication options are available.

[image courtesy of Pure Storage]

Note that Cloud Block Store is available in a few different deployment configurations:

  • Test/Dev – using a single controller instance (EBS can’t be attached to more than one EC2 instance)
  • Production – ActiveCluster (2 controllers, either within or across availability zones)

 

CloudSnap

Pure tell us that we’ve moved away from “disk to disk to tape” as a data protection philosophy and we now should be looking at “Flash to Flash to Cloud”. CloudSnap allows FlashArray snapshots to be easily sent to Amazon S3. Note that you don’t necessarily need FlashBlade in your environment to make this work.

[image courtesy of Pure Storage]

For the moment, this is only being certified on AWS.

 

StorReduce for AWS

Pure acquired StorReduce a few months ago and now they’re doing something with it. If you’re not familiar with them, “StorReduce is an object storage deduplication engine, designed to enable simple backup, rapid recovery, cost-effective retention, and powerful data re-use in the Amazon cloud”. You can leverage any array, or existing backup software – it doesn’t need to be a Pure FlashArray.

Features

According to Pure, you get a lot of benefits with StorReduce, including:

  • Object fabric – secure, enterprise ready, highly durable cloud object storage;
  • Efficient – Reduces storage and bandwidth costs by up to 97%, enabling cloud storage to cost-effectively replace disk & tape;
  • Fast – Fastest Deduplication engine on the market. 10s of GiB/s or more sustained 24/7;
  • Cloud Native – Native S3 interface enabling openness, integration, and data portability. All Data & Metadata stored in object store;
  • Single namespace – Stores in a single data hub across your data centre to enable fast local performance and global data protection; and
  • Scalability – Software nodes scale linearly to deliver 100s of PBs and 10s of GBs bandwidth.
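
For readers who haven't looked at how deduplication saves space, here is a minimal sketch of the general technique in Python. It uses fixed-size chunks addressed by their SHA-256 hash, which is a simplifying assumption on my part; it says nothing about how StorReduce actually chunks, indexes, or stores data, and the `DedupStore` class is a hypothetical name.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store using fixed-size chunks (an assumption)."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}    # sha256 hex digest -> chunk bytes
        self.objects = {}   # object key -> ordered list of digests

    def put(self, key, data):
        """Split data into chunks and store each distinct chunk only once."""
        digests = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)
            digests.append(digest)
        self.objects[key] = digests

    def get(self, key):
        """Reassemble an object from its chunk list."""
        return b"".join(self.chunks[d] for d in self.objects[key])

    def stored_bytes(self):
        """Bytes physically held after deduplication."""
        return sum(len(c) for c in self.chunks.values())


store = DedupStore()
monday = b"A" * 8192 + b"B" * 4096
tuesday = b"A" * 8192 + b"C" * 4096   # mostly unchanged since Monday
store.put("backup-mon", monday)
store.put("backup-tue", tuesday)

logical = len(monday) + len(tuesday)
print(logical, store.stored_bytes())   # 24576 logical bytes, 12288 stored
```

Two backups that share most of their content end up costing little more than one, which is why repeated backup data tends to deduplicate so well.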

 

Thoughts and Further Reading

The title of this post was a little misleading, as Pure have been doing various cloud things for some time. But sometimes I give in to my baser instincts and like to try and be creative. It’s fine. In my mind the Cloud Block Store for AWS piece of the Cloud Data Services announcement is possibly the most interesting one. It seems like a lot of companies are announcing these kinds of virtualised versions of their hardware-based appliances that can run on public cloud infrastructure. Some of them are just encapsulated instances of the original code, modified to deal with a VM-like environment, whilst others take better advantage of the public cloud architecture.

So why are so many of the “traditional” vendors producing these kinds of solutions? Well, the folks at AWS are pretty smart, but it’s a generally well understood fact that the enterprise moves at enterprise pace. To that end, they may not be terribly well positioned to spend a lot of time and effort to refactor their applications to a more cloud-friendly architecture. But that doesn’t mean that the CxOs haven’t already been convinced that they don’t need their own infrastructure anymore. So the operations folks are being pushed to migrate out of their DCs and into public cloud provider infrastructure. The problem is that, if you’ve spent a few minutes looking at what the likes of AWS and GCP offer, you’ll see that they’re not really doing things in the same way that their on-premises comrades are. AWS expects you to replicate your data at an application level, for example, because those EC2 instances will sometimes just up and disappear.

So how do you get around the problem of forcing workloads into public cloud without a lot of the safeguards associated with on-premises deployments? You leverage something like Pure’s Cloud Block Store. It overcomes a lot of the issues associated with just running EC2 on EBS, and has the additional benefit of giving your operations folks a consistent management and orchestration experience. Additionally, you can still do things like run ActiveCluster between and within Availability Zones, so your mission critical internal kitchen roster application can stay up and running when an EC2 instance goes bye bye. You’ll pay a bit less or more than you would with normal EBS, but you’ll get some other features too.

I’ve argued before that if enterprises are really serious about getting into public cloud, they should be looking to work towards refactoring their applications. But I also understand that the reality of enterprise application development means that this type of approach is not always possible. After all, enterprises are (generally) in the business of making money. If you come to them and can’t show exactly how they’ll save money by moving to public cloud (and let’s face it, it’s not always an easy argument), then you’ll find it even harder to convince them to undertake significant software engineering efforts simply because the public cloud folks like to do things a certain way. I’m rambling a bit, but my point is that these types of solutions solve a problem that we all wish didn’t exist, but it does.

Justin did a great write-up here that I recommend reading. Note that both Cloud Block Store and StorReduce are in Beta with planned general availability in 2019.

Rubrik Announces Cloud Data Management 5.0 – Drops In A Shedload Of Enhancements

I recently had the opportunity to hear from Chris Wahl about Rubrik CDM 5.0 (codename Andes) and thought it worthwhile covering here.

 

Announcement Summary

  • Instant recovery for Oracle databases;
  • NAS Direct Archive to protect massive unstructured data sets;
  • Microsoft Office 365 support via Polaris SaaS Platform;
  • SAP-certified protection for SAP HANA;
  • Policy-driven protection for Epic EHR; and
  • Rubrik works with Rubrik Datos IO to protect NoSQL databases.

 

New Features and Enhancements

As you can see from the list above, there’s a bunch of new features and enhancements. I’ll try and break down a few of these in the section below.

Oracle Protection

Rubrik have had some level of capability with Oracle protection for a little while now, but things are starting to hot up with 5.0.

  • Simplified configuration (Oracle Auto Protection and Live Mount, Oracle Granular SLA Policy Assignments, and Oracle Automated Instance and Database Discovery)
  • Orchestration of operational and PiT recoveries
  • Increased control for DBAs

NAS Direct Archive

People have lots of data now. Like, a real lot. I don’t know how many Libraries of Congress exactly, but it can be a lot. Previously, you’d have to buy a bunch of Briks to store this data. Rubrik have recognised that this can be a bit of a problem in terms of footprint. With NAS Direct Archive, you can send the data to an “archive” target of your choice. So now you can protect a big chunk of data that goes through the Rubrik environment to an end target such as object storage, public cloud, or NFS. The idea is to reduce the number of Rubrik devices you need to buy. Which seems a bit weird, but their customers will be pretty happy to spend their money elsewhere.

[image courtesy of Rubrik]

It’s simple to get going, requiring a tick of a box to be configured. The metadata remains protected with the Rubrik cluster, and the good news is that nothing changes from the end user recovery experience.

Elastic App Service (EAS)

Rubrik now provides the ability to ingest DBs across a wider spectrum, allowing you to protect more of the DB-based applications you want, not just SQL and Oracle workloads.

SAP HANA Protection

I’m not really into SAP HANA, but plenty of organisations are. Rubrik now offer a SAP Certified Solution which, if you’ve had the misfortune of trying to protect SAP workloads before, is kind of a neat feature.

[image courtesy of Rubrik]

SQL Server Enhancements

There have been some nice enhancements with SQL Server protection, including:

  • A Change Block Tracking (CBT) filter driver to decrease backup windows; and
  • Support for group Volume Shadow Copy Service (VSS) snapshots.

So what about Group Backups? The nice thing about these is that you can protect many databases on the same SQL Server. Rather than process each VSS Snapshot individually, Rubrik will group the databases that belong to the same SLA Domain and process the snapshots as a batch group. There are a few benefits to this approach:

  • It reduces SQL Server overhead, as well as decreases the amount of time a backup requires to be completed; and
  • In turn, it allows customers to take more frequent backups of their databases, delivering a lower RPO to the business.
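
The grouping step described above can be sketched in a few lines of Python. The database names and SLA Domain labels are made up, and treating each group as a single snapshot operation is a simplifying assumption of mine rather than a description of what Rubrik does internally.

```python
from collections import defaultdict

# Hypothetical inventory: (database name, SLA Domain) pairs on one SQL Server.
databases = [
    ("orders", "Gold"),
    ("billing", "Gold"),
    ("wiki", "Bronze"),
    ("inventory", "Gold"),
    ("archive", "Bronze"),
]


def group_by_sla(dbs):
    """Bucket database names by SLA Domain, preserving input order."""
    groups = defaultdict(list)
    for name, sla in dbs:
        groups[sla].append(name)
    return dict(groups)


def snapshot_operations(dbs, grouped):
    """Count snapshot operations: one per database, or one per SLA group."""
    return len(group_by_sla(dbs)) if grouped else len(dbs)


batches = group_by_sla(databases)
print(batches)
print(snapshot_operations(databases, grouped=False),
      snapshot_operations(databases, grouped=True))   # 5 versus 2
```

Fewer snapshot operations on the same server is where the reduced overhead and shorter backup window would come from.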

vSphere Enhancements

Rubrik have done vSphere things since forever, and this release includes a few nice enhancements, including:

  • Live Mount VMDKs from a Snapshot – providing the option to choose to mount specific VMDKs instead of an entire VM; and
  • After selecting the VMDKs, the user can select a specific compatible VM to attach the mounted VMDKs.

Multi-Factor Authentication

The Rubrik Andes 5.0 integration with RSA SecurID will include RSA Authentication Manager 8.2 SP1+ and RSA SecurID Cloud Authentication Service. Note that CDM will not be supporting the older RADIUS protocol. Enabling this is a two-step process:

  • Add the RSA Authentication Manager or RSA Cloud Authentication Service in the Rubrik Dashboard; and
  • Enable RSA and associate a new or existing local Rubrik user or a new or existing LDAP server with the RSA Authentication Manager or RSA Cloud Authentication Service.

You also get the ability to generate API tokens. Note that if you want to interact with the Rubrik CDM CLI (and have MFA enabled) you’ll need these.

Other Bits and Bobs

There are a few other enhancements, including:

  • Windows Bare Metal Recovery;
  • SLA Policy Advanced Configuration;
  • Additional Reporting and Metrics; and
  • Snapshot Retention Enhancements.

 

Thoughts and Further Reading

Wahl introduced the 5.0 briefing by talking about digital transformation as being, at its core, an automation play. The availability of a bunch of SaaS services can lead to fragmentation in your environment, and legacy technology doesn’t deal well with that, which makes transformation harder. Rubrik are positioning themselves as a modern company, well-placed to help you with the challenges of protecting what can quickly become a complex and hard to contain infrastructure. It’s easy to sit back and tell people how transformation can change their business for the better, but these kinds of conversations often eschew the high levels of technical debt in the enterprise that the business is doing its best to ignore. I don’t really think that transformation is as simple as some vendors would have us believe, but I do support the idea that Rubrik are working hard to make complex concepts and tasks as simple as possible. They’ve dropped a shedload of features and enhancements in this release, and have managed to do so in a way that you won’t need to install a bunch of new applications to support these features, and you won’t need to do a lot to get up and running either. For me, this is the key advantage that the “next generation” data protection companies have over their more mature competitors. If you haven’t been around for decades, you very likely don’t offer support for every platform and application under the sun. You also likely don’t have customers that have been with you for 20 years that you need to support regardless of the official support status of their applications. This gives the likes of Rubrik the flexibility to deliver features as and when customers require them, while still focussing on keeping the user experience simple.

I particularly like the NAS Direct Archive feature, as it shows that Rubrik aren’t simply in this to push a bunch of tin onto their customers. A big part of transformation is about doing things smarter, not just faster. The folks at Rubrik understand that there are other solutions out there that can deliver large capacity solutions for protecting big chunks of data (i.e. NAS workloads), so they’ve focussed on leveraging other capabilities, rather than trying to force their customers to fill their data centres with Rubrik gear. This is the kind of thinking that potential customers should find comforting. I think it’s also the kind of approach that a few other vendors would do well to adopt.

*Update*

Here’re some links to other articles on Andes from other folks I read that you may find useful:

Cloudtenna Announces DirectSearch GA

I’ve covered Cloudtenna in the past and had the good fortune to chat with Aaron Ganek about the general availability of Cloudtenna’s universal search product – DirectSearch. I thought I’d share some of my thoughts here.

 

About Cloudtenna

Cloudtenna are focussed on delivering “[t]urn-key search infrastructure designed specifically for files”. If you think of Elasticsearch as being synonymous with log search, then you might also like to think of Cloudtenna delivering an equivalent capability with file search.

The Challenge

According to Cloudtenna, the problem is that “[e]nterprises can’t keep track of files that are scattered across on-premises, cloud, and SaaS apps” and traditional search is a one-size-fits-all solution. In Cloudtenna’s opinion though, file search requires personalised search that reflects things such as ACLs. It’s expensive and difficult to scale.

Cloudtenna’s Solution

So what do Cloudtenna do then? The key features are the ability to:

  • Efficiently ingress massive amounts of data
  • Understand and adhere to user permissions
  • Return queries in near real-time
  • Reduce index storage and compute costs

“DirectSearch” is now generally available, and allows for cross-silo search across services such as Dropbox, Gmail, Slack, Confluence, and so on. It seems reasonably priced at US $10 per user per month. (Note that users who sign up before December 1st 2018 can get a three-month free trial, with no credit card details required.)

DirectSearch CORE

In parallel to the release of DirectSearch, Cloudtenna are also announcing DirectSearch CORE – delivered via an OEM Model. I asked Ganek where he thought this kind of solution was a good fit. He told me that he saw it falling into three main categories:

  • Digital workspace category – e.g. VMware, Citrix. Companies that want to be able to connect files into virtual digital workspaces;
  • Storage space – large storage vendors with SMB and NFS solutions – they might want to provide a global namespace over those transports; and
  • SaaS collaboration – e.g. companies delivering chat, bug tracking, word processing – unify those offerings and give a single view of files.

Cloudtenna describe DirectSearch CORE as a turn-key file search infrastructure offering:

  • Fast query latency;
  • ACL crunching;
  • Deduplication; and
  • Contextual intelligence.

ACLs

One of the big challenges with delivering a solution like DirectSearch is that every data source has its own permissions model, which makes consistent ACL enforcement difficult. Keep in mind that all of these different applications have their own authentication mechanisms, with some using open directory standards, and others doing proprietary stuff. And once you have authentication sorted out, you still need to ensure that users only get access to what they’re allowed to see. Cloudtenna tackle this challenge by ingesting “native ACLs” and normalising those ACLs with metadata.
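To make the idea of normalising ACLs a little more concrete, here’s a minimal Python sketch. It is not Cloudtenna’s implementation. The two source types, the shapes of their “native” ACLs, and the names `Document`, `normalise_acl` and `search` are all made up for illustration. The point is simply that each source’s permissions get mapped to one common form (here, a set of lowercase user identifiers), and search results are then filtered against that common form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    source: str         # which silo the file came from
    name: str           # file name used for matching
    allowed: frozenset  # normalised set of users who may read the file


def normalise_acl(source, native_acl):
    """Map a source-specific ACL (hypothetical shapes) to a set of readers."""
    if source == "fileserver":
        # Assumed shape: list of (user, permission string) tuples, e.g. ("bob", "rw")
        return frozenset(user.lower() for user, perm in native_acl if "r" in perm)
    if source == "saas":
        # Assumed shape: {"viewers": [...], "editors": [...]}
        users = native_acl.get("viewers", []) + native_acl.get("editors", [])
        return frozenset(user.lower() for user in users)
    raise ValueError(f"unknown source type: {source}")


def search(index, user, term):
    """Return names of matching documents that the user is allowed to read."""
    user = user.lower()
    term = term.lower()
    return [
        doc.name
        for doc in index
        if term in doc.name.lower() and user in doc.allowed
    ]
```

The design choice worth noting is that permission filtering happens inside the query path, so a user never sees a hit for a file they couldn’t open in the source system.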

 

Thoughts

Search is hard to do well. You want it to be quick, accurate, and easy to use. You also generally want it to be able to find stuff in all kinds of places. One of the problems with modern infrastructure is that we have access to a whole bunch of content repositories as part of our everyday corporate endeavours. I work with Slack, Dropbox, Box, OneDrive, SharePoint, file servers, Microsoft Teams, iMessage, email, and all kinds of systems as part of my job. I’m the first to admit that I don’t always have a good handle on where some stuff is. And sometimes I use the wrong system because it’s more convenient to access than the correct one is. Now multiply this problem out by the thousands of users in a decent-sized enterprise and you’ve got a recipe for disaster in terms of finding corporate knowledge in a timely fashion. Combine that with billions of files and you’re a passenger on Terry Tate’s pain train. Cloudtenna has quite a job on its hands in terms of delivering on the promise of “[b]ringing order to file chaos”, but if they can do that, it’ll be pretty cool. I’ll be signing up for a trial in the very near future and, if chaotic files aren’t your bag, then maybe you should give it a spin too.

Maxta Announces MxIQ

Maxta recently announced MxIQ. I had the opportunity to speak to Barry Phillips (Chief Marketing Officer) and Kiran Sreenivasamurthy (VP, Product Management) and thought I’d share some information from the announcement here. It’s been a while since I’ve covered Maxta, and you can read my previous thoughts on them here.

 

Introducing MxIQ

MxIQ is Maxta’s support and analytics solution and it focuses on four key aspects:

  • Proactive support through data analytics;
  • Preemptive recommendation engine;
  • Forecast capacity and performance trends; and
  • Resource planning assistance.

Historical data trends for capacity and performance are available, as well as metadata concerning cluster configuration, licensing information, VM inventory and logs.
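Maxta didn’t go into how the forecasting works, so the following is only a rough sketch of one common approach: fit a straight line to historical capacity samples and extrapolate to the point where the cluster fills up. The function names and the sample figures are invented for the example, and a real product would likely use something more sophisticated than a linear trend.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until_full(days, used_tb, capacity_tb):
    """Estimate days remaining after the last sample before capacity is hit.

    Returns None when usage is flat or shrinking, as no fill date exists.
    """
    slope, intercept = linear_fit(days, used_tb)
    if slope <= 0:
        return None
    day_full = (capacity_tb - intercept) / slope
    return max(0.0, day_full - days[-1])


# Hypothetical samples: day number and TB used on that day.
sample_days = [0, 1, 2, 3]
sample_used = [10.0, 11.0, 12.0, 13.0]
remaining = days_until_full(sample_days, sample_used, capacity_tb=20.0)
```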

Architecture

MxIQ is a client-server solution and the server component is currently hosted by Maxta in AWS. This can be decoupled from AWS and hosted in a private DC environment if customers don’t want their data sitting in AWS. The downside of this is that Maxta won’t have visibility into the environment, and you’ll lose a lot of the advantages of aggregated support data and analytics.

[image courtesy of Maxta]

There is a client component that runs on every node in the cluster in the customer site. Note that one agent in each cluster is active, with the other agents communicating with the active agent. From a security perspective, you only need to configure an outbound connection, as the server responds to client requests, but doesn’t initiate communications with the client. This may change in the future as Maxta adds increased functionality to the solution.

From a heartbeat perspective, the agent talks to the server every minute or so. If, for some reason, it doesn’t check in, a support ticket is automatically opened.
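The heartbeat behaviour described above can be sketched in a few lines of Python. This is an illustration rather than anything Maxta showed me: the grace factor, the function names and the ticket text are assumptions, and the only detail taken from the briefing is the roughly one-minute check-in interval.

```python
from datetime import datetime, timedelta

HEARTBEAT_INTERVAL = timedelta(minutes=1)  # "every minute or so"
GRACE_FACTOR = 3  # assumption: tolerate a few missed check-ins before alerting


def overdue_clusters(last_seen, now, interval=HEARTBEAT_INTERVAL, grace=GRACE_FACTOR):
    """Return the sorted IDs of clusters whose last check-in is too old."""
    cutoff = now - interval * grace
    return sorted(cluster for cluster, seen in last_seen.items() if seen < cutoff)


def tickets_to_open(last_seen, now, already_open=()):
    """Build ticket summaries for overdue clusters without an open ticket."""
    open_for = set(already_open)
    return [
        f"Cluster {cluster} missed heartbeat"
        for cluster in overdue_clusters(last_seen, now)
        if cluster not in open_for
    ]
```

Because the agent only makes outbound connections, the server side can’t poll a silent cluster. All it can do is notice that a check-in is overdue, which is why the logic is driven by the time of the last heartbeat received.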

[image courtesy of Maxta]

Privileges

There are three privilege levels that are available with the MxIQ solution.

  • Customer
  • Partner
  • Admin

Note that the Admin (Maxta support) needs to be approved by the customer.

[image courtesy of Maxta]

The dashboard provides an easy to consume overview of what’s going on with managed Maxta clusters, and you can tell at a glance if there are any problems or areas of concern.

[image courtesy of Maxta]

 

Thoughts

I asked the Maxta team if they thought this kind of solution would result in more work for support staff as there’s potentially more information coming in and more support calls being generated. Their opinion was that, as more and more activities were automated, the workload would decrease. Additionally, logs are collected every four hours. This saves Maxta support staff time chasing environmental information after the first call is logged. I also asked whether the issue resolution was automated. Maxta said it wasn’t right now, as it’s still early days for the product, but that’s the direction it’s heading in.

The type of solution that Maxta are delivering here is nothing new in the marketplace, but that doesn’t mean it’s not valuable for Maxta and their customers. I’m a big fan of adding automated support and monitoring to infrastructure environments. It makes it easier for the vendor to gather information about how their product is being used, and it provides the ability for them to be proactive, and super responsive, to customer issues as they arise.

From what I can gather from my conversation with the Maxta team, it seems like there’s a lot of additional functionality they’ll be looking to add to the product as it matures. The real value of the solution will increase over time as customers contribute more and more telemetry and support data. This will obviously improve Maxta’s ability to respond quickly to support issues, and, potentially, give them enough information to avoid some of the more common problems in the first place. Finally, the capacity planning feature will no doubt prove invaluable as customers continue to struggle with growth in their infrastructure environments. I’m really looking forward to seeing how this product evolves over time.

NVMesh 2 – A Compelling Sequel From Excelero

The Announcement

Excelero recently announced NVMesh 2 – the next iteration of their NVMesh product. NVMesh is a software-only solution designed to pool NVMe-based PCIe SSDs.

[image courtesy of Excelero]

Key Features

There are three key features that have been added to NVMesh.

  • MeshConnect – adding support for traditional network technologies TCP/IP and Fibre Channel, giving NVMesh the widest selection of supported protocols and fabrics of software-defined storage platforms along with already supported InfiniBand, RoCE v2, RDMA and NVMe-oF.
  • MeshProtect – offering flexible protection levels for differing application needs, including mirrored and parity-based redundancy.
  • MeshInspect – with performance analytics for pinpointing anomalies quickly and at scale.

Performance

Excelero have said that NVMesh delivers “shared NVMe at local performance and 90+% storage efficiency that helps further drive down the cost per GB”.

Protection

There’s also a range of protection options available now. Excelero tell me that you can start at level 0 (no protection, lowest latency) all the way to “MeshProtect 10+2 (distributed dual parity)”. This allows customers to “choose their preferred level of performance and protection. [While] Distributing data redundancy services eliminates the storage controller bottleneck.”
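As a rough guide to what these protection levels cost in capacity, here’s a small Python sketch of the raw arithmetic. It assumes the simplest possible model, where usable capacity is data divided by data plus redundancy, and the function names are mine. It doesn’t attempt to reproduce Excelero’s quoted efficiency figure, as I don’t know how that is measured.

```python
def parity_efficiency(data_segments, parity_segments):
    """Fraction of raw capacity left for data in a data+parity layout."""
    return data_segments / (data_segments + parity_segments)


def mirror_efficiency(copies):
    """Fraction of raw capacity left for data when keeping full copies."""
    return 1 / copies


# A 10+2 layout keeps 10 of every 12 segments for data (roughly 83%).
ten_plus_two = parity_efficiency(10, 2)

# A two-way mirror keeps half of the raw capacity for data.
two_way_mirror = mirror_efficiency(2)

# No protection (level 0) uses all raw capacity for data.
unprotected = parity_efficiency(1, 0)
```

Under this simple model, a parity layout gives back a lot of the capacity that mirroring spends, at the price of extra work on writes and rebuilds.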

Visibility

One of my favourite things about NVMesh 2 is the MeshInspect feature, with a “built-in statistical collection and display, stored in a scalable NoSQL database”.

[image courtesy of Excelero]

 

Thoughts And Further Reading

Excelero emerged from stealth mode at Storage Field Day 12. I was impressed with their offering back then, and they continue to add features while focussing on delivering top notch performance via a software-only solution. It feels like there’s a lot of attention on NVMe-based storage solutions, and with good reason. These things can go really, really fast. There are a bunch of startups with an NVMe story, and the bigger players are all delivering variations on these solutions as well.

Excelero seem well placed to capitalise on this market interest, and their decision to focus on a software-only play seems wise, particularly given that some of the standards, such as NVMe over TCP, haven’t been fully ratified yet. This approach will also appeal to the aspirational hyperscalers, because they can build their own storage solution, source their own devices, and still benefit from a fast software stack that can deliver performance in spades. Excelero also supports a wide range of transports now, with the addition of NVMe over FC and TCP support.

NVMesh 2 looks to be smoothing some of the rougher edges that were present with version 1, and I’m pumped to see the focus on enhanced visibility via MeshInspect. In my opinion these kinds of tools are critical to the uptake of solutions such as NVMesh in both the enterprise and cloud markets. The broadening of the connectivity story, as well as the enhanced resiliency options, make this something worth investigating. If you’d like to read more, you can access a white paper here (registration required).

Vembu BDR Suite 4.0 Is Coming

Disclaimer

Vembu are a site sponsor of PenguinPunk.net. They’ve asked me to look at their product and write about it. I’m in the early stages of evaluating the BDR Suite in the lab, but thought I’d pass on some information about their upcoming 4.0 release. As always, if you’re interested in these kinds of solutions, I’d encourage you to do your own evaluation and get in touch with the vendor, as everyone’s situation and requirements are different. I can say from experience that the Vembu sales and support staff are very helpful and responsive, and should be able to help you with any queries. I recently did a brief article on getting started with BDR Suite 3.9.1 that you can download from here.

 

New Features

So what’s coming in 4.0?

Hyper-V Cluster Backup

Vembu will support backing up VMs in a Hyper-V cluster and, even if VMs configured for backup are moved from one host to another, the incremental backup will continue to happen without any interruption.

Shared VHDx Backup

Vembu now supports backup of the shared VHDx of Hyper-V.

CheckSum-based Incrementals

Vembu uses Changed Block Tracking (CBT) for incremental backups. In some cases where CBT fails, it will fall back to checksums so that incremental backups continue without interruption.
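If you haven’t come across checksum-based incrementals before, the general technique looks something like the Python sketch below. This is a generic illustration and not Vembu’s code: the block size, the choice of SHA-256 and the function names are all assumptions. The idea is to hash each fixed-size block of the disk, compare the hashes with those recorded at the previous backup, and copy only the blocks that differ.

```python
import hashlib


def block_checksums(data, block_size):
    """Return one SHA-256 hex digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    ]


def changed_blocks(previous_checksums, data, block_size):
    """Compare current blocks with the previous backup's checksums.

    Returns (indexes of new or changed blocks, current checksum list).
    The current list is stored and used as the baseline for the next run.
    """
    current = block_checksums(data, block_size)
    changed = [
        index
        for index, digest in enumerate(current)
        if index >= len(previous_checksums) or digest != previous_checksums[index]
    ]
    return changed, current
```

The trade-off compared with CBT is that every block has to be read and hashed to find the changes, so it’s slower, but it doesn’t depend on the hypervisor’s change tracking being intact.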

Credential Manager

There’s no need to enter credentials every time. Vembu Credential Manager now allows you to manage the credentials of the host and the VMs running on it. This will be particularly handy if you’re doing a lot of application-aware backup job configuration.

 

Thoughts

I had a chance to speak with Vembu about the product’s functionality. There’s a lot to like in terms of breadth of features. I’m interested in seeing how 4.0 goes when it’s released and hope to do a few more articles on the product then. If you’re looking to evaluate the product, this evaluator’s guide is as good a place as any to start. As an aside, Vembu are also offering 10% off their suite this Halloween (until November 2nd) – see here for more details.

For a fuller view of what’s coming in 4.0, you can read Vladan‘s coverage here.