Cohesity Basics – Excluding VMs Using Tags

I’ve been doing some work with Cohesity in our lab and thought it worth covering some of the basic features that I think are pretty neat. In this edition of Cohesity Basics, I thought I’d quickly cover off how to exclude VMs from protection jobs based on assigned tags. In this example I’m using version 6.0.1b_release-20181014_14074e50 (a “feature release”).

 

Process

The first step is to find the VM in vCenter that you want to exclude from a protection job. Right-click on the VM and select Tags & Custom Attributes. Click on Assign Tag.

In the Assign Tag window, click on the New Tag icon.

Assign a name to the new tag, and add a description if that’s what you’re into.

In this example, I’ve created a tag called “COH-Test”, and put it in the “Backup” category.
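If you'd rather script the vCenter side of this, the same category and tag can be created and assigned via the vSphere Automation REST API. Below is a minimal Python sketch using the endpoints as they existed in the vSphere 6.5/6.7-era REST API; the vCenter hostname, credentials, and VM name are placeholders, and it's worth validating the paths against your own vCenter's API explorer before relying on it.

```python
# Minimal sketch: create a "Backup" category, a "COH-Test" tag, and assign it to a VM
# via the vSphere Automation REST API (paths as per the vSphere 6.5/6.7-era API -
# check them against your vCenter's API explorer before using this for real).
import requests

VCENTER = "vcenter.lab.local"   # hypothetical hostname
USER, PASSWORD = "administrator@vsphere.local", "changeme"
VM_NAME = "test-vm-01"          # hypothetical VM name

s = requests.Session()
s.verify = False  # lab only - use proper certificates in production

# Authenticate and keep the session token
resp = s.post(f"https://{VCENTER}/rest/com/vmware/cis/session", auth=(USER, PASSWORD))
s.headers["vmware-api-session-id"] = resp.json()["value"]

# Create the tag category ("Backup")
cat_spec = {"create_spec": {"name": "Backup", "description": "Backup-related tags",
                            "cardinality": "MULTIPLE", "associable_types": []}}
cat_id = s.post(f"https://{VCENTER}/rest/com/vmware/cis/tagging/category",
                json=cat_spec).json()["value"]

# Create the tag ("COH-Test") in that category
tag_spec = {"create_spec": {"name": "COH-Test", "description": "Exclude from Cohesity job",
                            "category_id": cat_id}}
tag_id = s.post(f"https://{VCENTER}/rest/com/vmware/cis/tagging/tag",
                json=tag_spec).json()["value"]

# Look up the VM's managed object ID by name, then attach the tag to it
vm_id = s.get(f"https://{VCENTER}/rest/vcenter/vm",
              params={"filter.names": VM_NAME}).json()["value"][0]["vm"]
s.post(f"https://{VCENTER}/rest/com/vmware/cis/tagging/tag-association/id:{tag_id}"
       "?~action=attach", json={"object_id": {"id": vm_id, "type": "VirtualMachine"}})
```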

Now go to the protection job you’d like to edit.

Click on the Tag icon on the right-hand side. You can then select the tag you created in vCenter. Note that you may need to refresh your vCenter source for this new tag to be reflected.

When you select the tag, you can choose to Auto Protect or Exclude the VM based on the applied tags.

If you drill in to the objects in the protection job, you can see that the VM I wanted to exclude from this job has been excluded based on the assigned tag.

 

Thoughts

I’ve written enthusiastically about Cohesity’s Auto Protect feature previously. Sometimes, though, you need to exclude VMs from protection jobs. Using tags is a quick and easy way to do this, and it’s something that your virtualisation admin team will be happy to use too.

Imanis Data Overview and 4.0 Announcement

I recently had the opportunity to speak with Peter Smails and Jay Desai from Imanis Data. They provided me with an overview of what the company does and a view of their latest product announcement. I thought I’d share some of it here as I found it pretty interesting.

 

Overview

Imanis Data provides enterprise data management for Hadoop and NoSQL running on-premises or in the public cloud.

Data Management

A big part of the Imanis Data story revolves around the “three pillars” of data management, namely:

  • Protection – providing redundancy in case of a disaster;
  • Orchestration – moving data around for different use cases (e.g. test and dev, cloud migration, archival); and
  • Automation – using machine learning to automate the data management functions, e.g. detecting anomalies (ThreatSense) and SmartPolicies for backups based on RPO/RTO.

The software itself is hardware-agnostic, and can run on any virtual, physical, or container-based platform. It can also run on any cloud, and hence on any storage. You start with 3 nodes, and scale out from there. Imanis Data tell me that everything runs in parallel, and it’s agentless, using native APIs for the platforms. This is a big plus when it comes to protecting these kinds of workloads, as there’s usually a large number of hosts involved, and managing agents everywhere is a real pain.

It also delivers storage optimisation services, and supports erasure coding, compression, and content-aware deduplication. There’s a nice paper on the architecture that you can grab from here.

 

What’s New?

So what’s new with 4.0?

Any Point-in-time Recovery

Imanis Data now provides any point-in-time recovery (APITR) for Couchbase, MongoDB, and Cassandra (there's a toy sketch of the general idea after the list below):

  • APITR can be enabled at bucket level for Couchbase;
  • APITR can be enabled at repository level for Cassandra and MongoDB;
  • Aggressively collects transaction information from the primary database; and
  • At time of recovery, user can pick a date & time.
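To make the APITR mechanics a little more concrete, here's a toy Python sketch of the general idea (an illustration only, not Imanis Data's implementation, and all of the data is made up): pick the newest snapshot taken before the requested recovery time, then replay the collected transactions up to that exact point.

```python
# Toy illustration of the general any-point-in-time recovery idea (not Imanis Data's
# actual implementation): pick the newest base snapshot taken before the requested
# recovery time, then replay collected transactions up to that exact moment.
from datetime import datetime

snapshots = [  # (snapshot_time, snapshot_id) - hypothetical data
    (datetime(2018, 10, 1, 0, 0), "snap-001"),
    (datetime(2018, 10, 2, 0, 0), "snap-002"),
]
transactions = [  # (commit_time, operation) - hypothetical captured transaction log
    (datetime(2018, 10, 2, 9, 15), "INSERT ..."),
    (datetime(2018, 10, 2, 9, 45), "UPDATE ..."),
    (datetime(2018, 10, 2, 10, 30), "DELETE ..."),
]

def recovery_plan(recover_to: datetime):
    # Newest snapshot at or before the chosen point in time
    base_time, base_id = max((s for s in snapshots if s[0] <= recover_to),
                             key=lambda s: s[0])
    # Transactions committed after that snapshot but before the chosen point
    replay = [op for t, op in transactions if base_time < t <= recover_to]
    return base_id, replay

base, ops = recovery_plan(datetime(2018, 10, 2, 10, 0))
print(base, ops)  # snap-002 plus the two transactions committed before 10:00
```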

ThreatSense

ThreatSense “learns” from human input and updates the anomaly model. It’s a smart way of doing malware and ransomware detection.

SmartPolicies

What?

  • Autonomous RPO-based backup powered by machine learning;
  • Machine learning model built based on cluster workloads and utilisation;
  • Model determines backup frequency & resource prioritisation;
  • Continuously adapts to meet required RPO; and
  • Provides guidance on required resources to achieve desired RPOs.
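As a rough illustration of the scheduling constraint that sits underneath an RPO-driven policy like this (the product uses a machine learning model; the sketch below just shows the basic arithmetic, with made-up numbers):

```python
# Back-of-the-envelope sketch of the scheduling arithmetic behind an RPO-driven policy
# (the actual product uses a machine learning model; this just shows the constraint a
# scheduler has to satisfy). All figures below are made up for illustration.

def required_interval_hours(rpo_hours: float, changed_gb: float, throughput_gbph: float) -> float:
    """Longest allowable gap between backup starts that still meets the RPO.

    A restore can only go back to the end of the last completed backup, so the
    start-to-start interval plus the backup run time must stay within the RPO.
    """
    backup_runtime = changed_gb / throughput_gbph   # hours to copy the delta
    interval = rpo_hours - backup_runtime           # slack left for the gap
    if interval <= 0:
        raise ValueError("RPO unattainable: add resources or reduce change rate")
    return interval

# e.g. 4-hour RPO, ~200GB of change per cycle, cluster sustains 400GB/h of backup throughput
print(required_interval_hours(rpo_hours=4, changed_gb=200, throughput_gbph=400))  # 3.5
```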

 

Thoughts

I do a lot with a number of data protection vendors in various on-premises and cloud incarnations, but I’m the first to admit that my experience with protection mechanisms for things like NoSQL is non-existent. It seems like that’s not an uncommon problem, and Imanis Data has spent the last 5 or so years working on fixing that for folks.

I’m intrigued by the idea that policies could be applied to objects based on criteria beyond a standard RPO requirement. In the enterprise I frequently run into situations where the RPO is at odds with the capabilities of the protection system, or clashes with some critical processing activity that happens at a certain time each night. Getting the balance right can be challenging at the best of times. Like most things related to automation, if the system can do what I need it to do in the time I need it to happen, I’m going to be happy. Particularly if I don’t need to do anything after I’ve set it to run.

Imanis Data seems to be offering up a pretty cool solution that scales well and does a lot of things that are important for protecting critical workloads. Imanis Data tell me they’re not interested in the relational side of things, and are continuing to focus on their core competency for the moment. It looks like pretty neat stuff and I’m looking forward to seeing what they come up with in the future.

Zerto Announces ZVR 6.5

Zerto recently announced version 6.5 of their Zerto Virtual Replication (ZVR) product and I had the opportunity to speak with Steve Blow and Caroline Seymour about the announcement.

 

Announcement

More Multi-cloud

Zerto 6.5 includes a number of features aimed at accelerating multi-cloud adoption.

Backup Capabilities

Zerto’s Long Term Retention feature has also been enhanced. You now have the ability to do incremental backups – effectively a forever-incremental capability – with synthetic fulls created as required (there's a quick sketch of the idea after the list below). There’s also:

  • Support for Microsoft Data Box Edge using standard storage protocols; and
  • The ability to recover individual VMs out of Virtual Protection Groups.
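For anyone unfamiliar with the synthetic full concept, here's a toy Python sketch of the general idea (not Zerto's implementation, and the block data is invented): the last full is merged with the incrementals taken since, so a new "full" can be produced without re-reading the source workload.

```python
# Toy sketch of what a synthetic full is: merge the last full with the incrementals
# taken since, newest change wins, so you get a new "full" without re-reading the
# source VM. This is just the general concept, not Zerto's implementation.

full_backup = {"block-0": "a0", "block-1": "b0", "block-2": "c0"}   # hypothetical blocks
incrementals = [
    {"block-1": "b1"},                  # changes captured in increment 1
    {"block-2": "c2", "block-3": "d2"}, # changes captured in increment 2
]

def synthesise_full(base: dict, increments: list) -> dict:
    synthetic = dict(base)
    for inc in increments:             # apply oldest to newest
        synthetic.update(inc)          # newer versions of a block replace older ones
    return synthetic

print(synthesise_full(full_backup, incrementals))
# {'block-0': 'a0', 'block-1': 'b1', 'block-2': 'c2', 'block-3': 'd2'}
```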

Analytics

Zerto have worked hard to improve their analytics capabilities, providing:

  • Data for quarterly reports, including SLA compliance;
  • Troubleshooting of monthly data anomalies;
  • Enhanced data about VMs including journal size, throughput, IOPS and WAN; and
  • Cloud Service Provider Client Organisational Filter with enhanced visibility to create customer reports and automatically deliver real-time analysis to clients.

 

Events

Zerto have been busy at Microsoft’s Ignite event recently, and are also holding “IT Resilience Roadshow” events in the U.S. and Europe in the next few months in collaboration with Microsoft. There’s a Zerto+Azure workshop being held at each event, as well as the ability to sit for “Zerto+Azure Specialist” Certification. The workshop will give you the opportunity to use Zerto+Azure to:

  • Create a Disaster Recovery environment in Azure;
  • Migrate End of Life Windows Server 2008/SQL Server 2008 to Azure;
  • Migrate your on-premises data centre to Azure; and
  • Move or protect Linux and other workloads to Azure.

 

Thoughts

I’ve been a fan of Zerto for some time. They’ve historically done a lot with DR solutions and are now moving nicely beyond just DR into “IT Resilience”, with a solution that aims to incorporate a range of features. Zerto have also been pretty transparent with the market in terms of their vision for version 7. There’s an on-demand webinar you can register for that will provide some further insights into what that will bring. I’m a fan of their multi-cloud strategy, and I’m looking forward to seeing that continue to evolve.

I like it when companies aren’t afraid to show their hand a little. Too often companies focus on keeping these announcements a big secret until some special event or arbitrary date in a marketing team’s calendar. I know that Zerto haven’t quite finished version 7 yet, but they have been pretty upfront about the direction they’re trying to head in and some of the ways they’re intending to get there. In my opinion this is a good thing, as it gives their customer base time to prepare, and an idea of what capabilities they’ll be able to leverage in the future. Ultimately, Zerto are providing a solution that is geared up to help protect critical infrastructure assets and move data around to where you need it to be (whether it is planned or not). Zerto seem to understand that the element of surprise isn’t really what their customers are into when looking at these types of solutions. It isn’t always about being the first company to offer this or that capability. Instead, it should be about offering capabilities that actually work reliably.

Hyper-Veeam

Disclaimer: I recently attended VeeamON Forum Sydney 2018. My flights and accommodation were paid for by Veeam. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event. Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

I recently had the opportunity to attend VeeamON Forum in Sydney courtesy of Veeam. I was lucky enough to see Dave Russell‘s keynote speech, and also fortunate to spend some time chatting with him in the afternoon. Dave was great to talk to and I thought I’d share some of the key points here.

 

Hyper All of the Things

If you scroll down Veeam’s website you’ll see mention of a number of different “hyper” things, including hyper-availability. Veeam are keen to position themselves as an availability company, with their core focus being on making the data you need recoverable at the time you need it recovered.

Hyper-critical

Russell mentioned that data has become “hyper-critical” to business, with the likes of:

  • GDPR compliance;
  • PII data retention;
  • PCI compliance requirements;
  • Customer data; and
  • Financial records, etc.

Hyper-growth

Russell also spoke about the hyper-growth of data, with all kinds of data (including structured, unstructured, application, and Internet of Things data) growing at a rapid clip.

Hyper-sprawl

This explosive growth of data has also led to the “hyper-sprawl” of data, with your data now potentially living in any or all of the following locations:

  • SaaS-based solutions
  • Private cloud
  • Public cloud

 

Five Stages of Intelligent Data Management

Russell broke down Intelligent Data Management (IDM) into 5 stages.

Backup

A key part of any data management strategy is the ability to back up all workloads and ensure they are always recoverable in the event of outages, attack, loss or theft.

Aggregation

The ability to cope with data sprawl, as well as growth, means you need to ensure protection and access to data across multiple clouds to drive digital services and ensure continuous business operations.

Visibility

It’s not just about protecting vast chunks of data in multiple places though. You also need to look at the requirement to “improve management of data across multi-clouds with clear, unified visibility and control into usage, performance issues and operations”.

Orchestration

Orchestration, ideally, can then be used to “[s]eamlessly move data to the best location across multi-clouds to ensure business continuity, compliance, security and optimal use of resources for business operations”.

Automation

The final piece of the puzzle is automation. According to Veeam, you can get to a point where the “[d]ata becomes self-managing by learning to backup, migrate to ideal locations based on business needs, secure itself during anomalous activity and recover instantaneously”.

 

Thoughts

Data growth is not a new phenomenon by any stretch, and Veeam obviously aren’t the first to notice that protecting all this stuff can be hard. Sprawl is also becoming a real problem in all types of environments. It’s not just about knowing you have some unstructured data that can impact workflows in a key application. It’s about knowing which cloud platform that data might reside in. If you don’t know where it is, it makes it a lot harder to protect, and your risk profile increases as a result. It’s not just the vendors banging on about data growth through IoT either; it’s a very real phenomenon that is creating all kinds of headaches for CxOs and their operations teams. Much like the push into public cloud by “shadow IT” teams, IoT solutions are popping up in all kinds of unexpected places in the enterprise and making it harder to understand exactly where the important data is being kept and how it’s protected.

Veeam are talking a very good game around intelligent data management. I remember a similar approach being adopted by a three-letter storage company about a decade ago. They lost their way a little under the weight of acquisitions, but the foundation principles seem to still hold water today. Dave Russell obviously saw quite a bit at Gartner in his time there prior to Veeam, so it’s no real surprise that he’s pushing them in this direction.

Backup is just the beginning of the data management problem. There’s a lot else that needs to be done in order to get to the “intelligent” part of the equation. My opinion remains that a lot of enterprises are still some ways away from being there. I also really like Veeam’s focus on moving from policy-based through to a behaviour-based approach to data management.

I’ve been aware of Veeam for a number of years now, and have enjoyed watching them grow as a company. They’re working hard to make their way in the enterprise now, but still have a lot to offer the smaller environments. They tell me they’re committed to remaining a software-only solution, which gives them a certain amount of flexibility in terms of where they focus their R&D efforts. There’s a great cloud story there, and the bread and butter capabilities continue to evolve. I’m looking forward to seeing what they have coming over the next 12 months. It’s a relatively crowded market now, and it’s only going to get more competitive. I’ll be doing a few more articles in the next month or two focusing on some of Veeam’s key products, so stay tuned.

Dell EMC News From VMworld US 2018

I’m not at VMworld US this year, but I had the opportunity to be briefed by Sam Grocott (Dell EMC Cloud Strategy) on some of Dell EMC‘s key announcements during the event, and thought I’d share some of my rough notes and links here. You can read the press release here.

TL;DR?

It is a multi-cloud world. Multi-cloud requires workload mobility. The market requires a consistent experience between on-premises and off-premises. Dell EMC are doing some more stuff around that.

 

Cloud Platforms

Dell EMC offer a number of engineered systems to run both IaaS and cloud native applications.

VxRail

Starting with vSphere 6.7, Dell EMC are saying they’re delivering “near” synchronous software releases between VMware and VxRail. In this case that translates to a delta of less than 30 days between releases. There’s also support for a number of additional capabilities.

VxRack SDDC with VMware Cloud Foundation

  • Support for the latest VCF releases – VCF 2.3.2 – and future-proofing for next-generation VMware cloud technologies
  • Alignment with VxRail hardware options – P, E, V series VxRail models, now including Storage Dense S-series
  • Configuration flexibility

 

Cloud-enabled Infrastructure

The focus is on the data:

  • Cloud data mobility;
  • Cloud data protection;
  • Cloud data services; and
  • Cloud control.

Cloud Data Protection

  • DD Cloud DR – keep copies of VM data from on-premises DD to public cloud and orchestrate failover of workloads to the cloud
  • Data Protection Suite – use cloud storage for backup and retention
  • Cloud Snapshot Manager – Backup and recovery for public cloud workloads (now including Microsoft Azure)
  • Data Domain virtual edition running in the cloud

DD VE 4.0 Enhancements

  • KVM support added for DD VE on-premises
  • In-cloud capacity expanded to 96TB (was 16TB)
  • Can run in AWS, Azure and VMware Cloud

Cloud Data Services

Dell EMC have already announced a number of cloud data services.

And now you can get Dell EMC UnityVSA Cloud Edition.

UnityVSA Cloud Edition

[image courtesy of Dell EMC]

  • Up to 256TB file systems
  • VMware Cloud on AWS

CloudIQ

  • No cost, SaaS offering
  • Predictive analytics – intelligently project capacity and performance
  • Anomaly detection – leverage ML to pinpoint deviations
  • Proactive health – identify risks before they impact the environment

A number of enhancements to CloudIQ were also announced.

Data Domain Cloud Tier

There are some other Data Domain related enhancements, including new AWS support (meaning you can have a single vendor for Long Term Retention).

ECS

ECS enhancements have also been announced, with a 50%+ increase in storage capacity and compute.

 

Thoughts

As would be expected from a company with a large portfolio of products, there’s quite a bit happening on the product enhancement front. Dell EMC are starting to get that they need to be on-board with those pesky cloud types, and they’re also doing a decent job of ensuring their private cloud customers have something to play with as well.

I’m always a little surprised by vendors offering “Cloud Editions” of key products, as it feels a lot like they’re bolting on something to the public cloud when the focus could perhaps be on helping customers get to a cloud-native position sooner. That said, there are good economic reasons to take this approach. By that I mean that there’s always going to be someone who thinks they can just lift and shift their workload to the public cloud, rather than re-factoring their applications. Dell EMC are providing a number of ways to make this a fairly safe undertaking, and products like Unity Cloud Edition provide some nice features such as increased resilience that would be otherwise lacking if the enterprise customer simply dumped its VMs in AWS as-is. I still have hope that we’ll stop doing this as an industry in the near future and embrace some smarter ways of working. But while enterprises are happy enough to spend their money on doing things like they always have, I can’t criticise Dell EMC for wanting a piece of the pie.

Cohesity Announces Helios

I recently had the opportunity to hear from Cohesity (via a vExpert briefing – thanks for organising this TechReckoning!) regarding their Helios announcement and thought I’d share what I know here.

 

What Is It?

If we’re not talking about the god and personification of the Sun, what are we talking about? Cohesity tells me that Helios is a “SaaS-based data and application orchestration and management solution”.

[image courtesy of Cohesity]

Here is the high-level architecture of Helios. There are three main features:

  • Multi-cluster management – Control all your Cohesity clusters located on-premises, in the cloud or at the edge from a single dashboard;
  • SmartAssist – Gives critical global operational data to the IT admin; and
  • Machine Learning Engine – Gives IT Admins machine driven intelligence so that they can make an informed decision.

To deliver all of this, Helios collects, anonymises, aggregates, and analyses globally available metadata, and gives actionable recommendations to IT admins.

 

Multi-cluster Management

Multi-cluster management is just that: the ability to manage more than one cluster through a unified UI. The cool thing is that you can roll out policies or make upgrades across all your locations and clusters with a single click. It also provides you with the ability to monitor your Cohesity infrastructure in real time, as well as being able to search and generate reports on the global infrastructure. Finally, there’s an aggregated, simple-to-use dashboard.

 

SmartAssist

SmartAssist is a feature that provides you with the ability to have smart management of SLAs in the environment. The concept is that if you configure two protection jobs in the environment with competing requirements, the job with the higher SLA will get priority. I like this idea as it prevents people doing silly things with protection jobs.
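As a toy illustration of the SLA-based prioritisation idea (this is not Cohesity's actual scheduler, and the jobs and SLA values are hypothetical), consider something like the following, where the job with the tighter SLA wins any contention:

```python
# Toy sketch of SLA-based prioritisation, in the spirit of what's described above
# (not Cohesity's actual scheduler): when jobs compete for the same window or
# resources, the job with the tighter SLA runs first.

jobs = [  # (job name, SLA / RPO in hours) - hypothetical jobs
    ("file-share-weekly-crawl", 24),
    ("prod-sql-protection", 1),
    ("dev-vms", 8),
]

def run_order(competing_jobs):
    # Tighter SLA (smaller number of hours) wins the contention
    return sorted(competing_jobs, key=lambda job: job[1])

for name, sla in run_order(jobs):
    print(f"{name}: SLA {sla}h")
# prod-sql-protection runs first, then dev-vms, then file-share-weekly-crawl
```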

 

Machine Learning

The Machine Learning part of the solution provides a number of things, including insights into capacity consumption. And proactive wellness? It’s not a pitch for some dodgy natural health product, but instead gives you the ability to perform:

  • Configuration validations, preventing you from doing silly things in your environment;
  • Blacklist version control, stopping known problematic software releases spreading too far in the wild; and
  • Hardware health checks, ensuring things are happy with your hardware (important in a software-defined world).

 

Thoughts and Further Reading

There’s a lot more going on with Helios, but I’d like to have some stick time with it before I have a lot more to say about it. People are perhaps going to be quick to compare this with other SaaS offerings, but I think Cohesity might be doing some different things, with a bit of a different approach. You can’t go five minutes on the Internet without hearing about how ML is changing the world. If nothing else, this solution delivers a much-needed consolidated view of the Cohesity environment. This seems like an obvious thing, but probably hasn’t been necessary until Cohesity landed the type of customers that had multiple clusters installed all over the place.

I also really like the concept of a feature like SmartAssist. There’s only so much guidance you can give people before they have to do some thinking for themselves. Unfortunately, there are still enough environments in the wild where people are making the wrong decision about what priority to place on jobs in their data protection environment. SmartAssist can do a lot to take away the possibility that things will go awry from an SLA perspective.

You can grab a copy of the data sheet here, and read a blog post by Raj Dutt here. El Reg also has some coverage of the announcement here.

Random Short Take #7

Here are a few links to some random things that I think might be useful, to someone. Maybe.

Rubrik Announces Polaris Radar

Polaris?

I’ve written about Rubrik’s Polaris offering in the past, with GPS being the first cab off the rank.  You can think of GPS as the command and control platform, offering multi-cloud control and policy management via the Polaris SaaS framework. I recently had the opportunity to hear from Chris Wahl about Radar and thought it worthwhile covering here.

 

The Announcement

Rubrik announced recently (fine, a few weeks ago) that Polaris Radar is now generally available.

 

The Problem

People don’t want to hear about the problem, because they already know what it is and they want to spend time hearing about how the vendor is going to solve it. I think in this instance, though, it’s worth re-iterating that security attacks happen. A lot. According to the Cisco 2017 Annual Cybersecurity Report, ransomware attacks are growing by more than 350% annually. It’s Rubrik’s position that security is heavily focused on the edge, with firewalls and desktop protection being the main tools deployed. “Defence in depth is lopsided”, with a focus on prevention rather than recovery. According to Wahl, “it’s hard to bounce back fast”.

 

What It Does

So what does Radar do (in the context of Rubrik Polaris)? The idea is to increase the intelligence around knowing when you’ve been hit, and to help you recover faster. The goal of Radar is fairly straightforward, with the following activities being key to the solution:

  • Detection – identify all strains of ransomware;
  • Analysis – understand impact of an attack; and
  • Recovery – restore as quickly as possible.

Radar achieves this by:

  • Detecting anomalies – leverage insights on suspicious activity to accelerate detection;
  • Analysing threat impact – spend less time discovering which applications and files were impacted; and
  • Accelerating recovery – minimise downtime by simplifying manual processes into just a few clicks.

 

How?

Rubrik tell me they use (drumroll please) Machine Learning for detection. Is it really machine learning? That doesn’t really matter for the purpose of this story.

[image courtesy of Rubrik]

The machine learning model learns the baseline behaviour, detects anomalies and alerts as they come in. So how does that work then?

1. Detect anomalies – apply machine learning on application metadata to detect and alert unusual change activity with protected data, such as ransomware.

What happens post anomaly detection?

  • Email alert is sent to user
  • Radar inspects snapshot for encryption
  • Results uploaded to Polaris
  • User informed of results (via the Polaris UI)

2. Analyse threat impact – Visualise how an attack impacted the system with a detailed view of file content changes at the time of the event.

3. Accelerate recovery – Select all impacted resources, specify the desired location, and restore the most recent clean versions with a few clicks. Rubrik automates the rest of the restore process.
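To make the detection step (1, above) a little more concrete, here's a toy Python sketch of metadata-based anomaly detection in the same spirit (this is not Rubrik's model, and the change counts are made up): learn a baseline of per-snapshot change activity, then flag anything that deviates wildly from it.

```python
# Toy sketch of metadata-based anomaly detection in the spirit of the flow above
# (not Rubrik's model): learn a baseline of per-snapshot change activity, then flag
# snapshots whose change rate sits far outside that baseline - the sort of spike
# ransomware encrypting a filesystem tends to produce.
import statistics

# Hypothetical history: number of files changed between consecutive snapshots
baseline_changes = [1200, 950, 1100, 1300, 1050, 980, 1150]

def is_anomalous(new_change_count: int, history: list, threshold_sigmas: float = 3.0) -> bool:
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history) or 1.0   # avoid divide-by-zero on a flat history
    return abs(new_change_count - mean) / stdev > threshold_sigmas

print(is_anomalous(1180, baseline_changes))    # False - a normal day
print(is_anomalous(480000, baseline_changes))  # True - mass change, worth inspecting
```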

 

Thoughts and Further Reading

I think there’s a good story to tell with Polaris. SaaS is an accessible way of delivering features to the customer base without the angst traditionally associated with appliance platform upgrades. Data security should be a big part of data protection. After all, data protection is generally critical to recovery once there’s been a serious breach. We’re no longer just protecting against users inside the organisation accidentally deleting large chunks of data, or having to recover from serious equipment failures. Instead, we’re faced with the reality that a bunch of idiots with bad intentions are out to wreck some of our stuff and make a bit of coin on the side. The sooner you know something has gone awry, the quicker you can hopefully recover from the problem (and potentially re-evaluate some of your security). Being attacked shouldn’t be about being ashamed, but it should be about being able to quickly recover and get on with whatever your company does to make its way in the world. With this in mind, I think that Rubrik are on the right track.

You can grab the data sheet from here, and Chris has an article worth checking out here. You can also register to access the Technical Overview here.

Rubrik Basics – SLA Domains

I’ve been doing some work with Rubrik in our lab and thought it worth covering some of the basic features that I think are pretty neat. In this edition of Rubrik Basics, I thought I’d quickly cover off Service Level Agreement (SLA) Domains – one of the key tenets of the Rubrik architecture.

 

The Defaults

Rubrik CDM has three default local SLA Domains. Of course, they’re named after precious metals. There’s something about Gold that people seem to understand better than calling things Tier 0, 1 and 2. The defaults are Gold, Silver, and Bronze. The problem, of course, is that people start to ask for Platinum because they’re very important. The good news is you can create SLA Domains and call them whatever you want. I created one called Adamantium. Snick snick.

Note that these policies have the archival policy and the replication policy disabled, don’t have a Snapshot Window configured, and do not set a Take First Full Snapshot time. I recommend you leave the defaults as they are and create some new SLA Domains that align with what you want to deliver in your enterprise.

 

Service Level Agreement

There are two components to the SLA Domain. The first is the Service Level Agreement, which defines a number of things, including the frequency of snapshot creation and their retention. Note that you can’t go below an hour for your snapshot frequency (unless I’ve done something wrong here). You can go berserk with retention though. Keep those “kitchen duty roster.xls” files for 25 years if you like. Modern office life can be gruelling at times.

A nice feature is the ability to configure a Snapshot Window. The idea is that you can enforce time periods where you don’t perform operations on the systems being protected by the SLA Domain. This is handy if you’ve got systems that run batch processing or just need a little time to themselves every day to reflect on their place in the world. Every system needs a little time every now and then.

If you have a number of settings in the SLA, the Rubrik cluster creates snapshots to satisfy the smallest frequency that is specified. If the Hourly rule has the smallest frequency, it works to that. If the Daily rule has the smallest frequency, it works to that, and so on. Snapshot expiration is determined by the rules you put in place combined with their frequency.
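Here's a toy Python sketch of how a set of frequency and retention rules like this can drive a single snapshot schedule (an illustration of the behaviour described above, not Rubrik's actual engine; the rules themselves are made up):

```python
# Toy sketch of how a set of frequency/retention rules can drive a single snapshot
# schedule, in the spirit of the behaviour described above (not Rubrik's engine).
# Hypothetical rules: take hourlies and keep them 3 days; keep one per day for 30
# days; keep one per month for 12 months.

HOURS_PER_DAY, HOURS_PER_MONTH = 24, 24 * 30

rules = {  # frequency_hours: retention_hours
    1: 3 * HOURS_PER_DAY,
    HOURS_PER_DAY: 30 * HOURS_PER_DAY,
    HOURS_PER_MONTH: 12 * HOURS_PER_MONTH,
}

snapshot_interval = min(rules)   # snapshots are actually taken at the smallest frequency

def is_retained(snapshot_age_hours: float) -> bool:
    """A snapshot survives if any rule still wants a copy at its age boundary."""
    return any(snapshot_age_hours % freq < snapshot_interval and snapshot_age_hours <= keep
               for freq, keep in rules.items())

print(snapshot_interval)          # 1 - hourly snapshots satisfy every rule
print(is_retained(2 * 24))        # True  - within the 3-day hourly retention window
print(is_retained(10 * 24))       # True  - kept as a daily (age is a multiple of 24h)
print(is_retained(10 * 24 + 5))   # False - an hourly older than 3 days, not on a daily boundary
```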

 

Remote Settings

The second page of the Create SLA Domain window is where you can configure the remote settings. I wrote an article on setting up Archival Locations previously – this is where you can take advantage of that. One of the cool things about Rubrik’s retention policy is that you can choose to send a bunch of stuff to an off-site location and keep, say, 30 days of data on Brik. The idea is that you don’t then have to invest in a tonne of Briks, so to speak, to satisfy your organisation’s data protection retention policy.

 

Thoughts

If you’ve had the opportunity to test-drive Rubrik’s offering, you’ll know that everything about it is pretty simple. From deployment to ongoing operation, there aren’t a whole lot of nerd knobs to play with. It nonetheless does the job of protecting the workloads you point it at. A lot of the complexity normally associated with data protection is masked by a fairly simple model that will hopefully make data protection a little more appealing for the average Joe or Josie responsible for infrastructure operations.

Rubrik, and a number of other solution vendors, are talking a lot about service levels and policy-driven data protection. The idea is that you can protect your data based on a service catalogue type offering rather than the old style of periodic protection that was offered with little flexibility (“We backup daily, we keep it 90 days, and sometimes we keep the monthly tape for longer”). This strikes me as an intuitive way to deliver data protection capabilities, provided that your business knows what they want (or need) from the solution. That’s always the key to success – understanding what the business actually needs to stay in business. You can do a lot with modern data protection offerings. Call it SLA-based, talk about service level objectives, make t-shirts with “policy-driven” on them and hand them out to your executives. But unless you understand what’s important for your business to stay in business when there’s a problem, then it won’t really matter which solution you’ve chosen.

Chris Wahl wrote some neat posts (a little while ago) on SLAs and their challenges on the Rubrik blog that you can read here and here.

Dell EMC Announces IDPA DP4400

Dell EMC announced the Integrated Data Protection Appliance (IDPA) at Dell EMC World in May 2017. They recently announced a new addition to the lineup, the IDPA DP4400. I had the opportunity to speak with Steve Reichwein about it and thought I’d share some of my thoughts here.

 

The Announcement

Overview

One of the key differences between this offering and previous IDPA products is the form factor. The DP4400 is a 2RU appliance (based on a PowerEdge server) with the following features:

  • Capacity starts at 24TB, growing in increments of 12TB, up to 96TB useable. The capacity increase is done via licensing, so there’s no additional hardware required (who doesn’t love the golden screwdriver?)
  • Search and reporting is built in to the appliance
  • There are Cloud Tier (ECS, AWS, Azure, Virtustream, etc) and Cloud DR options (S3 at this stage, but that will change in the future)
  • There’s the IDPA System Manager (Data Protection Central), along with Data Domain DD/VE (3.1) and Avamar (7.5.1)

[image courtesy of Dell EMC]

It’s hosted on vSphere 6.5, and the whole stack is referred to as IDPA 2.2. Note that you can’t upgrade the components individually.

 

Hardware Details

Storage Configuration

  • 18x 12TB 3.5″ SAS Drives (12 front, 2 rear, 4 mid-plane)
    • 12TB RAID1 (1+1) – VM Storage
    • 72TB RAID6 (6+2) – DDVE File System Spindle-group 1
    • 72TB RAID6 (6+2) – DDVE File System Spindle-group 2
  • 240GB BOSS Card
    • 240GB RAID1 (1+1 M.2) – ESXi 6.5 Boot Drive
  • 1.6TB NVMe Card
    • 960GB SSD – DDVE cache-tier

System Performance

  • 2x Intel Silver 4114 10-core 2.2GHz
  • Up to 40 vCPU system capacity
  • Memory of 256GB (8x 32GB RDIMMs, 2667MT/s)

Networking-wise, the appliance has 8x 10GbE ports using either SFP+ or Twinax. There’s a management port for initial configuration, along with an iDRAC port that’s disabled by default, but can be configured if required. If you’re using Avamar NDMP accelerator nodes in your environment, you can integrate an existing node with the DP4400. Note that it supports one accelerator node per appliance.

 

Put On Your Pointy Hat

One of the nice things about the appliance (particularly if you’ve ever had to build a data protection environment based on Data Domain and Avamar) is that you can set up everything you need to get started via a simple-to-use installation wizard.

[image courtesy of Dell EMC]

 

Thoughts and Further Reading

I talked to Steve about what he thought the key differentiators were for the DP4400. He talked about:

  • Ecosystem breadth;
  • Network bandwidth; and
  • Guaranteed dedupe ratio (55:1 vs 5:1?)

He also mentioned the capability of a product like Data Protection Central to manage an extremely large ROBO environment. He said these were some of the opportunities where he felt Dell EMC had an edge over the competition.

I can certainly attest to the breadth of ecosystem support being a big advantage for Dell EMC over some of its competitors. Avamar and DD/VE have also demonstrated some pretty decent chops when it comes to bandwidth-constrained environments in need of data protection. I think it’s great that Dell EMC are delivering these kinds of solutions to market. For every shop willing to go with relative newcomers like Cohesity or Rubrik, there are plenty who still want to buy data protection from Dell EMC, IBM or Commvault. Dell EMC are being fairly upfront about what they think this type of appliance will support in terms of workload, and they’ve clearly been keeping an eye on the competition with regards to usability and integration. People who’ve used Avamar in real life have been generally happy with the performance and feature set, and this is going to be a big selling point for people who aren’t fans of NetWorker.

I’m not going to tell you that one vendor is offering a better solution than the others. You shouldn’t be making strategic decisions based on technical specs and marketing brochures in any case. Some environments are going to like this solution because it fits well with their broader strategy of buying from Dell EMC. Some people will like it because it might be a change from their current approach of building their own solutions. And some people might like to buy it because they think Dell EMC’s post-sales support is great. These are all good reasons to look into the DP4400.

Preston did a write-up on the DP4400 that you can read here. The IDPA DP4400 landing page can be found here. There’s also a Wikibon CrowdChat on next generation data protection being held on August 15th (2am on the 16th, Australian time) that will be worth checking out.