Exablox Isn’t Just Pretty Hardware

Disclaimer: I recently attended Storage Field Day 10.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.


Before I get started, you can find a link to my raw notes on Exablox’s presentation here. You can also see videos of the presentation here. You can find a preview post from Chris M. Evans here.

 

It’s Not Just the Hardware

I waxed lyrical about the Exablox hardware platform after seeing it at Storage Field Day 7. But while the OneBlox hardware is indeed pretty cool (you can see the specifications here), the cloud-based monitoring platform, OneSystem, is really the interesting bit.

According to Exablox, the “OneSystem application is used to combine OneBlox appliances into Rings as well as configuring shares, user access, and remote replication”. It’s the mechanism used for configuration, as well as monitoring, alerting and reporting.

OneSystem is built on a cloud-based, multi-tenant architecture. There’s nothing to install for organisations, VARs, and MSPs. Although if you feel a bit special about how your data is treated, there is an optional, private OneSystem deployment available for on-premises management. Exablox pride themselves on the “world-class” support they provide to customers, with a customer-first culture being one of the dominant themes when talking to them about support capability. Some of the other benefits of the OneSystem approach are:

  • The ability to globally manage OneBlox anywhere; and
  • Seamless OneBlox software upgrades.

Exablox also provide 24×7 proactive monitoring, providing insight into, amongst other things:

  • Storage utilisation and analysis;
  • Storage health and alerts; and
  • OneBlox drive health.

The cool thing about this platform is that it offers the ability to configure custom storage policies and simple scaling for individual applications. In this manner you can configure the following data services on a “per application” basis:

  • Variable or fixed-length deduplication;
  • Compression on/off;
  • Continuous data protection on/off and retention; and
  • Remote replication on/off.

 

I Want My Data Everywhere

While the OneBlox ring is currently limited to 7 systems per cluster, you can have two or more (up to 10) clusters operating in a mesh for replication. You can then conceivably have a whole bunch of different data protection schemes in place depending on what you need to protect and where you need it protected. The great thing is that, with the latest version of OneSystem, you can have a one-to-many replication relationship between directories as well. This kind of flexibility is really neat in my opinion. Note that replication is asynchronous.

SFD10_Exablox_Mutli-siteReplication

 

Further Reading and Final Thoughts

If you’ve read any of my recent posts on the likes of Pure, Nimble and Tintri, it would feel like everyone and their dog is into cloud-based monitoring and analytics systems for storage platforms. This is in no way a bad thing, and something that I’m glad we’re seeing become a prevalent feature with these “modern” storage architectures. We store a whole bunch of data on these things. And sometimes it’s even data that is vital to the success of the various business endeavours we undertake on a daily basis. So it’s great to see vendors are taking this requirement seriously. It also helps somewhat that people are a little more comfortable with the concept of keeping information in “the cloud”. This certainly helps the vendors control the end user experience from a support viewpoint, rather than relying on arcane systems deployed across multiple VMs that invariably fail at the time you need to dig into the data to find out what’s really going on in the environment.

Exablox have come up with a fairly unique approach to scale-out NAS, and I’m keen to see where they take it from here. Features such as remote replication and the continuing maturity of the OneSystem platform make me think that they’re gearing up to push things a little beyond the BYO drives SMB space. I’ll be interested to see just how that plays out.

Ray Lucchesi did a thorough write-up on Exablox that you can read here, while Francesco Bonetti did a great write-up here. Exablox has also published a technical overview of OneBlox and OneSystem that is worth checking out.

 

Storage Field Day 7 – Day 3 – Exablox

Disclaimer: I recently attended Storage Field Day 7.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

For each of the presentations I attended at SFD7, there are a few things I want to include in the post. Firstly, you can see video footage of the Exablox presentation here. You can also download my raw notes from the presentation here. Finally, here’s a link to the Exablox website that covers some of what they presented.

Brief Overview

Exablox was founded in 2010 and launched publicly in April 2013. There are two key elements to their solution:

  • OneBlox – scale-out storage for the enterprise, offering converged storage for primary and backup / archival data; and
  • OneSystem – manage on-premises storage exclusively from anywhere, providing visibility, control, and security without cost / complexity of traditional management

Here’s a photo of Tad Hunt (CTO and Co-founder) showing us the internals of the Exablox appliance.

IMG_1214_11

 

Architecture

Exablox started the presentation by talking about what we want from storage re-imagined (my words, not theirs):

  • Scale out;
  • Deduplication;
  • Snapshots;
  • Replication;
  • Be simple yet powerful; and
  • Be managed from everywhere.

The Exablox approach is not your father’s standard storage presentation play. Rather than providing block storage, or object storage via APIs, it presents file protocols (SMB / NFS) via the front-end and services these with object storage on the back-end.

exablox-architecture-diagram

Technology Vision

Exablox’s approach revolves around software-defined storage (SDS) and storage management, with the following goals:

  • Manage the policy, not the technology;
  • SDS “wrapped in tin” for the mid market;
  • Eliminate complexity;
  • Plug-and-play; and
  • Next generation features.

They deliver NAS features atop object storage:

  • Without metadata servers;
  • Without bolt-on NAS gateways;
  • Without separate data and metadata servers; and
  • To scale capacity, performance, or resilience: just add a node.

 

Technology Benefits

Exablox say they can create scale-out NAS and object clusters atop mixed media – HDD, SSD, Shingled drives. This approach delivers the benefits of object storage technology to traditional applications:

  • By using standard file protocols; and
  • By eliminating forklift upgrades – single namespace across the scale of the cluster.

They also use “RAID-free” data protection:

  • Self-healing from multiple drive and node failures;
  • Rebalancing time proportional to the quantity of objects on the failed drive;
  • Mix and match drive types, capacities, technologies; and
  • Introduce next generation drives without long validation cycles.

This provides the ability to scale capacity from TB to PB easily, whilst also offering:

  • Zero configuration expansion; and
  • Manage from anywhere capability.

Exablox say they are able to support all NAS workloads well. Whereas other object stores are designed primarily for large files, a OneBlox 3308 can handle 1B objects. All nodes perform all functions: storage, control, NAS interface, with a node being a single failure domain.

 

Hardware Notes and Thoughts

For the purposes of this post, I wanted to focus on the OneBlox appliance. While the OneSystem architecture is super neat, I still get a bit of a nerd tingle when I see some nice hardware. (BTW if Exablox want me to test one long-term I’d be happy to oblige).

Exablox claims to be the sole provider of the following features in a single storage solution:

  • Scale-out deduplication;
  • Scale-out, continuous snapshots;
  • Scale-out, RAID-less capacity;
  • Scale-out, site-to-site disaster recovery; and
  • Bring any drive – one at a time at retail pricing.

They also support auto-clustering, with each node adding:

  • Capacity;
  • Performance; and
  • Resiliency.

The Exablox 3308 appliance:

  • Is seriously bloody quiet;
  • Uses 100W under peak load;
  • Has 8 * 3.5” drive bays, supporting up to 48 raw TB; and
  • Can use a mix of SATA & SAS drives.

Here is a picture of some appliances in a rack.

IMG_1213_cropped

Further Reading

I was impressed with the strategy presented to me by Exablox, and the apparent ease of deployment and overall design of the appliance seemed great on the surface. I’d like to be clear that I haven’t used these in the wild, nor have I had any view of any benchmark data, so I can’t comment as to the effective performance of these devices. Like most things in storage, your mileage might vary. But I will say they seem quite inexpensive for what they do, and I recommend taking a more detailed look at them.

I also recommend you check out Keith’s preview post on Exablox.  For a different perspective on the hardware, have a look at Storage Review’s take on things as well.

EMC – RecoverPoint for Virtual Machines

EMC announced RecoverPoint for VMs last week, and I thought I’d do a quick summary post / highlights for those who missed it.

Firstly, an overview from EMC can be found here. You can get the datasheet here. And you can watch an overview video of the features here.

Audience

Secondly, it’s important to understand where EMC is pitching this product. Both the traditional RecoverPoint appliance and the RecoverPoint Virtual Edition have been aimed at storage admins. RecoverPoint

  • protects LUNs;
  • is managed through Unisphere;
  • is deployed on physical hardware appliances, using embedded storage array splitters in VMAX, VNX, and VPLEX; and
  • supports over 50 storage systems, including EMC and 3rd party arrays using the VPLEX splitter.

RecoverPoint Virtual Edition removes the need for dedicated EMC hardware appliances.  RP VE

  • protects storage LUNs;
  • is managed through Unisphere;
  • is deployed as virtual appliances on existing ESXi servers, and uses the embedded array splitter in VNX; and
  • currently only supports EMC VNX.

RPVM1

So what about RP for VMs? RP for VMs

  • protects at the VM level;
  • is fully managed through vCenter;
  • is deployed as a virtual appliance on existing ESXi servers;
  • has an embedded I/O splitter within the vSphere kernel; and
  • is storage agnostic and supports any SAN, vSAN, NAS or DAS storage arrays on VMware’s HCL.

RPVM2

It’s critical to note that this is a completely separate product from RecoverPoint – there is no upgrade, no downgrade and no interoperability with the existing RP products.

It does support both VMDKs and RDMs (this is a good thing).

Architecture

It’s comprised of:

  • a VMware vCenter plug-in;
  • a RecoverPoint write-splitter embedded in vSphere; and
  • virtual appliances

Here’s a picture that shows the different elements.

RPVM3

Deployment

The splitters are deployed as VIBs, while the appliances come in OVF format. Management is performed using a plug-in via the vCenter Web UI.

Licensing

The RecoverPoint for Virtual Machines product uses a VM-based licensing model and is priced per VM (starting at a minimum of 15 VMs). Note that there is no transfer of licenses between the RecoverPoint and the RecoverPoint for Virtual Machines products.

Summary

This is going to be a handy product for people looking for a contained appliance, with flexible deployment options, that will provide synchronous replication performance (if required and subject to certain constraints). I’m looking forward to taking it for a spin.

EMC – HeatMap Analyzer – Basics – Part 3 – Charts

This is part 3 of a series where I will go into a little more detail on how you use HeatMap Analyzer. In this article I will look at the different sorts of charts that can be drawn using the Analyzer component.

 

Line Chart

HMA_Basics_Charts_001

This is a standard Line chart. Line style charts (Line, Stacked and Line by Day) are all comprised of two sections – the main chart (upper) and the control section (lower). The control section allows you to zoom and pan the chart. If, for example, you are viewing several months’ worth of performance data (as above) then analysing specific points in time becomes quite difficult. Dragging the sliders in the control section closer together will allow you to view the data in greater detail.

 

HMA_Basics_Charts_002

Hovering over a data point will display a tool-tip with the details of that point.

 

Line by Day Chart

The Line by Day chart breaks down a standard Line chart and overlays each day. This can be useful in finding time related trends in data sets. For example, looking at a particular SP’s utilization over the past 3 weeks might look something like this:

 

HMA_Basics_Charts_003

Trying to work out time-based trends in this data is visually difficult. However, charting it using a Line by Day chart allows you to analyze the data more intuitively. You could draw the following conclusions:

1. Utilization over the weekend is generally lower.

2. There is consistently a peak in utilization at 5 am.

3. The highest peak during business hours is at 9 am.

4. Utilization of the array begins to increase from around 6 pm.

 

HMA_Basics_Charts_004

 

Bear in mind that charting multiple objects in the one chart will potentially make analysis more difficult. Also, while tool-tips are shown when you hover over a data point, only the time will be correct: the date of every point is converted back to 1st January 2000.
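To make the overlay idea a bit more concrete, here is a minimal Python sketch of how a Line by Day series could be built. This is not HMA's own code; the function name and the (timestamp, value) sample format are assumptions for illustration. Each sample keeps its time of day but is re-dated to a single reference day, which is also why the tool-tips report 1st January 2000.

```python
from collections import defaultdict
from datetime import datetime


def line_by_day(samples):
    """Split (timestamp, value) samples into one series per calendar day.

    Every point keeps its time of day but is re-dated to 1 January 2000,
    so all of the daily series share the same x-axis and can be overlaid.
    (Hypothetical helper, not part of HMA.)
    """
    series = defaultdict(list)
    for ts, value in samples:
        overlay_ts = ts.replace(year=2000, month=1, day=1)
        series[ts.date()].append((overlay_ts, value))
    return dict(series)


# Three SP utilization samples spread across two days.
samples = [
    (datetime(2015, 3, 2, 5, 0), 80.0),
    (datetime(2015, 3, 3, 5, 0), 75.0),
    (datetime(2015, 3, 3, 9, 0), 60.0),
]
for day, points in sorted(line_by_day(samples).items()):
    print(day, points)
```

Both 5 am samples end up at the same x position, which is what makes a recurring peak easy to spot.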

 

Stacked Chart

 

HMA_Basics_Charts_005

A stacked chart aggregates the data points for the selected objects, and shows the total of the attribute charted for a particular time. Only certain types of Attributes can be charted in a Stacked chart (and Pie charts). For example, it doesn’t make a lot of sense to draw a Stacked chart of SP utilization. However, where the unit of measurement is “aggregatable”, like Total Bandwidth (measured in MB/s) or Total Throughput (measured in IOPS), it can be charted this way.

 

If you really want to be able to chart these other unit types using Stacked charts, you can modify the unit’s aggregatable flag in the Configuration tab under the “Attribute Units” section, removing the unit and re-adding it with the aggregatable flag ticked. This should enable this chart type for those attributes.
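As a rough illustration of what “aggregatable” means in practice, the sketch below sums the values of several objects at each point in time, and refuses to do so when the unit is not flagged as aggregatable. The function name, the data layout and the boolean flag are assumptions made for this example; HMA's internals may well differ.

```python
from collections import defaultdict


def stack_series(series_by_object, aggregatable):
    """Total the values of all objects at each timestamp.

    series_by_object maps an object name to a list of (timestamp, value)
    points. The aggregatable argument stands in for the per-unit flag
    described above (hypothetical representation).
    """
    if not aggregatable:
        raise ValueError("this unit is not flagged as aggregatable")
    totals = defaultdict(float)
    for points in series_by_object.values():
        for ts, value in points:
            totals[ts] += value
    return sorted(totals.items())


# Total Bandwidth (MB/s) for two LUNs, sampled at 0 and 60 seconds.
bandwidth = {
    "LUN 1": [(0, 10.0), (60, 20.0)],
    "LUN 2": [(0, 5.0), (60, 2.5)],
}
print(stack_series(bandwidth, aggregatable=True))
```

Adding two bandwidth figures gives a meaningful total, whereas adding two utilization percentages does not, which is why the flag exists.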

 

Pie Chart

 

HMA_Basics_Charts_006

When selecting a Pie Chart, you are unable to select which objects are to be drawn (all objects are included in the chart calculations). Specifying Max Slices will limit the chart to display n-1 objects (where n is the Max Slices value). The objects that are shown will be ordered largest to smallest, and all other objects will be combined under the “Other” slice.
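The Max Slices behaviour can be sketched in a few lines of Python. This is an illustration only: the function name is made up, and the handling of the case where everything already fits within Max Slices (no “Other” slice is added) is my assumption rather than confirmed HMA behaviour.

```python
def pie_slices(totals, max_slices):
    """Return (label, value) slices, largest first, capped at max_slices.

    When there are more objects than max_slices, the largest
    max_slices - 1 objects are kept and the rest are summed into "Other".
    """
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) <= max_slices:
        # Assumption: no "Other" slice is needed when everything fits.
        return ranked
    shown = ranked[:max_slices - 1]
    other = sum(value for _, value in ranked[max_slices - 1:])
    return shown + [("Other", other)]


# Four LUNs charted with Max Slices set to 3.
print(pie_slices({"LUN 1": 5, "LUN 2": 3, "LUN 3": 1, "LUN 4": 1}, 3))
```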

 

Distribution Chart

 

HMA_Basics_Charts_007

A distribution chart allows you to determine the percentage of values that fall within particular bands. The above chart shows that 46% of SP A’s response times are in the 0-3 ms band, whereas 45% of SP B’s response times are in the 6-9 ms band. When drawing a Distribution chart, leaving the Minimum and Maximum blank lets the script determine a range that contains the entire data set. Using the Minimum and Maximum you can change the range that is charted, and changing the Intervals setting will change the number of bands displayed (10 by default).

 

HMA_Basics_Charts_008

Changing the Minimum to 0 and the Maximum to 20 would result in the above chart, breaking the distribution down into smaller bands.
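For anyone who wants to reproduce this kind of banding outside the tool, here is a small Python sketch. The function and parameter names are assumptions, as are two details the text doesn't spell out: values outside the chosen range are simply not counted in any band, and percentages are taken against the full data set.

```python
def distribution(values, minimum=None, maximum=None, intervals=10):
    """Return (band_start, band_end, percent) tuples for equal-width bands.

    Leaving minimum or maximum as None uses the extremes of the data,
    so the bands cover the entire data set.
    """
    if not values:
        raise ValueError("no values to chart")
    low = min(values) if minimum is None else minimum
    high = max(values) if maximum is None else maximum
    if high <= low:
        raise ValueError("maximum must be greater than minimum")
    width = (high - low) / intervals
    counts = [0] * intervals
    for value in values:
        if value < low or value > high:
            continue  # assumption: out-of-range values fall in no band
        # A value equal to the maximum belongs to the last band.
        index = min(int((value - low) / width), intervals - 1)
        counts[index] += 1
    return [
        (low + i * width, low + (i + 1) * width, 100.0 * count / len(values))
        for i, count in enumerate(counts)
    ]


# Response times in ms, charted from 0 to 10 in five bands.
for band in distribution([1, 1, 5, 9], minimum=0, maximum=10, intervals=5):
    print(band)
```

Narrowing the Minimum and Maximum while keeping the same number of intervals gives narrower bands, which is the effect described above.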

Table

HMA_Basics_Charts_009

This option displays a text table of the Attribute and Objects that you have selected. You can select all the data and paste it into your favorite analytic tool.

 

Connectivity Graphic

HMA_Basics_Charts_010

Connectivity charts show the hierarchical connectivity of the array (currently this chart is only available for CLARiiON arrays).

Front End Ports → Storage Processor → LUN → MetaLUN Component → RAID Group / Pool → Private RAID Group → Disk

Modifying the depth field will change the depth to which the chart is drawn. Modifying the Attribute field will change the Attribute that is represented on the chart (Connectivity is a special attribute that weights each link the same). Link size is relative to the average of the selected attribute for the particular object. Where an object does not expose that attribute (e.g. Ports don’t expose Utilization), a thin link is shown.

NOTE: Arrays with more complex configurations and larger numbers of disks don’t necessarily display very clearly in this format.

 

HMA_Basics_Charts_011

 

In Part 4 we will look at the Configuration tab and how to Automate NAR file collection.

EMC – HeatMap Analyzer – Basics – Part 2 – Filtering

This is part 2 of a series where I will go into a little more detail on how you use HeatMap Analyzer.

 

HMA_Basics_Filtering_001

 

Filtering

Being able to filter and order objects can be useful, especially when there are a lot of them and you are trying to find where a problem may be originating.

Expanding the Filter option will display the following window.

 

HMA_Basics_Filtering_002

 

This can be broken into four main sections. Each of these sections is effectively ANDed together when the filter is executed.

Filter by Attribute:  If this section is enabled it allows you to retrieve the top|bottom “n” objects ordered by avg|min|max attribute.

Object Type: For objects that expose a “Type” attribute (currently this is only LUNs and Disks), this option allows you to show only specific Object Types. For example, if you only wanted to view Public RAID Group LUNs and Public Pool LUNs, then selecting these options will filter out all other types of LUNs. NOTE: only object types that are present in the array will be available in the list, and the label that particular Object Types are given may vary depending on the version of Flare code.

Include: This option allows you to include objects based on whether the extended data for that object contains the text entered in this field. This may include Name, Owner, Type, Host (depending on what extended data is available for that object). This can be a comma separated list and is case insensitive. NOTE: leading or trailing spaces will be included in the search.

Exclude: This option allows you to exclude objects based on whether the extended data for that object contains the text entered in this field. This may include Name, Owner, Type, Host (depending on what extended data is available for that object). This can be a comma separated list and is case insensitive. NOTE: leading or trailing spaces will be included in the search.

 

So the following filter selections would display the top 4 Public Pool LUNs by average Queue Length, where the extended information includes SP A and excludes Hypervisor.
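If it helps to think about the filter as code, the sketch below shows one way the four sections could combine. It is not taken from HMA: the object layout (name, type, extended, avg) and the function name are hypothetical, and I have assumed that comma separated Include terms match if any one of them is found, and that the top “n” ordering is applied after the other three sections.

```python
def apply_filter(objects, attribute=None, top_n=None, types=None,
                 include="", exclude=""):
    """Keep objects that pass every enabled section (sections are ANDed)."""

    def terms(text):
        # Case insensitive; split on commas only, so leading or trailing
        # spaces remain part of each search term.
        return [term.lower() for term in text.split(",") if term]

    included, excluded = terms(include), terms(exclude)
    kept = []
    for obj in objects:
        haystack = " | ".join([obj["name"], *obj["extended"].values()]).lower()
        if types and obj["type"] not in types:
            continue
        if included and not any(term in haystack for term in included):
            continue
        if excluded and any(term in haystack for term in excluded):
            continue
        kept.append(obj)
    if attribute and top_n:
        kept.sort(key=lambda obj: obj["avg"][attribute], reverse=True)
        kept = kept[:top_n]
    return kept


luns = [
    {"name": "LUN 10", "type": "Public Pool LUN",
     "extended": {"Owner": "SP A", "Host": "web01"},
     "avg": {"Queue Length": 4.0}},
    {"name": "LUN 11", "type": "Public Pool LUN",
     "extended": {"Owner": "SP A", "Host": "Hypervisor01"},
     "avg": {"Queue Length": 9.0}},
    {"name": "LUN 12", "type": "Public Pool LUN",
     "extended": {"Owner": "SP B", "Host": "db01"},
     "avg": {"Queue Length": 7.0}},
]
top = apply_filter(luns, attribute="Queue Length", top_n=4,
                   types={"Public Pool LUN"}, include="SP A",
                   exclude="Hypervisor")
print([lun["name"] for lun in top])
```

In this made-up data set only LUN 10 survives: LUN 11 is dropped by the Exclude term and LUN 12 by the Include term.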

 

HMA_Basics_Filtering_003

 

Once you have filtered the object list you are able to chart the objects against any of the attributes that are exposed by that object type.

 

In Part 3 we will look at the different chart types and what they may be used for.

EMC – HeatMap Analyzer – Basics – Part 1 – Processing a NAR file

Mat has agreed to do some posts on the basics of using the EMC HMA. In this episode, he’s looking at manually loading NAR files and other cool stuff. Enjoy.

 

This is part 1 of a series where I will go into a little more detail on how you use HeatMap Analyzer. In this article we go through the process of manually loading a NAR file, basic charting and HeatMaps.

At this point you should have the HeatMap Analyzer appliance configured and you may be wondering what you can begin to do with it. I wrote the original HeatMap tool so I could visualize what was occurring over an entire array or parts thereof at any one point in time. HMA is the evolution of that script. It now allows you to look more closely at individual components of the array. While this is something that can be done with the current EMC Analyzer tool set, that tool set is limited by the amount of data that you can load at any one time.

 

Manually Load NAR File

Point your browser at the appliance http://1.1.1.1/HMA.html and go to the Configuration tab.

 

HMA_Basics_Processing_001

 

Expand the “Manual Load NAR Files” section and select the “Choose Files” button. Here you can select one or more NAR files from a single or multiple arrays. Depending on the number of “objects” in the NAR file (disks, LUNs, Pools, Ports, etc), processing of each NAR file may take a reasonable amount of time. As a rough guide, every 100 objects in the NAR file take about 30 seconds to process (assuming the standard 300 data points for each object). Some of my CX4-960 arrays that have 3000-odd objects take 10-15 minutes to process each file. NOTE: if you are processing large NAR files or lots of smaller NAR files then the browser may currently time out waiting for the processing to finish. If this is the case you can monitor the process through the Server Status tab or from the command line of the appliance (see below).
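The rule of thumb above is easy to turn into a quick estimate. The helper below is only a back-of-the-envelope sketch: the function name is made up, and scaling the time linearly with the number of data points per object is my assumption, not something measured on the appliance.

```python
def estimate_processing_seconds(object_count, data_points=300):
    """Estimate NAR processing time: ~30 seconds per 100 objects.

    The 30 second figure assumes the standard 300 data points per object;
    other values are scaled linearly (an assumption).
    """
    return object_count / 100 * 30 * (data_points / 300)


# A CX4-960 with roughly 3000 objects.
seconds = estimate_processing_seconds(3000)
print(f"{seconds / 60:.0f} minutes")
```

For 3000 objects this works out to 900 seconds, or 15 minutes, which sits at the top of the 10-15 minute range quoted above.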

 

HMA_Basics_Processing_002

Note: you should let all processing complete before you begin to draw HeatMaps or charts under the Analyzer tab. If you don’t, you may get some unexpected results due to the non-granular locking that SQLite uses while updating the database.

Monitoring the NAR file load process

If you want to monitor the processes that are working on your NAR file(s) there are some tools available on the Server Status tab.

For basic process monitoring expand the “Process Monitoring” section, and tick the “Auto Update” option. If you are currently processing a NAR file you will see the ProcessNAR.pl script running.

 

HMA_Basics_Processing_003

 

For more detailed monitoring you can view the logs created by each script. On the Server Status tab expand the “Logs Viewer” section, select the log file that you want to monitor and select “Auto Update” and “Auto Scroll”. You may want to combine this with the basic process monitoring to get a better picture of what is occurring.

 

HMA_Basics_Processing_004

Note: it’s worthwhile turning off Auto Update after you have completed using these tools, as leaving it running will place unnecessary load on the appliance.

HeatMaps

To draw HeatMaps for a particular array go to the HeatMaps tab.

 

HMA_Basics_Processing_005

 

Array: If you have multiple arrays you can select which array you want to draw a HeatMap for.

Array Type: This gives a description of the array type (currently only CLARiiON and Data Domain types are supported).

Start/End: This is the Start and End date of the HeatMap. NOTE: Selecting a long duration for the HeatMap may take a while to process, and it is not recommended that you draw a HeatMap that spans more than several days without increasing the Average Interval.

Avg Interval: The number of seconds between each chart point.

Object List: This is a list of objects that you can select attributes from to chart, and may include:

All LUNs / Storage Processors / Disks / RAID Group / Port / Thin Pool / CPU / Asynchronous Mirror / Snap Session.

The objects that are shown will depend on what you have configured on your array (if you don’t have any Asynchronous Mirrors defined then you won’t see this option).

Selecting one of these drop down lists will show a list of attributes that can be charted for this object.

 

HMA_Basics_Processing_006

 

Select an attribute and it will be added to the list of object/attributes that will be drawn in the HeatMap. Select the object / attribute combinations that you want to display in the HeatMap and then press the “Draw HeatMap” button. The HeatMap configuration pane can be hidden by clicking on the arrow in the top left corner.

 

HMA_Basics_Processing_007

 

The way the HeatMap works is by defining a minimum, a maximum and a skew for each object and attribute. These can be modified in the Configuration Tab under the HeatMap Attribute Normalization.
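The exact formula HMA uses isn't described here, so treat the following purely as one plausible reading of “a minimum, a maximum and a skew”: clamp the value to the configured range, scale it to a 0-1 heat level, and raise it to the power of the skew to push the colour scale towards one end. The function name and the power-law use of skew are both assumptions.

```python
def heat_level(value, minimum, maximum, skew=1.0):
    """Map a raw attribute value to a 0.0-1.0 heat level.

    Assumed behaviour, not confirmed HMA logic: values are clamped to
    [minimum, maximum], scaled linearly, then raised to the power skew.
    A skew of 1.0 leaves the linear scale unchanged.
    """
    if maximum <= minimum:
        raise ValueError("maximum must be greater than minimum")
    clamped = min(max(value, minimum), maximum)
    fraction = (clamped - minimum) / (maximum - minimum)
    return fraction ** skew


# Utilization of 25% on a 0-100 scale, with and without a skew of 2.
print(heat_level(25, 0, 100))
print(heat_level(25, 0, 100, skew=2.0))
```

Under this reading, a skew above 1 keeps low values looking cool for longer, while a skew below 1 makes them heat up sooner.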

Analyzing the Data

For more detailed analysis of individual objects in the array go to the Analyzer tab. On this tab you will initially see three panes. The left hand pane allows you to configure and draw charts for various objects, the upper right pane gives global details for the array, and the lower right pane gives you details for the selected object.

 

HMA_Basics_Processing_008

 

Array: If you have multiple arrays you can select which array you want to draw a HeatMap for.

Array Type: This gives a description of the array type (currently only CLARiiON and Data Domain types are supported).

Object Type: The type of object to chart (e.g. Storage Processor, LUN, Disk, etc).

Start/End: This is the Start and End date of the chart.

Filter: Filter options that allow you to include / exclude objects based on set criteria (I’ll go into more detail about this later).

Object: The object(s) that you want to chart.

Attribute: The metric that you want to chart for the object, each object type exposes a different set of attributes.

Chart dependent options:

Line Charts, Stacked Charts and Tables

Avg Interval: The number of seconds to average the data points out to. Leaving this as zero or blank will process the data as is.

Pie Charts

Max Slices: The maximum number of slices to draw in a Pie chart. If there are more than “n-1” objects then extra objects will be grouped under “Other”.

Distribution Charts

Minimum: The minimum number to chart.

Maximum: The maximum number to chart.

Intervals: The number of intervals to chart.
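Of the options above, Avg Interval is probably the one that benefits most from a worked example. The sketch below averages points into fixed-width time buckets, and returns the data untouched when the interval is zero or blank. The function name and the use of epoch seconds for timestamps are assumptions for illustration, not HMA's implementation.

```python
from collections import defaultdict


def average_to_interval(points, avg_interval=0):
    """Average (epoch_seconds, value) points into avg_interval-second buckets.

    An avg_interval of zero or None returns the points as they are.
    """
    if not avg_interval:
        return list(points)
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each point to the start of its bucket.
        buckets[ts - ts % avg_interval].append(value)
    return [(start, sum(vals) / len(vals))
            for start, vals in sorted(buckets.items())]


# Samples every 300 seconds, averaged out to 600-second chart points.
raw = [(0, 10.0), (300, 20.0), (600, 30.0), (900, 50.0)]
print(average_to_interval(raw, 600))
```

Four raw points become two chart points here, which is why a larger Avg Interval makes long date ranges quicker to draw.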

Chart Type:

Line: A standard line chart. Multiple objects can be selected and charted against one another.

Line by Day: A standard line chart where each day is stacked against each other. While you can select multiple objects, it’s not recommended as it makes the chart difficult to interpret.

Stacked: A stacked line chart. This option is available where the attribute being graphed is able to be aggregated (i.e. MB, KB or GB rather than % Utilization). Again, multiple objects can be selected.

Pie: A pie chart, this will draw a pie chart for all of the objects for the selected “Object Type”.

Distribution: A bar chart showing the % of values that fall between certain ranges.

Table: A text table of the data.

Connectivity: A Connectivity / Sankey chart. This shows how each of the objects in the array is connected. It is only available for CLARiiON devices.

Once you have selected the Object Type, Object(s) and Attribute that you want to chart, click the “Graph” button and a chart will be drawn for that selection.

HMA_Basics_Processing_009

 

Depending on the type of chart drawn you may have various controls that allow you to explore the data more closely. Line style charts have the slider control at the bottom, which allows you to focus on a subset of the data. Tables allow you to select all the data so it can be copied. Connectivity charts allow you to control the depth of the chart and the attribute that controls the weight of each of the connections. All chart types except Tables allow you to save the chart as a JPEG.

 

In the next article I will go into more detail regarding Filtering and the different Chart Types.

 

EMC – HeatMap Analyzer v0.1 Now Available

It’s alive. Mat has been coding like crazy and enhancing the HeatMap script and turning it into like, an appliance kind of thing. You can grab it from the Utilities page and it comes in two parts – the core code and third-party scripts package. While the combined package size is small, it saves redistributing stuff that hasn’t changed. In any case, download it, give it a spin and let us know your thoughts. Obviously, it’s still a bit ugly, and still a bit version 0.1, but that’s what you get for free. Tell your friends.