Welcome to my semi-regular, random news post in a short format. This is #27. You’d think it would be hard to keep naming them after basketball players, and it is. None of my favourite players ever wore 27, but Marvin Barnes did surface as a really interesting story, particularly when it comes to effective communication with colleagues. Happy holidays too, as I’m pretty sure this will be the last one of these posts I do this year. I’ll try and keep it short, as you’ve probably got stuff to do.
This story of serious failure on El Reg had me in stitches.
I really enjoyed this article by Raj Dutt (over at Cohesity’s blog) on recovery predictability. As an industry we talk an awful lot about speeds and feeds and supportability, but sometimes I think we forget about keeping it simple and making sure we can get our stuff back as we expect.
Speaking of data protection, I wrote some articles for Druva about, well, data protection and things of that nature. You can read them here.
There have been some pretty important CBT-related patches released by VMware recently. Anthony has provided a handy summary here.
Everything’s an opinion until people actually do it, but I thought this research on cloud adoption from Leaseweb USA was interesting. I didn’t expect to see everyone putting their hands up and saying they’re all in on public cloud, but I was also hopeful that we, as an industry, hadn’t made things as unclear as they seem to be. Yay, hybrid!
Backblaze has done a nice job of talking about data protection and cloud storage through the lens of Star Wars.
This tip on removing particular formatting in Microsoft Word documents really helped me out recently. Yes I know Word is awful.
Someone was nice enough to give me an acknowledgement for helping review a non-fiction book once. Now I’ve managed to get a character named after me in one of John Birmingham’s epics. You can read it out of context here. And if you’re into supporting good authors on Patreon – then check out JB’s page here. He’s a good egg, and his literary contributions to the world have been fantastic over the years. I don’t say this just because we live in the same city either.
Disclaimer: I was recently a guest at Nimble Storage‘s Predictive Flash Platform announcement. My flights, accommodation and other expenses were paid for by Nimble Storage. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event. Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.
Nimble Storage recently invited me to attend their Predictive Flash Platform launch event (2016.02.23) in San Francisco. You can download a copy of my raw notes here. I’ll be doing a more detailed disclaimer post in the near future [update – you can find that here], and I hope to be diving into some of the tech behind this announcement in further detail in the next little while.
Timeless Storage – Nimble Storage aims to deliver extremely high levels of customer satisfaction.
I’ve waxed lyrical about InfoSight previously, and remain a big fan of the product. My favourite quote comes from Rod Bagg – “InfoSight collects and analyses more sensor data points every four hours than there are stars in the galaxy”. Which is pretty cool stuff. Nimble Storage uses InfoSight to:
Prevent issues and downtime;
Deliver cross-stack root cause analysis of issues; and
Predict future needs and future planning.
Nimble Storage tell me that 9/10 issues are detected by them before customers know about them. They also say that it’s less than 1 minute of hold time before you get to speak to a Level 3 support engineer when there is a problem. I’ve spoken to a few customers over time, and all of them have told me that the customer experience has been nothing but stellar.
Unified Flash Fabric
I had a chance to talk to Dan Leary, VP of Products, Solutions and Alliances, about what Unified Flash Fabric really was. The key element of the solution is that it provides a logical mechanism to tie together up to four All Flash and Adaptive Flash arrays into a single architecture with common data services. The key here is that NimbleOS is common across the platforms, so you can mix and match. This also provides the ability to “Scale-to-fit”, giving the customer flexible and non-disruptive scalability. In terms of scale up, you can add disk as required, and non-disruptively upgrade the controllers to add CPU and memory. You can also scale out with up to four arrays managed as one. In my opinion, the most interesting use case here is data mobility, with Nimble Storage providing the capability to move data from an adaptive system to the all flash system in the same cluster, then remove the adaptive system from the cluster without downtime. Here’s an image from the Nimble Storage website that provides an illustration of how you might want to move your applications about.
If anyone has real-world experience with this data mobility technology (sure, it’s probably a bit early) I’d be happy to buy you a beverage to learn more about how it’s worked for you.
Timeless Storage sounds a little like Pure Storage’s Evergreen Storage approach. Customers seem to be fed up with bleeding cash every few years, so it’s nice to see the likes of Pure Storage and Nimble Storage coming up with these types of new approaches.
Nimble Storage state that the “Timeless Guarantee provides investment protection and upgrade certainty”. The crux of the programme is:
All-inclusive software licensing
Flat support prices in years 4 and 5
Option for new controller after 3 years
Capex or storage-on-demand
Only pay for the storage you use
Scale up or down to meet demand
Yep. Four new models, to be precise. You can view a PDF of the data sheet here. Here’s a photo of Suresh Vasudevan and Varun Mehta unveiling the array at the launch. I love that Varun looks so happy. If you ever get a chance to sit down with him, take the time. He’s wonderful to talk to, super smart and a mad gadget guy in his spare time.
Here’s a picture of what one of the new arrays looks like.
The new arrays have been designed for cost-optimised 3D-NAND through the use of advanced flash endurance management, large-scale coalescing and integrated hot-sparing. You can read more about the Samsung PM863 Series SSDs here. As a result of this approach, Nimble Storage claims that it provides for:
A 7 year SSD lifespan;
Increased performance; and
20% more useable capacity (relative to other systems on the market).
One of the cool things that has been introduced as part of the AF-series array is the new Nimble Storage Dual-Flash Carrier (DFC), with the capacity of each slot doubled to a total of 48 SSDs per array and expansion shelf. Each individual SSD is hot swappable and can be installed or removed from the DFC independently. Nimble Storage has also “qualified five Samsung PM863 Series SSDs, ranging in capacity from 240GB to 4TB” across the AF-Series.
Nimble Storage were careful to use the term “effective capacity” a number of times, with arrays shipping with 503TB of raw storage being positioned as having 2PB of (marketing) capacity. The good news is that Nimble have worked in a number of data reduction features (variable block deduplication, variable block compression, zero pattern elimination) that they say lead to 5x or more data reduction. I spoke to one of their customers, Justin Giardina (CTO of iland), and he confirmed that their beta testing of the platform had yielded some very positive results. As always, there are a tonne of variables that can impact your success with deduplication, so if you’re betting the farm on this, it’s best to be conservative, and talk to your local Nimble Storage folks or partner about what you really need to get the job done.
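If you want to sanity-check an “effective capacity” figure before you size anything, the arithmetic is simple enough to sketch. The function names below are made up for illustration, and the sketch uses the raw figure as a stand-in for usable capacity (real usable capacity will be lower once RAID and sparing overheads come out), so treat the results as rough.

```python
def effective_capacity_tb(capacity_tb: float, reduction_ratio: float) -> float:
    """Effective capacity is the stored capacity multiplied by the data reduction ratio."""
    if capacity_tb < 0 or reduction_ratio <= 0:
        raise ValueError("capacity must be >= 0 and reduction ratio must be > 0")
    return capacity_tb * reduction_ratio


def implied_reduction_ratio(capacity_tb: float, effective_tb: float) -> float:
    """Work backwards from a marketing figure to the reduction ratio it assumes."""
    if capacity_tb <= 0:
        raise ValueError("capacity must be > 0")
    return effective_tb / capacity_tb


RAW_TB = 503          # raw figure quoted at the launch
MARKETING_TB = 2000   # 2PB, assuming 1PB = 1000TB

# Ratio the 2PB figure implies when measured against raw capacity.
print(round(implied_reduction_ratio(RAW_TB, MARKETING_TB), 2))

# A more conservative planning number, assuming only 3x reduction (an arbitrary example).
print(effective_capacity_tb(RAW_TB, 3))
```

Plugging in your own measured reduction ratio from a proof of concept, rather than a vendor average, is the point of the exercise.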
Nimble Storage have been pretty focussed on “non-stop availability”, and have introduced a couple of new features to support this goal:
Triple+ parity RAID – tolerates three simultaneous drive failures plus intra-drive protection and integrated sparing
Integrated data protection – SmartSnap and SmartReplicate
SmartSecure Encryption – application-granular encryption and secure data shredding
Further Reading and Final Thoughts
You can read Vipin’s thoughts on the announcement here, while Stephen has a comprehensive write-up here, and El Reg covered it here. You can read a good blog post by Suresh that summarises it all nicely here. A few press releases have been made available as well, and you can check them out here, here, and here. A common reaction to the news of Nimble Storage’s announcement has been “well, it’s about time”. There’s a lot of noise in the AFA market, which is why I think that software like InfoSight makes the Nimble Storage solution a lot more interesting. If you’re in the market for an AFA (even if you don’t need it), I recommend having a chat to Nimble Storage.
Disclaimer: I recently attended Storage Field Day 8. My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event. Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.
NexGen Storage presented recently at SFD8 and while the video footage above covered off on some flash basics, when the camera stopped they talked to us about their latest array offering – the N5.
NexGen claim that the N5 is the first multi-tier AFA with Storage Quality-of-Service (QoS). They were recently awarded US Patent 9,176,708 for Storage QoS, so they might be on to something. NexGen say that the array can prioritise performance-hungry workloads for more predictable performance while providing enhanced flash management for both performance and endurance.
Putting a bunch of different types of flash, including PCIe flash (NVMe “ready”), SSDs and RAM in a box is pointless if you can’t do something sensible with it. This is where NexGen’s Dynamic QoS comes into play. As you’re probably aware, QoS is about assigning priorities and performance targets to workloads, and enforcing them with automated throttling. The idea is you want things to work a certain way without necessarily having everything suffer because of a noisy neighbour or IOPS hungry VM. The array comes with both preconfigured policies and the ability to manage performance minimums.
The QoS priorities offer the following features:
Adaptive BW throttling;
Adaptive queueing placement;
Real-time, always on; and
Prioritised active caching.
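To make the idea of priorities plus performance minimums a bit more concrete, here’s a minimal sketch of one way a QoS allocator could behave. This is a generic illustration and not NexGen’s patented implementation: the Workload fields, the weights and the allocate function are all hypothetical. It guarantees each workload its minimum first, then shares whatever is left in proportion to priority weight, never granting more than a workload asks for.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    min_iops: int     # guaranteed performance minimum
    weight: int       # higher weight = bigger share of spare capacity (must be > 0)
    demand_iops: int  # what the workload is currently asking for


def allocate(workloads: list[Workload], capacity_iops: int) -> dict[str, int]:
    """Grant minimums first, then split spare IOPS by weight, capped at demand."""
    alloc = {w.name: min(w.min_iops, w.demand_iops) for w in workloads}
    if sum(alloc.values()) > capacity_iops:
        raise ValueError("minimums exceed array capacity")

    spare = capacity_iops - sum(alloc.values())
    active = [w for w in workloads if alloc[w.name] < w.demand_iops]
    while spare > 0 and active:
        total_weight = sum(w.weight for w in active)
        distributed = 0
        for w in active:
            share = spare * w.weight // total_weight
            grant = min(share, w.demand_iops - alloc[w.name])
            alloc[w.name] += grant
            distributed += grant
        if distributed == 0:  # remainder too small to split any further
            break
        spare -= distributed
        active = [w for w in active if alloc[w.name] < w.demand_iops]
    return alloc


workloads = [
    Workload("mission-critical", min_iops=5000, weight=3, demand_iops=20000),
    Workload("business-critical", min_iops=2000, weight=2, demand_iops=8000),
    Workload("noisy-neighbour", min_iops=0, weight=1, demand_iops=50000),
]
for name, iops in allocate(workloads, capacity_iops=30000).items():
    print(f"{name}: {iops} IOPS")
```

The noisy neighbour still gets serviced, but only from what’s left after the higher-priority workloads have taken their share, which is the throttling behaviour described above.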
Closing Thoughts and Further Reading
NexGen Storage have been around a while and certainly have some good pedigree and experience behind them. I’m interested to see how these new arrays perform in the real world, because they certainly look the goods on paper. It was only a matter of time before someone took a “hybrid” approach to AFAs and gave some configuration options back to the end user. I can’t comment on how effective the implementation is, but I think it’s worthy of further investigation. Finally, you can read more about the new product at Storage Review.
I’ve been doing some design work for a few customers and thought I’d put together a brief post on some considerations when deploying XtremIO. I don’t want to go into the pros and cons of the product, nor am I really interested in discussing better / worse alternatives. Let’s just assume you’ve made the decision to go down that track. So what do you need to know before it lobs up in your data centre? As always, I recommend checking EMC’s support site as they have some excellent site planning and installation documentation. There’s also a pretty good introductory whitepaper here.
The core hardware in the XtremIO solution is the X-Brick. I’ve included a glamour shot below from EMC’s website for reference. Each X-Brick is comprised of:
One 2U DAE, containing 25 eMLC SSDs (400GB, 800GB or 1600GB SSD options);
Two redundant power supply units (PSUs);
Two redundant SAS interconnect modules;
Two BBUs (per cluster); and
Two 1RU Storage Controllers (redundant storage processors).
A single X-Brick configuration has the controllers directly connected via InfiniBand, whilst other configurations require 2 IB switches. X-Brick clusters can be deployed in combinations of 1, 2, 4, 6 or 8 X-Bricks.
The latest release (4.0) has been generally available since 30 June 2015, and now supports:
“Generation 3” hardware (the 40TB X-Bricks);
Larger clusters (up to 8 20TB X-Bricks in a cluster); and
You can optionally deploy the XMS (XtremIO Management Server) on a VM rather than physically. There are a few things you need to be mindful of if you go down this route.
The virtual XMS VM should have the following configuration:
2 vCPUs; and
The virtual XMS VM should have a single 900GB disk (thin provisioned). Note that 200GB of disk capacity is pre-allocated following the cluster initialization. This should be provisioned on RAID-protected storage. Note that any shared storage used should not originate from an XtremIO cluster.
The virtual XMS should be located in the same LAN as the XtremIO cluster.
The deployed virtual XMS has its memory Shares resource allocation set to High. As such, the virtual XMS is given high priority on memory allocation when required. If you’re using a non-standard memory shares allocation, this should be adjusted post-deployment.
In The Data Centre
The following table shows the required rack space depending on the number of X-Bricks in the cluster.
Power and Cabling
From a cabling perspective, your friendly EMC installation person will take care of that. There’s very good guidance on the EMC support site, depending on your access level. Keep in mind that you’ll want your PDUs in the rack to come via diverse circuits to ensure a level of resiliency.
In terms of power consumption, the table below provides guidance on maximum power usage depending on the number of X-Bricks you deploy.
From a connectivity perspective, you’ll need to account for both FC and IP resources. Each controller has two FC front-end ports and two iSCSI ports that you can present for block storage access. You’ll also need an IP address for each controller (so two per X-Brick), along with at least one for the XMS. For monitoring, the latest version of the platform supports EMC’s Secure Remote Services (ESRS), so you can incorporate it into your existing solution if required.
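The numbers above lend themselves to a quick planning helper. The sketch below is only an illustration built from the figures in this post (two controllers per X-Brick, two FC and two iSCSI ports per controller, one IP per controller plus at least one for the XMS, and two IB switches for anything bigger than a single X-Brick). The function and key names are hypothetical, and it doesn’t count any addresses you’ll need for the iSCSI data ports themselves.

```python
VALID_CLUSTER_SIZES = (1, 2, 4, 6, 8)
CONTROLLERS_PER_XBRICK = 2
FC_PORTS_PER_CONTROLLER = 2
ISCSI_PORTS_PER_CONTROLLER = 2


def connectivity_plan(x_bricks: int, xms_addresses: int = 1) -> dict[str, int]:
    """Summarise the ports and IP addresses to plan for, per the figures in this post."""
    if x_bricks not in VALID_CLUSTER_SIZES:
        raise ValueError(f"cluster size must be one of {VALID_CLUSTER_SIZES}")
    controllers = x_bricks * CONTROLLERS_PER_XBRICK
    return {
        "controllers": controllers,
        "fc_ports": controllers * FC_PORTS_PER_CONTROLLER,
        "iscsi_ports": controllers * ISCSI_PORTS_PER_CONTROLLER,
        # One IP per controller plus the XMS; iSCSI data-port addresses not included.
        "controller_and_xms_ips": controllers + xms_addresses,
        # A single X-Brick connects its controllers directly over InfiniBand.
        "infiniband_switches": 0 if x_bricks == 1 else 2,
    }


for size in VALID_CLUSTER_SIZES:
    print(size, connectivity_plan(size))
```

Check the results against EMC’s site planning documentation before you order switch ports, as the official guides are the authority here.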
Should you decide to go down the XtremIO track, there are a few things to look out for, primarily around planning your data centre space. It’s a nice change that you don’t have to get too bogged down in details about the actual configuration of the storage itself. But ensuring that you’ve planned for suitable space, power and management will make things even easier.
Disclaimer: I recently attended Storage Field Day 7. My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event. Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.
For each of the presentations I attended at SFD7, there are a few things I want to include in the post. Firstly, you can see video footage of the Kaminario presentation here. You can also download my raw notes from the presentation here. Finally, here’s a link to the Kaminario website that covers some of what they presented.
Dani Golan, CEO of Kaminario, gave us a quick overview of the company. They’ve recently launched the 5th generation of their all-flash array (AFA), with the majority (80%) of customers being in the midrange (revenue of $100M – $5B).
The entry level for the solution is 20TB, with the average capacity being between 50 and 150TB. The largest implementation runs to 1.5PB.
Use cases are primarily:
VDI / Virtualisation;
Kaminario state that they’re balanced across all verticals and offer general purpose storage.
Kaminario state that architecture is key. I think we’re all agreed on that point. Kaminario’s design goals are to:
scale easily and cost-efficiently; and
provide the lowest overhead on the storage system to fulfil the customer’s needs.
Kaminario want to offer capacity, performance and flexibility. They do this by offering scale up and scale out.
Customers want somewhere in between best $/capacity and best $/performance.
The K2 basic building block (K-blocks, not 2K blocks) is:
Off the shelf hardware;
2x K-nodes (1U server);
SSD Shelf (24 SSDs – 2RU); and
SSD expansion shelf (24 SSDs – 2RU).
Here’s a diagram of the K2 scale up model.
And here’s what it looks like when you scale out.
I want to do both! Sure, here’s what scale up and out looks like.
In the K2 scale-out architecture:
Data is spread across all nodes;
Metadata is spread across all nodes;
Provides the ability to mix and match different generations of servers and SSDs;
Offers global deduplication; and
Provides resiliency for multiple simultaneous failures.
Data is protected against block (nodes and storage) failure, but the system will go down to secure the data.
As for metadata scalability, modern data reduction means fine grain metadata:
Pointer per 4KB of addressable capacity; and
Signature per 4KB of unique data.
According to Kaminario, reducing the metadata footprint is crucial.
The adaptive block size architecture means fewer pointers;
Deduplication with weak hash reduces signature footprint; and
Density per node is critical.
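To get a feel for why the metadata footprint matters, here’s a rough back-of-the-envelope sketch. It simply counts one pointer per block of addressable capacity and one signature per block of unique data, as described above. The function name, the example capacities and the 16KiB average block size are assumptions for illustration, not Kaminario figures.

```python
KIB = 1024
TIB = 1024 ** 4


def metadata_entries(addressable_bytes: int, unique_bytes: int,
                     block_size: int = 4 * KIB) -> tuple[int, int]:
    """Return (pointers, signatures) for a given block granularity."""
    pointers = addressable_bytes // block_size
    signatures = unique_bytes // block_size
    return pointers, signatures


# Hypothetical system: 100TiB addressable, of which 50TiB is unique after dedupe.
fixed = metadata_entries(100 * TIB, 50 * TIB)
print(f"4KiB blocks: {fixed[0]:,} pointers, {fixed[1]:,} signatures")

# If adaptive block sizing averaged 16KiB per pointer (an assumed figure),
# the pointer count would drop to a quarter.
adaptive = metadata_entries(100 * TIB, 50 * TIB, block_size=16 * KIB)
print(f"16KiB average: {adaptive[0]:,} pointers")
```

Tens of billions of entries at 4KB granularity is the reason larger adaptive blocks and smaller signatures are worth the engineering effort.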
K-RAID is Kaminario’s interpretation of RAID 6, and works thusly:
2P + Q – 2 R5 groups, single Q parity on them;
Fully rotating, RAID is fully balanced;
Fully automatic, no manual configuration; and
High utilisation (87.5%), no dedicated spares.
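The 87.5% utilisation figure can be reproduced with some simple arithmetic, assuming the K-RAID layout spans one 24-SSD shelf and gives up the equivalent of three drives to parity (one P for each of the two RAID 5 groups, plus the single Q). That mapping is my reading of the description above rather than something Kaminario spelled out, and the function name is made up for illustration.

```python
from fractions import Fraction


def raid_utilisation(total_drives: int, p_parities: int = 2, q_parities: int = 1) -> Fraction:
    """Fraction of capacity left for data after subtracting parity equivalents."""
    parity = p_parities + q_parities
    if total_drives <= parity:
        raise ValueError("need more drives than parity equivalents")
    return Fraction(total_drives - parity, total_drives)


util = raid_utilisation(24)
print(util, float(util))  # 7/8 0.875
```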
The K2 architecture also offers the following data reduction technologies:
Global and adaptive;
Selective – can be turned off per volume; and
Weak hash and compare – low MD and CPU footprint, fits well with flash.
Adaptive block size – large chunks are stored contiguously, each 4k compressed separately;
Standard LZ4 algorithm; and
Optimized zero elimination.
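The “weak hash and compare” approach is worth a quick illustration. The general idea is to use a small, cheap hash to find candidate duplicates, then do a byte-for-byte comparison to confirm, so hash collisions can never cause two different blocks to be treated as the same. The sketch below uses CRC32 from Python’s standard library purely as a stand-in weak hash. Kaminario didn’t say which hash they use, and the DedupStore class is hypothetical.

```python
import zlib

BLOCK_SIZE = 4096


class DedupStore:
    """Toy block store that deduplicates using a weak hash plus a full compare."""

    def __init__(self) -> None:
        self._blocks: list[bytes] = []           # block id -> stored data
        self._by_hash: dict[int, list[int]] = {}  # weak hash -> candidate block ids

    def write(self, block: bytes) -> int:
        """Store a block and return its id, reusing an existing id for duplicates."""
        weak = zlib.crc32(block)  # small signature, cheap to compute
        for block_id in self._by_hash.get(weak, []):
            # A weak hash can collide, so confirm with a byte-for-byte compare.
            if self._blocks[block_id] == block:
                return block_id
        self._blocks.append(block)
        new_id = len(self._blocks) - 1
        self._by_hash.setdefault(weak, []).append(new_id)
        return new_id

    @property
    def unique_blocks(self) -> int:
        return len(self._blocks)


store = DedupStore()
a = b"A" * BLOCK_SIZE
b = b"B" * BLOCK_SIZE
ids = [store.write(a), store.write(b), store.write(a)]
print(ids, store.unique_blocks)  # [0, 1, 0] 2
```

The trade-off is an extra read to do the compare, which is cheap on flash, in exchange for a much smaller signature per block than a cryptographic hash would need.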
From a resiliency perspective, K2 supports:
Two concurrent SSD failures per shelf;
Consistent, predictable and high performance under failure; and
Fast SSD firmware upgrades.
The architecture currently scales to 8 K-Blocks, with the sweet spot being around 2 – 4 K-Blocks. I strongly recommend you check out the Kaminario architecture white paper – it’s actually very informative.
Final Thoughts and Further Reading
I first came across Kaminario at VMworld last year, and I liked what they had to say. Their presentation at SFD7 backs that up for me, along with the reading I’ve done and the conversations I’ve had with people from the company. I like the approach, but I think they have a bit of an uphill battle to crack what seems to be a fairly congested AFA market. With a little bit more marketing, they might yet get there. Yes, I said more marketing. While we all like to criticise the marketing of products by IT vendors, I think it’s still a fairly critical piece of the overall solution puzzle, particularly when it comes to getting in front of customers who want to spend money. But that’s just my view. In any case, Enrico did a great write-up on Kaminario – you can read it here. I also recommend checking out Keith’s preview blog of Kaminario.