Ever since the VNX2 was announced, customers have asked me about using deduplication with their configs. I did an article on it around the time of the product announcement but have been meaning to talk a bit more about it for some time. But before I do, check out Joel Cason’s great post on this. Anyway, here’s a brief article listing some of the caveats and things to look out for with block deduplication. A few of my clients have used this feature in the field, and have learnt the hard way that if you don’t follow EMC’s guidelines, you may have a sub-optimal experience. Most of the information here has been taken from the “EMC VNX2 Deduplication and Compression” whitepaper, which can be downloaded here.
If you’re running a workload with more than 30% writes, compression and deduplication may be a problem. EMC state that, “[f]or applications requiring consistent and predictable performance, EMC recommends using Thick pool LUNs or Classic LUNs. If Thin LUN performance is not acceptable, then do not use Block Deduplication”. I can’t stress this enough – know your workload!
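If you want a quick sanity check against that 30% figure, something like the following rough sketch would do. The function names and the idea of feeding it raw read and write IO counts are my own assumptions; EMC doesn't publish a tool like this, and the threshold is just the guideline quoted above.

```python
def write_fraction(read_ios, write_ios):
    """Return the fraction of IOs that are writes (0.0 for an idle workload)."""
    total = read_ios + write_ios
    if total == 0:
        return 0.0
    return write_ios / total


def exceeds_write_guideline(read_ios, write_ios, threshold=0.30):
    """True if the workload has more than `threshold` writes.

    The 0.30 default is the "more than 30% writes" guideline; treat it as
    a prompt to look harder at the workload, not a hard limit.
    """
    return write_fraction(read_ios, write_ios) > threshold
```

Feed it counters from whatever performance tooling you already use. A workload sitting right on 30% won't trip the check, because the guideline says "more than".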
Block deduplication is done on a per pool LUN basis. EMC recommends that deduplication be enabled at the time of LUN creation. If you enable it on an existing LUN, the LUN is migrated into the deduplication container using a background process. The data must reside in the container before the deduplication process can run on the dataset.
There is only one deduplication container per storage pool. This is where your deduplicated data is stored. When a deduplication container is created, the SP owning the container needs to be determined. The container owner is matched to the Allocation Owner of the first deduplicated LUN within the pool. As a result of this process, EMC recommends that all LUNs with Block Deduplication enabled within the same pool should be owned by the same SP. This can be a big problem in smaller environments where you’ve only deployed one pool.
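To make the ownership rule a bit more concrete, here's a minimal sketch that flags deduplicated LUNs whose Allocation Owner doesn't match the container owner for their pool. The dictionary keys (name, pool, allocation_owner, dedup) are hypothetical, and I'm assuming the list is ordered by when deduplication was enabled, so the first deduplicated LUN seen in each pool sets the container owner.

```python
from collections import defaultdict


def find_mismatched_dedup_luns(luns):
    """Return {pool: [LUN names]} for dedup LUNs not owned by the container owner.

    `luns` is a list of dicts with hypothetical keys: name, pool,
    allocation_owner ("SPA" or "SPB") and dedup (bool). The list order is
    assumed to be the order in which deduplication was enabled.
    """
    container_owner = {}
    mismatched = defaultdict(list)
    for lun in luns:
        if not lun["dedup"]:
            continue  # non-deduplicated LUNs don't live in the container
        pool = lun["pool"]
        # The first deduplicated LUN in a pool decides the container owner.
        owner = container_owner.setdefault(pool, lun["allocation_owner"])
        if lun["allocation_owner"] != owner:
            mismatched[pool].append(lun["name"])
    return dict(mismatched)
```

An empty result means every deduplicated LUN in each pool lines up with the container owner, which is what EMC recommends.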
There’s a bit more to consider, particularly if you’re looking at leveraging compression as well. But if you can’t get past these first few considerations, the VNX2’s version of deduplication on primary storage is probably not for you. Read the whitepaper – it’s readily accessible and fairly clear about what can and can’t be achieved within the constraints of the product.
EMC World is just around the corner and, as is their wont, EMC are kicking off early with a few cheeky product announcements. I don’t have a lot to say about the VNXe, as I don’t do much in that space, but a lot of people might find this recent announcement of interest. If press releases aren’t your thing, here is a marketing slide you might enjoy instead.
The cool thing about this is that the baby is getting the features of the bigger model, namely the FAST Suite, thin provisioning, file dedupe and MCx. Additionally, a processor speed improvement will help with the overall performance of the device. There’s a demo simulator you can check out here.
EMC also announced a new feature for VNX called D@RE, or Data-At-Rest-Encryption. This should be available as an NDU in Q3 2014. I hope to have more info on that in the future.
Finally, Project Liberty was announced. This is basically EMC’s virtualised VNX, and I’ll have more on that in the near future.
And if half-arsed blog posts aren’t your thing, I urge you to check out Jason Gaudreau’s post covering the same announcement. It’s a lot more coherent and useful.
EMC have also breathlessly announced the introduction of “Multi-Core Everything” or MCx as an Operating Environment replacement for FLARE. I thought I’d spend a little time going through what that means, based on information I’ve been provided by EMC.
MCx is a redesign of a number of components, providing functionality and performance improvements for:
Multi-Core Cache (MCC);
Multi-Core RAID (MCR);
Multi-Core FAST Cache (MCF); and
Active / Active data access.
As I mentioned in my introductory post, the OE has been redesigned to spread the operations across each core – providing linear scaling across cores. Given that FLARE pre-dated multi-core x86 CPUs, it seems like this has been a reasonable thing to implement.
EMC are also suggesting that this has enabled them to scale out within the dual-node architecture, providing increased performance, at scale. This is a response to a number of competitors who have suggested that the way to scale out in the mid-range is to increase the number of nodes. How well this scales in real-world scenarios remains to be seen.
With MCC, the cache engine has been modularized to take advantage of all the cores available in the system. There is also no longer any requirement to manually separate space for Read and Write Cache, meaning no management overhead in ensuring the cache is working in the most effective way regardless of the IO mix. Interestingly, cache is dynamically assigned, allocating space on-the-fly for reads, writes, thin metadata, etc. depending on the needs of the system at the time.
The larger overall working cache provides mutual benefit between reads and writes. Note also that data is not discarded from cache after a write destage (this greatly improves cache hits). The new caching model employs intelligent monitoring of the pace of disk writes to avoid forced flushing and works on a standardized 8KB physical page size.
MCR introduces some much-needed changes to the historically clumsy way in which disks were managed in the CLARiiON / VNX. Of particular note is the handling of hot spares. Instead of having to define disks as permanent hot spares, the sparing function is now more flexible and any unassigned drive in the system can operate as a hot spare.
There are three policies for hot spares:
No hot spares; and
The “Recommended” policy implements the same sparing model as today (1 spare per 30 drives). Note, however, that if there is an unused drive that can be used as a spare (even if you have the “No Hot Spare” policy set) and there is a fault in a RAID group, the system will use that drive. Permanent sparing also means that when a drive has been used in a sparing function, that drive is then a permanent part of the RAID Group or Pool so there is no need to re-balance and copy back the spare drive to a new drive. The cool thing about this is that it reduces the exposure and performance overhead of sparing operations. If the concept of storage pools that spread all over the place didn’t freak out the RAID Group huggers in your storage team, then the idea of spares becoming permanent might just do it. There will be a CLI command available to copy the spare back to the replaced drive, so don’t freak out too much.
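As a back-of-the-envelope illustration of the 1-per-30 ratio, here's a tiny sketch. The function name is mine, and rounding up for a partial group of 30 is my assumption rather than something EMC has spelled out to me.

```python
import math


def recommended_spares(drive_count, drives_per_spare=30):
    """Estimate spares under the 1-spare-per-30-drives ratio.

    Rounding up for a partial group of 30 is an assumption, not EMC's
    documented behaviour.
    """
    if drive_count < 0:
        raise ValueError("drive_count cannot be negative")
    return math.ceil(drive_count / drives_per_spare)
```

So a 120-drive system would want 4 unassigned drives kept aside, and adding a 121st drive would push that to 5 under this rounding assumption.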
What I like the look of, having been stuck “optimising” CLARiiONs previously, is the idea of “Portable drives”, where drives can be physically removed from a RAID group and relocated to another disk shelf. Please note that you can’t use this feature to migrate RAID Groups to other VNX systems, you’ll need to use more traditional migration methods to achieve this.
Multi-Core FAST Cache
According to EMC, MCF is all about increased efficiency and performance. They’ve done this by moving MCC above the FAST Cache driver – all DRAM cache hits are processed without the need to check whether a block resides in FAST Cache, saving this CPU cycle overhead on all IOs. There’s also a faster initial warm-up for FAST Cache, resulting in better initial performance. Finally, instead of requiring 3 hits, if the FAST Cache is not full, it will perform more like a standard extension of cache and load data based on a single hit. Once the Cache is 80% full it reverts to the default 3-hit promotion policy.
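The promotion behaviour described above can be sketched roughly as follows. This is a toy model and not EMC's implementation: the class name, the page-count capacity and the per-block hit counter are all my assumptions, and I've left out eviction entirely.

```python
from collections import Counter


class PromotionSketch:
    """Toy model of hit-based promotion: 1 hit while under 80% full, then 3."""

    def __init__(self, capacity_pages, full_threshold=0.8, default_hits=3):
        self.capacity_pages = capacity_pages
        self.full_threshold = full_threshold
        self.default_hits = default_hits
        self.hits = Counter()   # accesses seen per not-yet-promoted block
        self.promoted = set()   # blocks currently held in the cache

    def required_hits(self):
        """Hits needed for promotion at the current fill level."""
        fill = len(self.promoted) / self.capacity_pages
        return 1 if fill < self.full_threshold else self.default_hits

    def access(self, block):
        """Record an access; return True if the block is (now) in the cache."""
        if block in self.promoted:
            return True
        self.hits[block] += 1
        has_room = len(self.promoted) < self.capacity_pages
        if has_room and self.hits[block] >= self.required_hits():
            self.promoted.add(block)
            del self.hits[block]
            return True
        return False
```

With a 5-page cache, the first four blocks get promoted on a single touch. The fifth needs three hits because the cache has reached 80% full, which is the warm-up behaviour EMC describe.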
Symmetric Active / Active
Real symmetric? Not that busted-arse ALUA? Sounds good to me. With Symmetric Active/Active, EMC are delivering true concurrent access via both SPs (no trespassing required – hooray!). However, this is currently only supported for Classic LUNs. It does have benefits for Pool LUNs by removing the performance impacts of trespassing the allocation owner. Basically, there’s a stripe locking service in play that’s providing access to the LUN for each SP, with the CMI being used for communication between the SPs. Here’s what LUN Parallel Access looks like.
And that’s about it. Make no mistake, MCx sounds like it should really improve performance for EMC’s mid-range. Remember though, given some of the fundamental changes outlined here, it’s unlikely that this will work on previous-generation VNX models.
Otherwise titled, “about time”. I heard about this a few months ago, and have had a number of technical briefings from people inside EMC since then. For a number of reasons EMC weren’t able to publicly announce it until now. Anyway, for the official scoop, head on over to EMC’s Speed to Lead site. In this post I thought I’d cover off on some of the high-level speeds and feeds. I hope to have some time in the near future to dive in a little deeper on some of the more interesting architectural changes.
As far as the hardware goes, EMC have refreshed the VNX midrange line with the following models:
The 5200 is positioned just above the 5100, the 5400 above the 5300, and so forth. The VNX8000 is the biggest yet, and, while initially shipping with 1000 drives, will eventually ship with a 1500-spindle capability. The SPs all use Sandy Bridge chips, with EMC heavily leveraging multi-core. The 8000 will sport dual-socket, 8-core SPs and 128GB of RAM. You’ll also find the sizing of these arrays is based on 2.5″ drives, and that a number of 2.5″ and 3.5″ drives will be available with these models (highlights being 4TB 3.5″ NL-SAS and 1.2TB 2.5″ SAS drives). Here’s a better picture with the max specs for each model.
Multi-core is a big part of the new VNX range, with MCx described as a project that redesigns the core Block OE stack to improve performance, reliability and longevity.
Instead of FLARE using one core per component, MCx is able to evenly distribute workloads across cores, giving improved utilisation across the SP.
The key benefit of this architecture is that you’ll get improved performance with the other tech in the array, namely FAST VP and Deduplication. The multi-core optimisations also extend to RAID management, Cache management and FAST Cache. I hope to be able to do some more detailed posts on these.
FAST-VP has also received some attention, with the chunk size being reduced from 1GB down to 256MB. The big news, however, is the introduction of fixed-block deduplication, with data being segmented in 8KB chunks, deduped at the pool level (not across pools), and turned on or off at the LUN level. I’ll be doing a post shortly on how it all works.
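In the meantime, here's a minimal sketch of what fixed-block deduplication looks like in general terms: carve the data into 8KB chunks, fingerprint each chunk, and store each unique chunk only once per pool. Using SHA-256 as the fingerprint and a plain dict as the per-pool store are my assumptions for illustration; I haven't said anything here about how the VNX2 actually identifies duplicates.

```python
import hashlib

CHUNK_SIZE = 8 * 1024  # 8KB fixed-size chunks, as per the announcement


def dedupe(data, store=None):
    """Split `data` into 8KB chunks and keep one copy of each unique chunk.

    `store` stands in for a single pool's chunk store (deduplication does
    not span pools). Returns (recipe, store), where recipe is the ordered
    list of chunk digests needed to rebuild the data.
    """
    store = {} if store is None else store
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).digest()  # assumed fingerprint choice
        store.setdefault(digest, chunk)          # store only the first copy
        recipe.append(digest)
    return recipe, store


def rebuild(recipe, store):
    """Reassemble the original data from its recipe and the chunk store."""
    return b"".join(store[digest] for digest in recipe)
```

Passing the same store to dedupe() for two LUNs models both LUNs living in one pool, while a separate store per pool reflects the fact that nothing is deduped across pools.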
Hopefully that’s given you a starting point from which to investigate this announcement further. As always, if you’re after more information, talk to your local EMC people.