Welcome to Random Short Take #87. Happy Fête Nationale du 14 juillet to those who celebrate. Let’s get random.
I always enjoy it when tech vendors give you a little peek behind the curtain, and Dropbox excels at this. Here is a great article on how Dropbox selects data centre sites. Not every company is operating at the scale that Dropbox is, but these kinds of articles provide useful insights nonetheless. Even if you just skip to the end, it’s worth following this process when making technology choices:
Identify what you need early.
Understand what’s being offered.
Validate the technical details.
Physically verify each proposal.
Negotiate.
I haven’t used NetWorker for a while, but if you do, this article from Preston on what’s new in NetWorker 19.9 should be of use to you.
In VMware Cloud on AWS news, vCenter Federation for VMware Cloud on AWS is now live. You can read all about it here.
Familiar with Write Once, Read Many (WORM) storage? This article from the good folks at Datadobi on WORM retention made for some interesting reading. In short, keeping everything forever is really a data management strategy, and it could cost you.
Speaking of data management, check out this article from Chin-Fah on data management and ransomware – it’s an alternative view very much worth considering.
Mellor wrote an article on Pixar and VAST Data’s collaboration. And he did one on DreamWorks and NetApp for good measure. I’m fascinated by media creation in general, and it’s always interesting to see what the big shops are using as part of their infrastructure toolkit.
JB put out a short piece highlighting some AI-related content shenanigans over at Gizmodo. The best part was the quoted reactions from staff – “16 thumbs down emoji, 11 wastebasket emoji, six clown emoji, two face palm emoji and two poop emoji.”
Finally, the recent Royal Commission into the “Robodebt” program concluded and released a report outlining just how bad it really was. You can read Simon’s coverage over at El Reg. It’s these kinds of things that make you want to shake people when they come up with ideas that are destined to cause pain.
Verity ES recently announced its official company launch and the commercial availability of its Verity ES data eradication enterprise software solution. I had the opportunity to speak to Kevin Enders about the announcement and thought I’d briefly share some thoughts here.
From Revert to Re-birth?
Revert, a sister company of Verity ES, is an on-site data eradication service provider. It’s also a partner for a number of Storage OEMs.
The Problem
The folks at Revert have had an awful lot of experience with data eradication in big enterprise environments. With that experience, they’d observed a few challenges, namely:
The software doing the data eradication was too slow;
Eradicating data in enterprise environments introduced particular requirements at high volumes; and
Larger capacity HDDs and SSDs were a real problem to deal with.
The Real Problem?
Okay, so the process to get rid of old data on storage and compute devices is a bit of a problem. But what’s the real problem? Organisations need to get rid of end of life data – particularly from a legal standpoint – in a more efficient way. Just as data growth continues to explode, so too does the requirement to delete the old data.
The Solution
Verity ES was spawned to develop software to solve a number of the challenges Revert were coming across in the field. Broadly, there are two ways to deal with end-of-life data:
Eliminate the data destructively (via device shredding / degaussing); or
Eradicate the data in software, leaving the device intact and re-usable.
Why eradicate? It’s a sustainable approach, enables residual value recovery, and allows for asset re-use. But it nonetheless needs to be secure, economical, and operationally simple to do. How does Verity ES address these requirements? It has Product Assurance Certification from ADISA. It’s also developed software that’s more efficient, particularly when it comes to those troublesome high-capacity drives.
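The performance angle makes more sense when you think about what even the simplest eradication pass involves. The sketch below is purely illustrative – it is not how Verity ES works under the hood, and the target path is a hypothetical scratch file – but a single overwrite-and-verify pass already means touching every byte twice, which is why multi-terabyte HDDs and SSDs become such a time sink.

```python
# Illustrative only: a naive single-pass overwrite-and-verify of a scratch file.
# Real eradication software does far more: multiple patterns, SSD-specific
# secure-erase commands, hidden-area handling, and certified audit reporting.
import os

CHUNK = 4 * 1024 * 1024                  # work in 4MiB chunks to keep memory use flat
TARGET = "/tmp/scratch-to-erase.img"     # hypothetical path to an existing scratch file

def overwrite_and_verify(path: str, pattern: bytes = b"\x00") -> bool:
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        written = 0
        while written < size:            # pass 1: overwrite every byte with the pattern
            block = pattern * min(CHUNK, size - written)
            f.write(block)
            written += len(block)
        f.flush()
        os.fsync(f.fileno())             # make sure the overwrite actually hits the media
        f.seek(0)
        while True:                      # pass 2: read back and confirm only the pattern remains
            block = f.read(CHUNK)
            if not block:
                break
            if block != pattern * len(block):
                return False
    return True

if __name__ == "__main__":
    print("eradicated" if overwrite_and_verify(TARGET) else "verification failed")
```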
[image courtesy of Verity ES]
Who’s Buying?
Who’s this product aimed at? Primarily enterprise DC operators, hyperscalers, IT asset disposal companies, and 3rd-party hardware maintenance providers.
Thoughts
If you’ve spent any time on my blog you’ll know that I write a whole lot about data protection, and this is probably one of the first times that I’ve written about data destruction as a product. But it’s an interesting problem that many organisations are facing now. There is a tonne of data being generated every day, and some of that data needs to go, either because it’s sitting on equipment that’s old and due to be retired, or because there’s a legislative requirement to get rid of it.
The way we tackle this problem has changed over time too. One of the most popular articles on this blog was about making an EMC CLARiiON CX700 useful again after EMC did a certified erasure on the array. There was no data to be found on the array, but it was able to be repurposed as lab equipment, and enjoyed a few more months of usefulness. In the current climate, we’re all looking at doing more sensible things with our old disk drives, rather than simply putting a bullet in them (except for the Feds – but they’re a bit odd). Doing this at scale can be challenging, so it’s interesting to see Verity ES step up to the plate with a solution that promises to help with some of these challenges. It takes time to wipe drives, particularly when you need to do it securely.
I should be clear that this product doesn’t go out and identify what data needs to be erased – you have to do that through some other tools. So it won’t tell you that a bunch of PII is buried in a home directory somewhere, or sitting in a spot it shouldn’t be. It also won’t go out and dig through your data protection data and tell you what needs to go. Hopefully, though, you’ve got tools that can handle that problem for you. What this solution does seem to do is provide organisations with options when it comes to cost-effective, efficient data eradication. And that’s something that’s going to become crucial as we continue to generate data, need to delete old data, and do so on larger and larger disk drives.
Datadobi recently announced StorageMAP – a “solution that provides a single pane of glass for organizations to manage unstructured data across their complete data storage estate”. I recently had the opportunity to speak with Carl D’Halluin about the announcement, and thought I’d share some thoughts here.
The Problem
So what’s the problem enterprises are trying to solve? They have data all over the place, and it’s no longer a simple activity to work out what’s useful and what isn’t. Consider the data on a typical file / object server inside BigCompanyX.
[image courtesy of Datadobi]
As you can see, there are all kinds of data lurking about the place, including data you don’t want to have on your server (e.g. Barry’s slightly shonky home videos), and data you don’t need any more (the stuff you can move down to a cheaper tier, or even archive for good).
What’s The Fix?
So how do you fix this problem? Traditionally, you’ll try and scan the data to understand things like capacity, categories of data, age, and so forth. You’ll then make some decisions about the data based on that information and take actions such as relocating, deleting, or migrating it. Sounds great, but it’s frequently tough to make decisions about business data without understanding the business drivers behind it.
[image courtesy of Datadobi]
What’s The Real Fix?
The real fix, according to Datadobi, is to add a bit more automation and smarts to the process, and this relies heavily on accurate tagging of the data you’re storing. D’Halluin pointed out to me that they don’t suggest you create complex tags for individual files, as you could be there for years trying to sort that out. Rather, you add tags to shares or directories, and let the StorageMAP engine make recommendations and move stuff around for you.
[image courtesy of Datadobi]
Tags can represent business ownership, the role of the data, any action to be taken, or other designations, and they’re user definable.
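To make the share-level approach a little more concrete, here’s a rough sketch of tags driving recommendations. It’s purely illustrative – the tag names, rules, and logic are mine, not Datadobi’s data model or API – but it shows why tagging shares or directories scales better than trying to classify individual files.

```python
# Hypothetical illustration of share-level tags driving recommendations.
# None of this reflects StorageMAP's actual data model or API.
from dataclasses import dataclass

@dataclass
class Share:
    path: str
    tags: dict             # e.g. {"owner": "Marketing", "role": "active-project"}
    days_since_access: int

def recommend(share: Share) -> str:
    """Return a recommended action based on share-level tags and age, not per-file inspection."""
    if share.tags.get("action") == "hold":
        return "retain (legal hold)"
    if share.tags.get("role") == "scratch" and share.days_since_access > 30:
        return "delete"
    if share.days_since_access > 365:
        return "relocate to archive tier"
    return "leave in place"

shares = [
    Share("/corp/marketing/video", {"owner": "Marketing", "role": "active-project"}, 12),
    Share("/corp/eng/tmp", {"owner": "Engineering", "role": "scratch"}, 90),
    Share("/corp/finance/fy2015", {"owner": "Finance", "action": "hold"}, 2200),
]

for s in shares:
    print(f"{s.path}: {recommend(s)}")
```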
[image courtesy of Datadobi]
How Does This Fix It?
You’ll notice that the process above looks awfully similar to the one before – so how does this fix anything? The key, in my opinion at least, is that StorageMAP takes away the requirement for intervention from the end user. Instead of going through some process every quarter to “clean up the server”, you’ve got a process in place to do the work for you. As a result, you’ll hopefully see improved cost control, better storage efficiency across your estate, and a little bit more value from your data.
Thoughts
Tools that take care of everything for you have always had massive appeal in the market, particularly as organisations continue to struggle with data storage at any kind of scale. Gone are the days when your admins had an idea where everything on a 9GB volume was stored, or why it was stored there. We now have data stored all over the place (both officially and unofficially), and it’s becoming impossible to keep track of it all.
The key thing to consider with these kinds of solutions is that you need to put in the work to tag your data correctly in the first place. So there needs to be some thought put into what your data looks like in terms of business value. Remember that mp4 video files might not be warranted in the Accounting department, but your friends in Marketing will be underwhelmed if you create some kind of rule to automatically zap mp4s. The other thing to consider is that you need to put some faith in the system. This kind of solution will be useless if folks insist on not deleting anything, or not “believing” the output of the analytics and reporting. I used to work with customers who didn’t want to trust a vendor’s automated block storage tiering because “what does it know about my workloads?”. Indeed. The success of these kinds of intelligence and automation tools relies to a certain extent on folks moving away from faith-based computing as an operating model.
But enough ranting from me. I’ve covered Datadobi a bit over the last few years, and it makes sense that all of these announcements have finally led to the StorageMAP product. These guys know data, and how to move it.
USB-C? Thunderbolt? Whatever it’s called, getting stuff to connect properly to your shiny computers with very few useful ports built-in can be a problem. This article had me giggling and crying at the same time.
Backblaze has come through with the goods again, with this article titled “How to Talk to Your Family About Backups”. I talk to my family all the time about backups (and recovery), and it drives them nuts.
I loved this article from Preston on death in the digital age. It’s a thorough exploration not only of what happens to your data when you shuffle off, but also some of the challenges associated with keeping your digital footprint around after you go.
I’ve had the good fortune of having Sandeep on some calls with my customers, and he knows a thing or two about things that go ping. His latest blog post on VMware Cloud on AWS Advanced Networking and Routing features made for some interesting reading.
Finally, if you’re looking at all-flash as an option for your backup infrastructure, it’s worth checking out Chris’s article here. The performance benefits (particularly with recovery) are hard to argue with, but at scale the economics may still be problematic.
Whenever I read articles about home Internet connectivity, I generally chuckle in Australian and move on. But this article from Jeff Geerling on his experience with Starlink makes for interesting reading, if only for the somewhat salty comments people felt the need to leave after the article was published. He nonetheless brings up some great points about challenges with the service, and I think the endless fawning over Musk as some kind of tech saviour needs to stop.
In the “just because you can, doesn’t mean you should” category is this article from William Lam, outlining how to create a VMFS datastore on a USB device. It’s unsupported, but it strikes me that this is just the kind of crazy thing that might be useful to folks trying to move around VMs at the edge.
Karen Lopez is a really smart person, and this article over at Gestalt IT is more than just the “data is the new oil” schtick we’ve been hearing for the past few years.
Kyndryl and Pure Storage have announced a global alliance. You can read more on that here.
Mike Preston wrote a brief explainer on S3 Object Lock here. I really enjoy Mike’s articles, as I find he has a knack for breaking down complex topics into very simple to digest and consume pieces.
Remember when the movies and TV shows you watched had consistent aspect ratios? This article from Tom Andry talks about how that’s changed quite a bit in the last few years.
I’m still pretty fresh in my role, but in the future I hope to be sharing more news and articles about VMware Cloud on AWS. In the meantime, check out this article from Greg Vinton, where he covers some of his favourite parts of what’s new in the platform.
In unrelated news, this is the last week to vote for the #ITBlogAwards. You can cast your vote here.
Welcome to Random Short Take #51. A few players have worn 51 in the NBA including Lawrence Funderburke (I remember the Ohio State team wearing grey Nikes on TV and thinking that was a really cool sneaker colour – something I haven’t been able to shake over 25 years later). My pick is Boban Marjanović though. Let’s get random.
Folks don’t seem to spend much time making sure the fundamentals are sound, particularly when it comes to security. This article from Jess provides a handy list of things you should be thinking about, and doing, when it comes to securing your information systems. As she points out, it’s just a starting point, but I think it should be seen as a bare minimum / entry level set of requirements that you could wrap around most environments out in the wild.
Could there be a new version of AIX on the horizon? Do I care? Not really. But I do sometimes yearn for the “simpler” times I spent working on a myriad of proprietary open systems, particularly when it came to storage array support.
StorCentric recently announced Nexsan Assureon Cloud Edition. You can read the press release here.
Speaking of press releases, Zerto continues to grow its portfolio of cloud protection technology. You can read more on that here.
Spectro Cloud has been busy recently, and announced support for the management of existing Kubernetes deployments. The news on that can be found here.
Are you a data hoarder? I am. This article won’t help you quit data, but it will help you understand some of the things you can do to protect your data.
So you’ve found yourself with a publicly facing vCenter? Check out this VMware security advisory, and get patching ASAP. vCenter isn’t the only thing you need to be patching either, but hopefully you knew that already.
John Birmingham is one of my favourite writers. Not just for his novels with lots of things going bang, but also for his blog posts about food. And things of that nature.
Disclaimer: I recently attended Storage Field Day 21. My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event. Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.
David Flynn kicked off the presentation from Hammerspace talking about storageless data. Storageless data? What on earth is that, then? Ultimately your data has to live on storage. But this is all about consumption-side abstraction. Hammerspace doesn’t want you to care about how your application maps to servers, or how it maps to storage. It’s more of a data-focussed approach to storage than we’re used to, perhaps. Some of the key requirements of the solution are as follows:
The agent needs to run on everything – virtual, physical, containers – it can’t be bound to specific hardware
Needs to be multi-vendor and support multi-protocol
Presumes metadata
Make data into a routed resource
Deliver objective-based orchestration
The trick is that you have to be able to do all of this without killing the benefits of the infrastructure (performance, reliability, cost, and management). Simple, huh?
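Objective-based orchestration is easier to picture with a toy example. The sketch below is purely conceptual – it is not Hammerspace’s interface or policy language, and the objectives and site names are invented – but it shows the idea of declaring what you want for a dataset and letting an engine work out where copies should live.

```python
# Conceptual sketch of objective-based data placement. This is not Hammerspace's
# actual policy language; the objectives, site names, and logic are invented.
SITES = {
    "sydney-dc": {"type": "on-prem", "media": "nvme"},
    "aws-ap-southeast-2": {"type": "cloud", "media": "object"},
    "edge-perth": {"type": "edge", "media": "ssd"},
}

def place(dataset: str, objectives: dict) -> list:
    """Turn declarative objectives into a list of sites that should hold a copy of the dataset."""
    copies_wanted = objectives.get("durability_copies", 1)
    placements = []
    preferred = objectives.get("low_latency_site")
    if preferred in SITES:
        placements.append(preferred)    # keep a copy close to the workload that needs it
    # top up with durable cloud object storage until the requested copy count is met
    cloud_sites = [name for name, attrs in SITES.items() if attrs["type"] == "cloud"]
    placements.extend(cloud_sites[: max(0, copies_wanted - len(placements))])
    return placements

# "Two copies of this share, with the hot copy near the Perth render farm."
print(place("/projects/render-frames", {"durability_copies": 2, "low_latency_site": "edge-perth"}))
# -> ['edge-perth', 'aws-ap-southeast-2']
```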
Stitching It Together
A key part of the Hammerspace story is the decoupling of the control plane and the data plane. This allows it to focus on getting the data where it needs to be, from edge to cloud, and over whatever protocol it needs to be done over.
[image courtesy of Hammerspace]
Other Notes
Hammerspace officially supports 8 sites at the moment, and the team have tested the solution with 32 sites. It uses an eventually consistent model, and the Global Namespace is global per share, providing flexible deployment options. Metadata replication can be set up to be periodic – and customised at each site. You always rehydrate the data and serve it locally over NAS via SMB or NFS.
Licensing Notes
Hammerspace is priced on capacity (data under management). You can also purchase it via the AWS Marketplace. Note that you can access up to 10TB free on the public cloud vendors (AWS, GCP, Azure) from a Hammerspace perspective.
Thoughts and Further Reading
I was fortunate to have a followup session with Douglas Fallstrom and Brendan Wolfe to revisit the Hammerspace story, ask a few more questions, and check out some more demos. I asked Fallstrom about the kind of use cases they were seeing in the field for Hammerspace. One popular use case was for disaster recovery. Obviously, there’s a lot more to doing DR than just dumping data in multiple locations, but it seems that there’s appetite for this very thing. At a high level, Hammerspace is a great choice for getting data into multiple locations, regardless of the underlying platform. Sure, there’s a lot more that needs to be done once it’s in another location, or when something goes bang. But from the perspective of keeping things simple, this one is up there.
Fallstrom was also pretty clear with me that this isn’t Primary Data 2.0, regardless of the number of folks that work at Hammerspace with that heritage. I think it’s a reasonable call, given that Hammerspace is doubling down on the data story, and really pushing the concept of a universal file system, regardless of location or protocol.
So are we finally there in terms of data abstraction? It’s been a problem since computers became common in the enterprise. As technologists we frequently get caught up in the how, and not as much in the why of storage. It’s one thing to say that I can scale this to this many Petabytes, or move these blocks from this point to that one. It’s an interesting conversation for sure, and has proven to be a difficult problem to solve at times. But I think as a result of this, we’ve moved away from understanding the value of data, and data management, and focused too much on the storage and services supporting the data. Hammerspace has the noble goal of moving us beyond that conversation to talking about data and the value that it can bring to the enterprise. Is it there yet in terms of that goal? I’m not sure. It’s a tough thing to be able to move data all over the place in a reliable fashion and still have it do what it needs to do with regards to performance and availability requirements. Nevertheless I think that the solution does a heck of a lot to remove some of the existing roadblocks when it comes to simplified data management. Is serverless compute really a thing? No, but it makes you think more about the applications rather than what they run on. Storageless data is aiming to do the same thing. It’s a bold move, and time will tell whether it pays off or not. Regardless of the success or otherwise of the marketing team, I’m thinking that we’ll be seeing a lot more innovation coming out of Hammerspace in the near future. After all, all that data isn’t going anywhere any time soon. And someone needs to take care of it.
DMS (Data Mobility Suite) is being positioned as a suite of “data cloud services” by StorCentric, with a focus on:
Data migration;
Data consistency; and
Data operation.
It has the ability to operate across heterogeneous storage, clouds, and protocols. It’s a software solution based on subscription licensing and uses a policy-driven engine to manage data in the enterprise. It can run on bare-metal or as a VM appliance. Object storage platform / cloud support is fairly robust, with AWS, Backblaze B2, and Wasabi, amongst others, all being supported.
[image courtesy of StorCentric]
Use Cases
There are a number of scenarios where a solution like DMS makes sense. You might have a bunch of NFS storage on-premises, for example, and want to move it to a cloud storage target using S3. Another use case cited involved collaboration across multiple sites, with the example being a media company creating content in three places, working in different time zones, and wanting to move the data back to a centralised location.
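For a sense of what that first use case involves when you do it by hand, here’s a minimal boto3 sketch (the bucket name and mount point are placeholders). DMS’s pitch is essentially wrapping this kind of plumbing in policy, scheduling, and reporting rather than leaving it to one-off scripts.

```python
# Minimal sketch of copying an NFS-mounted directory tree to an S3 bucket.
# This is the manual plumbing that a policy-driven tool abstracts away.
import os
import boto3

SOURCE = "/mnt/nfs/projects"        # placeholder NFS mount point
BUCKET = "example-archive-bucket"   # placeholder bucket name

s3 = boto3.client("s3")

for root, _dirs, files in os.walk(SOURCE):
    for name in files:
        local_path = os.path.join(root, name)
        key = os.path.relpath(local_path, SOURCE)   # preserve the directory layout as object keys
        s3.upload_file(local_path, BUCKET, key)
        print(f"uploaded {local_path} -> s3://{BUCKET}/{key}")
```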
Big Ideas
Speaking to StorCentric about the announcement, it was clear that there’s a lot more on the DMS roadmap. Block storage is something the team wants to tackle, and they’re also looking to deliver analytics and ransomware alerting. There’s also a strong desire to provide governance as well. For example, if I want to copy some data somewhere and keep it for 10 years, I’ll configure DMS to take care of that for me.
Thoughts and Further Reading
Data management means a lot of things to a lot of people. Storage companies often focus on moving blocks and files from one spot to another, but don’t always do a solid job of capturing why data needs to be stored where it does. Or how, for that matter. There’s a lot more to data management than keeping ones and zeroes in a safe place. But it’s not just about being able to move data from one spot to another. It’s about understanding the value of your data, and understanding where it needs to be to deliver the most value to your organisation. Whilst it seems like DMS is focused primarily on moving data from one spot to another, there’s plenty of potential here to develop a broader story in terms of data governance and mobility. There’s built-in security, and the ability to apply levels of data governance to data in various locations. The greater appeal here is also the ability to automate the movement of data to different places based on policy. This policy-driven approach becomes really interesting when you start to look at complicated collaboration scenarios, or need to do something smart with replication or data migration.
Ultimately, there are a bunch of different ways to get data from one point to another, and a bunch of different reasons why you might need to do that. The value in something like DMS is the support for heterogeneous storage platforms, as well as the simple to use GUI support. Plenty of data migration tools come with extremely versatile command line interfaces and API support, but the trick is delivering an interface that is both intuitive and simple to navigate. It’s also nice to have a few different use cases met with one tool, rather than having to reach into the bag a few different times to solve very similar problems. StorCentric has a lot of plans for DMS moving forward, and if those plans come to fruition it’s going to form a very compelling part of the typical enterprise’s data management toolkit. You can read the press release here.
I recently had the opportunity to take a briefing with Jeff Braunstein and Susan Merriman from Spectra Logic (one of those rare occasions where getting your badge scanned at a conference proves valuable), and thought I’d share some of my notes here.
BlackPearl Family
Spectra Logic sell a variety of products, but this briefing was focused primarily on the BlackPearl series. Braunstein described it as a “gateway” device, with both NAS and object front end interfaces, and backend capability that can move data to multiple types of archives.
[image courtesy of Spectra Logic]
It’s a hardware box, but at its core the value is in the software product. The idea is that the BlackPearl acts as a disk cache, and you configure policies to send the data to one or more storage targets. The cool thing is that it supports multiple retention policies, and these can be permanent too. By that I mean you could spool one copy to tape for long term storage, and have another copy of your data sit on disk for 90 days (or however long you wanted).
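To illustrate the idea in made-up terms – this is not Spectra Logic’s actual policy schema or API – a dual-target retention policy like the one described above might conceptually look like this:

```python
# Conceptual illustration of a cache-plus-policy model: one permanent copy on
# tape, one cached copy on disk for 90 days. Field names are invented and do
# not reflect Spectra Logic's actual policy schema.
from datetime import datetime, timedelta

POLICY = {
    "targets": [
        {"type": "tape", "retention_days": None},   # None = keep this copy forever
        {"type": "disk", "retention_days": 90},     # near-line cache copy for fast restores
    ]
}

def copies_to_expire(ingest_time: datetime, now: datetime) -> list:
    """Return the target types whose retention window has lapsed."""
    expired = []
    for target in POLICY["targets"]:
        days = target["retention_days"]
        if days is not None and now > ingest_time + timedelta(days=days):
            expired.append(target["type"])
    return expired

print(copies_to_expire(datetime(2021, 1, 1), datetime(2021, 6, 1)))  # -> ['disk']
```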
Local vs Remote Storage
Local
There are a few different options for local storage, including BlackPearl Object Storage Disk, functioning as “near line archive”. This is configured with 107 enterprise-quality SATA drives (and they’re looking at introducing 16TB drives next month), providing roughly 1.8PB raw capacity. The drives function as power-down archive drives (using the drive spin-down settings), and deliver a level of resilience and reliability by using ZFS as the file system. There are also customer-configurable parity settings. Alternatively, you can pump data to Spectra Tape Libraries, for those of you who still want to use tape as a storage format.
Remote Storage Targets
In terms of remote storage targets, BlackPearl can leverage either public cloud, or other BlackPearl devices as replication targets. Replication to BlackPearl can be one way or bi-directional. Public Cloud support is available via Amazon S3 (and S3-like products such as Cloudian and Wasabi), and MS Azure. There is a concept of data immutability in the product, and you can turn on versioning to prevent your data management applications (or users) from accidentally clobbering your data.
Braunstein also pointed out that tape generations evolve, and BlackPearl has auto-migration capabilities. You can potentially have data migrate transparently from tape to tape (think LTO-6 to LTO-7), tape to disk, and tape to cloud.
[image courtesy of Spectra Logic]
In terms of how you leverage BlackPearl, some of that is dependent on the workflows you have in place to move your data. This could be manual, semi-automated, or automated (or potentially purpose built into existing applications). There’s a Spectra S3 RESTful API, and there’s heaps of information on developer.spectralogic.com on how to integrate BlackPearl into your existing applications and media workflows.
Thoughts
If you’re listening to the next-generation data protection vendors and big box storage folks, you’d wonder why companies such as Spectra Logic still focus on tape. It’s not because they have a rich heritage and deep experience in the tape market (although they do). There are plenty of use cases where tape still makes sense in terms of its ability to economically store large amounts of data in a relatively secure (off-line if required) fashion. Walk into any reasonably sized film production house and you’ll still see tape in play. From a density perspective (and durability), there’s a lot to like about tape. But BlackPearl is also pretty adept at getting data from workflows that were traditionally file-based and putting them on public cloud environments (the kind of environments that heavily leverage object storage interfaces). Sure, you can pump the data up to AWS yourself if you’re so inclined, but the real benefit of the BlackPearl approach, in my opinion, is that it’s policy-driven and fully automated. There’s less chance that you’ll fat finger the transfer of critical data to another location. This gives you the ability to focus on your core business, and not have to worry about data management.
I’ve barely scratched the surface of what BlackPearl can do, and I recommend checking out their product site for more information.