Intel – It’s About Getting The Right Kind Of Fast At The Edge

Disclaimer: I recently attended Storage Field Day 22.  Some expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Intel recently presented at Storage Field Day 22. You can see videos of the presentation here, and download my rough notes from here.

 

The Problem

A lot of countries have used lockdowns as a way to combat the community transmission of COVID-19. Apparently, this has led to an uptick in the consumption of streaming media services. If you’re somewhat familiar with streaming media services, you’ll understand that your favourite episode of Hogan’s Heroes isn’t being delivered from a giant storage device sitting in the bowels of your streaming media provider’s data centre. Instead, it’s invariably being delivered to your device from a content delivery network (CDN) device.

 

Content Delivery What?

CDNs are not a new concept. The idea is that you have a bunch of web servers geographically distributed delivering content to users who are also geographically distributed. Think of it as a way to cache things closer to your end users. There are many reasons why this can be a good idea. Your content will load faster for users if it resides on servers in roughly the same area as them. Your bandwidth costs are generally a bit cheaper, as you’re not transmitting as much data from your core all the way out to the end user. Instead, those end users are getting the content from something close to them. You can potentially also deliver more versions of content (in terms of resolution) easily. It can also be beneficial in terms of resiliency and availability – an outage on one part of your network, say in Palo Alto, doesn’t need to necessarily impact end users living in Sydney. Cloudflare does a fair bit with CDNs, and there’s a great overview of the technology here.
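
To make the "cache things closer to your end users" idea a little more concrete, here's a minimal sketch of an edge node that keeps recently requested content locally and only goes back to the origin on a miss. The EdgeCache class, the origin function, and the capacity of two objects are all made up for illustration; real CDN software is considerably more involved than a least-recently-used dictionary.

```python
from collections import OrderedDict


class EdgeCache:
    """Tiny LRU cache standing in for a CDN edge node (illustrative only)."""

    def __init__(self, capacity, fetch_from_origin):
        self.capacity = capacity                    # max number of objects held at the edge
        self.fetch_from_origin = fetch_from_origin  # callable: key -> bytes
        self.store = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.store:
            self.hits += 1
            self.store.move_to_end(key)             # mark as most recently used
            return self.store[key]
        self.misses += 1
        data = self.fetch_from_origin(key)          # the slow, expensive trip back to the core
        self.store[key] = data
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)          # evict the least recently used object
        return data


def origin(key):
    # Hypothetical origin server: it just fabricates some bytes for the key.
    return f"content-for-{key}".encode()


edge = EdgeCache(capacity=2, fetch_from_origin=origin)
for title in ["ep1", "ep2", "ep1", "ep3", "ep1"]:
    edge.get(title)
print(edge.hits, edge.misses)
```

Every hit is a request that never left the edge, which is where the latency and bandwidth savings come from.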

 

Isn’t All Content Delivery The Same?

Not really. As Intel covered in its Storage Field Day presentation, there are some differences between the performance requirements of video on demand and live-linear streaming CDN solutions.

Live-Linear Edge Cache

Live-linear video streaming is similar to the broadcast model used in television. It’s basically programming content streamed 24/7, rather than stuff that the user has to search for. Several minutes of content are typically cached to accommodate out-of-sync users and pause / rewind activities. You can read a good explanation of live-linear streaming here.
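
The "several minutes of content" buffer can be thought of as a rolling window of video segments that is constantly overwritten. Here's a rough sketch of that idea, assuming a five-minute window and six-second segments (both numbers are mine, not Intel's). The LiveBuffer class is hypothetical and isn't how any particular streaming stack does it.

```python
from collections import deque


class LiveBuffer:
    """Keeps only the most recent segments of a live channel."""

    def __init__(self, window_seconds, segment_seconds):
        self.segment_seconds = segment_seconds
        # A deque with maxlen silently drops the oldest segment on overflow.
        self.segments = deque(maxlen=window_seconds // segment_seconds)
        self.next_seq = 0

    def ingest(self, payload):
        """Append the newest segment, overwriting the oldest once the window is full."""
        self.segments.append((self.next_seq, payload))
        self.next_seq += 1

    def oldest_seq(self):
        return self.segments[0][0] if self.segments else None

    def read(self, seq):
        """Return the payload for a sequence number, or None if it is not buffered."""
        if not self.segments:
            return None
        idx = seq - self.segments[0][0]
        if 0 <= idx < len(self.segments):
            return self.segments[idx][1]
        return None


buf = LiveBuffer(window_seconds=300, segment_seconds=6)  # 5 minutes of 6 s segments
for i in range(120):
    buf.ingest(f"segment-{i}".encode())
print(buf.oldest_seq(), len(buf.segments))
```

A viewer who hits rewind can go back as far as the oldest buffered segment and no further. Multiply a window like this by every channel and bitrate on the server and you can see why the workload is both memory-hungry and overwrite-heavy.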

[image courtesy of Intel]

In the example above, Intel Optane PMem was used to address the needs of live-linear streaming.

  • Live-linear workloads consume a lot of memory capacity to maintain a short-lived video buffer.
  • Intel Optane PMem is less expensive than DRAM.
  • Intel Optane PMem has extremely high endurance, to handle frequent overwrite.
  • Flexible deployment options – Memory Mode or App-Direct, consuming zero drive slots.

With this solution they were able to achieve better channel and stream density per server than with DRAM-based solutions.

Video on Demand (VoD)

VoD providers typically offer a large library of content allowing users to view it at any time (e.g. Netflix and Disney+). VoD servers are a little different to live-linear streaming CDNs. They:

  • Typically require large capacity and drive fanout for performance / failure domains; and
  • Have a read-intensive workload, with typically large IOs.

[image courtesy of Intel]

 

Thoughts and Further Reading

I first encountered the magic of CDNs years ago when working in a data centre that hosted some Akamai infrastructure. Windows Server updates were super zippy, and it actually saved me from having to spend a lot of time standing in the cold aisle. Fast forward about 15 years, and CDNs are being used for all kinds of content delivery on the web. With whatever the heck this is in terms of the new normal, folks are putting more and more strain on those CDNs by streaming high-quality, high-bandwidth TV and movie titles into their homes (except in backwards places like Australia). As a result, content providers are constantly searching for ways to tweak the throughput of these CDNs to serve more and more customers, and deliver more bandwidth to those users.

I’ve barely skimmed the surface of how CDNs help providers deliver content more effectively to end users. What I did find interesting about this presentation was that it reinforced the idea that different workloads require different infrastructure solutions to deliver the right outcomes. It sounds simple when I say it like this, but I guess I’ve thought about streaming video CDNs as being roughly the same all over the place. Clearly they aren’t, and it’s not just a matter of jamming some SSDs in one RU servers and hoping that your content will be delivered faster to punters. It’s important to understand that Intel Optane PMem and Intel 3D NAND SSDs can give you different results depending on what you’re trying to do, with PMem arguably giving you better value for money (per GB) than DRAM. There are some great papers on this topic available on the Intel website. You can read more here and here.

Fujifilm Object Archive – Not Your Father’s Tape Library

Disclaimer: I recently attended Storage Field Day 22.  Some expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Fujifilm recently presented at Storage Field Day 22. You can see videos of the presentation here, and download my rough notes from here.

 

Fujifilm Overview

You’ve heard of Fujifilm before, right? They do a whole bunch of interesting stuff – batteries, cameras, copiers. Nami Matsumoto, Director of DMS Marketing and Operations, took us through some of Fujifilm’s portfolio. Fujifilm’s slogan is “Value From Innovation”, and it certainly seems to be looking to extract maximum value from its $1.4B annual spend on research and development. The Recording Media Products Division is focussed on helping “companies future proof their data”.

[image courtesy of Fujifilm]

 

The Problem

The challenge, as always (it seems), is that data growth continues apace while budgets remain flat. As a result, both security and scalability are frequently sacrificed when solutions are deployed in enterprises.

  • Rapid data creation: “More than 59 Zettabytes (ZB) of data will be created, captured, copied, and consumed in the world this year” (IDC 2020)
  • Shift from File to Object Storage
  • Archive Market – 60 – 80%
  • Flat IT budgets
  • Cybersecurity concerns
  • Scalability

 

Enter The Archive

FUJIFILM Object Archive

Chris Kehoe, Director of DMS Sales and Engineering, spent time explaining what exactly FUJIFILM Object Archive was. “Object Archive is an S3 based archival tier designed to reduce cost, increase scale and provide the highest level of security for long-term data retention”. In short, it:

  • Works like Amazon S3 Glacier in your DC
  • Simply integrates with other object storage
  • Scales on tape technology
  • Secure with air gap and full chain of custody
  • Predictable costs and TCO with no API or egress fees

Workloads?

It’s optimised to handle the long-term retention of data, which is useful if you’re doing any of these things:

  • Digital preservation
  • Scientific research
  • Multi-tenant managed services
  • Storage optimisation
  • Active archiving

What Does It Look Like?

There are a few components that go into the solution, including a:

  • Storage Server
  • Smart cache
  • Tape Server

[image courtesy of Fujifilm]

Tape?

That’s right, tape. The tape library supports LTO7, LTO8, and TS1160. The data is written using the “OTFormat” specification (you can read about that here). The idea is that it packs a bunch of objects together so they get written efficiently.
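
To illustrate why packing matters, here's a simple sketch of grouping objects into larger packs so that the tape drive sees a handful of big sequential writes rather than lots of little ones. To be clear, this is not OTFormat. The pack_objects function, the pack size, and the object list are all assumptions I've made to show the general idea.

```python
def pack_objects(objects, pack_size):
    """Group (key, size) pairs into packs of at most pack_size bytes.

    Returns a list of packs; each pack is a list of (key, offset, size).
    An object larger than pack_size ends up in a pack of its own.
    """
    packs, current, used = [], [], 0
    for key, size in objects:
        if current and used + size > pack_size:
            packs.append(current)          # close the current pack
            current, used = [], 0
        current.append((key, used, size))  # record where the object sits in the pack
        used += size
    if current:
        packs.append(current)
    return packs


objects = [("a.jpg", 40), ("b.jpg", 30), ("c.tif", 50), ("d.mov", 200), ("e.txt", 5)]
packs = pack_objects(objects, pack_size=100)
for number, pack in enumerate(packs):
    print(number, pack)
```

The per-pack offsets act as a tiny index, so an individual object can still be located inside a pack when it needs to come back.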

[image courtesy of Fujifilm]

Object Storage Too

It uses an “S3-compatible” API – the S3 server inside is built on Zenko (from Scality). From an object storage perspective, it works with Cloudian HyperStore, Caringo Swarm, NetApp StorageGRID, and Scality RING. It also has Starfish and Tiger Bridge support.

Other Notes

The product starts at 1PB of licensing. You can read the Solution Brief here. There’s an informative White Paper here. And there’s one of those nice Infographic things here.

Deployment Example

So what does this look like from a deployment perspective? One example was a typical primary storage deployment, with data archived to an on-premises object storage platform (in this case NetApp StorageGRID). When your archive got really “cold”, it would be moved to the Object Archive.
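
The deployment example boils down to an age-based placement policy. Here's a minimal sketch of what that decision might look like. The tier names and the 90-day and 365-day thresholds are hypothetical. They weren't part of the presentation, and in practice the policy would live in the object storage platform's lifecycle rules rather than in a script like this.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds; real policies are set per environment.
WARM_AFTER = timedelta(days=90)
COLD_AFTER = timedelta(days=365)


def choose_tier(last_accessed, now):
    """Pick a storage tier based on how long ago the data was last touched."""
    age = now - last_accessed
    if age >= COLD_AFTER:
        return "object-archive"   # really cold: the tape-backed tier
    if age >= WARM_AFTER:
        return "object-storage"   # archived: the on-premises object platform
    return "primary"              # still active


now = datetime(2021, 9, 1)
for stamp in (datetime(2021, 8, 1), datetime(2021, 3, 1), datetime(2019, 1, 1)):
    print(stamp.date(), choose_tier(stamp, now))
```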

[image courtesy of Fujifilm]

[image courtesy of Fujifilm]

 

Thoughts

Years ago, when a certain deduplication storage appliance company was acquired by a big storage slinger, stickers with “Tape is dead, get over it” were given out to customers. I think I still have one or two in my office somewhere. And I think the sentiment is spot on, at least in terms of the standard tape library deployments I used to see in small to mid to large enterprise. The problem that tape was solving for those organisations at the time has largely been dealt with by various disk-based storage solutions. There are nonetheless plenty of use cases where tape is still considered useful. I’m not going to go into every single reason, but the cost per GB of tape, at a particular scale, is hard to beat. And when you want to safely store files for a long period of time, even offline? Tape, again, is hard to beat. This podcast from Curtis got me thinking about the demise of tape, and I think this presentation from Fujifilm reinforced the thinking that it was far from on life support – at least in very specific circumstances.

Data keeps growing, and we need to keep it somewhere, apparently. We also need to think about keeping it in a way that means we’re not continuing to negatively impact the environment. It doesn’t necessarily make sense to keep really old data permanently online, despite the fact that it has some appeal in terms of instant access to everything ever. Tape is pretty good when it comes to relatively low energy consumption, particularly given the fact that we can’t yet afford to put all this data on All-Flash storage. And you can keep it available in systems that can be relied upon to get the data back, just not straight away. As I said previously, this doesn’t necessarily make sense for the home punter, or even for the small to midsize enterprise (although I’m tempted now to resurrect some of my older tape drives and see what I can store on them). It really works better at large scale (dare I say hyperscale?). Given that we seem determined to store a whole bunch of data with the hyperscalers, and for a ridiculously long time, it makes sense that solutions like this will continue to exist, and evolve. Sure, Fujifilm has sold something like 170 million tapes worldwide. But this isn’t simply a tape library solution. This is a wee bit smarter than that. I’m keen to see how this goes over the next few years.

Infrascale Puts The Customer First

Disclaimer: I recently attended Storage Field Day 22.  Some expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Infrascale recently presented at Storage Field Day 22. You can see videos of the presentation here, and download my rough notes from here.

 

Infrascale and Customer Experience

Founded in 2011, Infrascale is headquartered in Reston, Virginia, with around 170 employees and offices in Ukraine and India as well. As COO Brian Kuhn points out in the presentation, the company is “[a]ll about customers and their data”. Infrascale’s vision is “to be the most trusted data protection provider”.

Build Trust via Four Ps

Predictable

  • Reliable connections, response time, product
  • Work side by side like a dependable friend

Personal

  • People powered – partners, not numbers
  • Your success is our success

Proficient

  • Support and product experts with the right tools
  • Own the issue from beginning to end

Proactive

  • Onboarding, outreach to proactively help you
  • Identify issues before they impact your business

“Human beings dealing with human beings”

 

Product Portfolio

Infrascale Cloud Application Backup (ICAB)

SaaS Backup

  • Back up Microsoft 365, Google Workspace, Salesforce, Box, and Dropbox
  • Recover individual items (mail, file, or record) or entire mailboxes, folders, or databases
  • Close the retention gap between the SaaS provider and corporate, legal, and / or regulatory policy

Infrascale Cloud Backup (ICB)

Endpoint Backup

  • Back up desktop, laptop, or mobile devices directly to the cloud – wherever you work
  • Recover data in seconds – and with ease
  • Optimised for branch office and remote / home workers
  • Provides ransomware detection and remediation

Infrascale Backup and Disaster Recovery (IBDR)

Backup and DR / DRaaS for Servers

  • Back up mission-critical servers to both an on-premises and bootable cloud appliance
  • Boot ready in ~2 minutes (locally or in the cloud)
  • Restore system images or files / folders
  • Optimised for VMware and Hyper-V VMs and Windows bare metal

 

Digging Deeper with IBDR

What Is It?

Infrascale describes IBDR as a hybrid-cloud solution, with hardware and software on-premises, and service infrastructure in the cloud. In terms of DR as a service, Infrascale provides the ability to back up and replicate your data to a secondary location. In the event of a disaster, customers have the option to restore individual files and folders, or the entire infrastructure if required. Restore locations are flexible as well, with a choice of on-premises or in the cloud. Importantly, you also have the ability to fail back when everything’s sorted out.

One of the nice features of the service is unlimited DR and failover testing, and there are no fees attached to testing, recovery, or disaster failover.

Range

The IBDR solution also comes in a few different versions, as the table below shows.

[image courtesy of Infrascale]

The appliances are also available in a range of shapes and sizes.

[image courtesy of Infrascale]

Replication Options

In terms of replication, there are multiple destinations available, and you can fairly easily fire up workloads in the Infrascale cloud if need be.

[image courtesy of Infrascale]

 

Thoughts and Further Reading

Anyone who’s worked with data protection solutions will understand that it can be difficult to put together a combination of hardware and software that meets the needs of the business from a commercial, technical, and process perspective – particularly when you’re starting at a small scale and moving up from there. Putting together a managed service for data protection and disaster recovery is possibly harder still, given that you’re trying to accommodate a wide variety of use cases and workloads. And doing this using commercial off-the-shelf offerings can be a real pain. You’re invariably tied to the roadmap of the vendor in terms of features, and your timeframes aren’t normally the same as your vendor’s (unless you’re really big). So there’s a lot to be said for doing it yourself. If you can get the software stack right, understand what your target market wants, and get everything working in a cost-effective manner, you’re onto a winner.

I commend Infrascale for the level of thought the company has given to this solution, its willingness to work with partners, and the fact that it’s striving to be the best it can in the market segment it’s targeting. My favourite part of the presentation was hearing the phrase “we treat [data] like it’s our own”. Data protection, as I’ve no doubt rambled on about before, is hard, and your customers are trusting you with getting them out of a pickle when something goes wrong. I think it’s great that the folks at Infrascale have this at the centre of everything they’re doing. I get the impression that it’s “all care, all responsibility” when it comes to the approach taken with this offering. I think this counts for a lot when it comes to data protection and DR as a service offerings. I’ll be interested to see how support for additional workloads gets added to the platform, but what they’re doing now seems to be enough for many organisations. If you want to know more about the solution, the resource library has some handy datasheets, and you can get an idea of some elements of the recommended retail pricing from this document.

Komprise – It’s About Data, Not Storage

Disclaimer: I recently attended Storage Field Day 22.  Some expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Komprise recently presented at Storage Field Day 22. You can see their videos from Storage Field Day 22 here, and download a PDF copy of my rough notes from here.

 

The Age Of Data, Not Storage

It’s probably been the age of data for some time now, but I couldn’t think of a catchy heading. One comment from the Komprise folks during the presentation that really stood out to me was “Data outlives its storage infrastructure”. If I think back ten years to how I thought about managing data movement, it was certainly tied to the storage platform hosting the data, rather than what the data did. Whenever I had to move from one array to the next, or one protocol to another, I wasn’t thinking in terms of where the data would necessarily be best placed to serve the business. Generally speaking, I was approaching the problem in terms of getting good performance for blocks and files, but rarely was I thinking in terms of the value of the data to the business. Nowadays, it seems that there’s an improved focus on getting the “[d]ata in the right place at the right time – not just for efficiency – but to extract maximum value”. We’re no longer thinking about data in terms of old stuff living on slow storage, and fresh bits living on the fast stuff. As the amount of data being managed in enterprises continues to grow at an insane rate, it’s becoming more important than ever to understand just what usefulness the data offers the business.

[image courtesy of Komprise]

The variety of storage platforms available now is also a little more extensive than it was last century, and that presents some more interesting challenges in getting the data to where it needs to be. As I mentioned earlier, data growth is going berserk the world over. Add to this the problem of ubiquitous cloud access (and IT departments struggling to keep up with the governance necessary to wrangle these solutions into some sensible shape), and most enterprises looking to save money wherever possible, and data management can present real problems to most enterprise shops.

[image courtesy of Komprise]

 

Analytics To The Rescue!

Komprise has come up with an analytics-driven approach to data management that is built on some sound foundational principles. The solution needs to:

  1. Go beyond storage efficiency – it’s not just about dedupe and compression at a certain scale.
  2. Be multi-directional – you need to be able to get stuff back.
  3. Not disrupt users and workflows – do that and you may as well throw the solution in the bin.
  4. Create new uses for your data – it’s all about value, after all.
  5. Put your data first.

The final point is possibly the most critical one. If I think about the storage-centric approaches to data management that I’ve seen over the years, there’s definitely been a viewpoint that the underlying storage infrastructure would heavily influence how the data is used, rather than the data dictating how the storage platforms should be architected. Some of that is a question of visibility – if you don’t understand your data, it’s hard to come up with tailored solutions. Some of the problem is also the disconnect that seems to exist between “the business” and IT departments in a large number of enterprises. It’s not an easy problem to solve, by any stretch, but it does explain some of the novel approaches to data management that I’ve seen over the years.
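
On the visibility point, even a very basic scan can tell you something about what you're storing and how cold it is. The sketch below is my own illustration rather than anything Komprise showed. It rolls file records up by extension and flags the bytes that haven't been accessed for a year. The summarise and scan helpers, the one-year threshold, and the sample records are all assumptions. Keep in mind that access times are often unreliable (plenty of file systems are mounted with access time updates turned off), so treat the output as a hint rather than gospel.

```python
from collections import defaultdict
import os

DAY = 86400  # seconds


def summarise(records, now, cold_after_days=365):
    """records: iterable of (path, size_bytes, last_access_epoch_seconds)."""
    summary = defaultdict(lambda: {"files": 0, "bytes": 0, "cold_bytes": 0})
    for path, size, atime in records:
        ext = os.path.splitext(path)[1].lower() or "(none)"
        entry = summary[ext]
        entry["files"] += 1
        entry["bytes"] += size
        if (now - atime) / DAY >= cold_after_days:
            entry["cold_bytes"] += size    # not touched within the threshold
    return dict(summary)


def scan(root):
    """Yield (path, size, atime) for every file under root."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            yield full, st.st_size, st.st_atime


now = 1_700_000_000  # fixed timestamp so the sample is repeatable
sample = [
    ("/proj/a.mov", 500, now - 400 * DAY),
    ("/proj/b.MOV", 300, now - 10 * DAY),
    ("/proj/notes", 20, now - 800 * DAY),
]
report = summarise(sample, now)
print(report)
```

Against a real share you'd feed it summarise(scan("/some/path"), time.time()) after importing time, but that only scratches the surface of who is using the data and how.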

 

Thoughts and Further Reading

Data management is hard, and it keeps getting harder because we keep making more and more data. And we frequently don’t have the time, or take the time, to work out what value the data actually has. This problem isn’t going to go away, so it’s good to see Komprise moving the conversation past that and into the realm of how we can best focus on deriving value from the data itself. There was certainly some interesting discussion during the presentation about the term analytics, and what that really meant in terms of the Komprise solution. Ultimately, though, I’m a fan of anything that elevates the conversation beyond “I can move your terabytes from this bucket to that bucket”. I want something that starts to tell me more about what type of data I’m storing, who’s using it, and how they’re using it. That’s when it gets interesting from a data management perspective. I think there’s a way to go in terms of getting this solution right for everyone, but it strikes me that Komprise is on the right track, and I’m looking forward to seeing how the solution evolves alongside the storage technologies it’s using to get the most from everyone’s data. You can read more on the Komprise approach here.

Storage Field Day 22 – (Fairly) Full Disclosure

Disclaimer: I recently attended Storage Field Day 22.  Some expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Here are my notes on gifts, etc, that I received as a conference attendee at Storage Field Day 22. This is by no stretch an interesting post from a technical perspective, but it’s a way for me to track and publicly disclose what I get and how it looks when I write about various things. With all of this stuff happening (waves hands around), it’s not going to be as lengthy as normal, but I did receive a box of stuff in the mail, so I wanted to disclose it.

The Tech Field Day team sent over some stickers, a TFD tote bag, a TFD pin, and a TFD patch. Fujifilm kindly gave me a 16GB USB drive (with both USB 2 and Lightning connectors), a webcam cover, stylus, USB charging cable, a Bluetooth tracker, a phone cradle, and a beach towel. Komprise sent over some neat socks, three Komprise-branded Titleist golf balls, and a sticker.

It wasn’t fancy food and limos this time around, but it was nonetheless an enjoyable event. Hopefully we can get back to in-person events some time this decade. Thanks again to Stephen and the team for having me back. Thanks also to my employer for giving me time away from the office to attend.

Storage Field Day 22 – I’ll Be At Storage Field Day 22

Here’s some news that will get you excited. I’ll be virtually heading to the US this week for another Storage Field Day event. If you haven’t heard of the very excellent Tech Field Day events, you should check them out. It’s also worth visiting the Storage Field Day 22 website during the event (August 4-6) as there’ll be video streaming and updated links to additional content. You can also see the list of delegates and event-related articles that have been published.

I think it’s a great line-up of both delegates and presenting companies this time around.

 

I’d like to publicly thank in advance the nice folks from Tech Field Day who’ve seen fit to have me back, as well as my employer for letting me take time off to attend these events. Also big thanks to the companies presenting. It’s going to be a lot of fun. Last time was a little weird doing this virtually, rather than in person, but I think it still worked. As things open back up in the US you’ll start to see a blend of in-person and virtual attendance for these events. I know that Komprise will be filming its segment from the Doubletree. Hopefully we’ll get things squared away and I’ll be allowed to leave the country next year. I’m really looking forward to this, even if it means doing the night shift for a few days. Presentation times are below, and all times are US/Pacific.

Wednesday, Aug 4 8:00-9:30 Infrascale Presents at Storage Field Day 22
Wednesday, Aug 4 11:00-13:30 Intel Presents at Storage Field Day 22
Presenters: Allison Goodman, Elsa Asadian, Kelsey Prantis, Kristie Mann, Nash Kleppan, Sagi Grimberg
Thursday, Aug 5 8:00-10:00 CTERA Presents at Storage Field Day 22
Presenters: Aron Brand, Jim Crook, Liran Eshel
Thursday, Aug 5 11:00-13:00 Komprise Presents at Storage Field Day 22
Presenters: Krishna Subramanian, Mike Peercy, Mohit Dhawan
Friday, Aug 6 8:00-9:00 Fujifilm Presents at Storage Field Day 22
Friday, Aug 6 10:00-11:30 Pure Storage Presents at Storage Field Day 22
Presenters: Ralph Ronzio, Stan Yanitskiy

Random Short Take #52

Welcome to Random Short Take #52. A few players have worn 52 in the NBA including Victor Alexander (I thought he was getting dunked on by Shawn Kemp but it was Chris Gatling). My pick is Greg Oden though. If only his legs were the same length. Let’s get random.

  • Penguin Computing and Seagate have been doing some cool stuff with the Exos E 5U84 platform. You can read more about that here. I think it’s slightly different to the AP version that StorONE uses, but I’ve been wrong before.
  • I still love Fibre Channel (FC), as unhealthy as that seems. I never really felt the same way about FCoE though, and it does seem to be deader than tape.
  • VMware vSAN 7.0 U2 is out now, and Cormac dives into what’s new here. If you’re in the ANZ timezone, don’t forget that Cormac, Duncan and Frank will be presenting (virtually) at the Sydney VMUG *soon*.
  • This article on data mobility from my preferred Chris Evans was great. We talk a lot about data mobility in this industry, but I don’t know that we’ve all taken the time to understand what it really means.
  • I’m a big fan of Tech Field Day, and it’s nice to see presenting companies take on feedback from delegates and put out interesting articles. Kit’s a smart fellow, and this article on using VMware Cloud for application modernisation is well worth reading.
  • Preston wrote about some experiences he had recently with almost failing drives in his home environment, and raised some excellent points about resilience, failure, and caution.
  • Speaking of people I worked with briefly, I’ve enjoyed Siobhán’s series of articles on home automation. I would never have the patience to do this, but I’m awfully glad that someone did.
  • Datadobi appears to be enjoying some success, and has appointed Paul Repice to VP of Sales for the Americas. As the clock runs down on the quarter, I’m going two for one, and also letting you know that Zerto has done some work to enhance its channel program.

Storage Field Day 21 – Wrap-up and Link-o-rama

Disclaimer: I recently attended Storage Field Day 21.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

This is a quick post to say thanks once again to Stephen and Ben, and the presenters at Storage Field Day 21. I had a great time. For easy reference, here’s a list of the posts I did covering the events (they may not match the order of the presentations).

Storage Field Day 21 – I’ll Be At Storage Field Day 21

Storage Field Day 21 – (Fairly) Full Disclosure

Back To The Future With Tintri

Hammerspace, Storageless Data, And One Tough Problem

Intel Optane – Challenges and Triumphs

NetApp Keystone – How Do you Want It?

Pliops – Can We Take Fast And Make It Faster?

Nasuni Puts Your Data Where You Need It

MinIO – Cloud, Edge, Everywhere …

Also, here’s a number of links to posts by my fellow delegates (in no particular order). They’re all very smart people, and you should check out their stuff, particularly if you haven’t before. I’ll attempt to keep this updated as more posts are published. But if it gets stale, the Storage Field Day 21 landing page will have updated links.

 

Jason Collier (@BocaNuts)

 

Barry Coombs (@VirtualisedReal)

#SFD21 – Storage Field Day 21 – Tintri

#SFD21 – Storage Field Day 21 – NetApp

#SFD21 – Storage Field Day 21 – Nasuni

#SFD21 – Storage Field Day 21 – MinIO Session

#SFD21 – Storage Field Day 21 – Pliops

#SFD21 – Storage Field Day 21 – Hammerspace

#SFD21 – Storage Field Day 21 – Intel

 

Becky Elliott (@BeckyLElliott)

 

Matthew Leib (@MBLeib)

 

Ray Lucchesi (@RayLucchesi)

The rise of MinIO object storage

Data Science storage with NetApp’s Python Toolkit

Storageless data!?

115-GreyBeards talk database acceleration with Moshe Twitto, CTO&Co-founder, Pliops

 

Andrea Mauro (@Andrea_Mauro)

 

Max Mortillaro (@DarkkAvenger)

Nasuni – Cloud-Scale NAS Without Cloud Worries

Storage Field Day 21 – The TECHunplugged Take on Nasuni

Pliops: Re-Imagining Storage, Crushing Bottlenecks and a Bright Future in the Cloud

 

Keiran Shelden (@Keiran_Shelden)

 

Enrico Signoretti (@esignoretti)

Object Storage Is Heating Up

Storage Options for the Distributed Enterprise

 

Paul Stringfellow (@TechStringy)

Looking ahead with Storage Field Day 21 – Barry Coombs, Jason Collier, Max Mortillaro – Ep 149

Storageless data, really? – Doug Fallstrom – Ep156

 

Frederic Van Haren (@FredericVHaren)

 

On-Premise IT Podcast

Is Storageless Storage Just Someone Else’s Storage?

 

Now please enjoy this group photo.

[image courtesy of Gestalt IT]

MinIO – Cloud, Edge, Everywhere …

Disclaimer: I recently attended Storage Field Day 21.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

MinIO recently presented at Storage Field Day 21. You can see videos of the presentation here, and download my rough notes from here.

 

What Is It?

To quote the good folks at MinIO, it is a “high performance, Kubernetes-native object store”. It is designed to be used for large-scale data infrastructure, and was built from scratch to be cloud native.

[image courtesy of MinIO]

Design Principles

MinIO has been built with the following principles in mind:

  • Cloud Native – born in the cloud with “cloud native DNA”
  • Performance Focussed – believe it is the fastest object store in existence
  • Simplicity – designed for simplicity because “simplicity scales”

S3 Compatibility

MinIO is heavily focussed on S3 compatibility. It was first to market with V4 and one of the few vendors to support S3 Select. It has also been strictly consistent from inception.
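
I'm assuming "V4" here refers to AWS Signature Version 4, the request signing scheme that S3-compatible stores need to implement for standard S3 clients to authenticate against them. As a small taste of what compatibility involves, here's a sketch of the signing key derivation from that scheme using only the Python standard library. The secret key, date, and region below are placeholders, and this is only one step of the full signing process, not a working S3 client.

```python
import hashlib
import hmac


def _hmac(key, msg):
    """HMAC-SHA256 of a text message with a bytes key."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def signing_key(secret_key, date_stamp, region, service="s3"):
    """Derive the Signature Version 4 signing key via chained HMACs."""
    k_date = _hmac(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac(k_date, region)
    k_service = _hmac(k_region, service)
    return _hmac(k_service, "aws4_request")


# Placeholder credentials and scope, for illustration only.
key = signing_key("EXAMPLE-SECRET", "20210804", "us-east-1")
print(key.hex())
```

The derived key is scoped to a single day, region, and service. A store that gets any step of this chain wrong will simply reject requests from off-the-shelf S3 tools.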

Put Me In Your Favourite Box

The cloud native part of MinIO was no accident, and as a result more than 62% of MinIO instances run in containers (according to MinIO). 43% of those instances are also managed via Kubernetes. It’s not just about jamming this solution into your favourite container platform though. The lightweight nature of it means you can deploy it pretty much anywhere. As the MinIO folks pointed out during the presentation, MinIO is going everywhere that AWS S3 isn’t.

 

Thoughts And Further Reading

I love object storage. Maybe not in the way I love my family or listening to records or beer, but I do love it. It’s not just useful for storage for the great unwashed of the Internet, but also backup and recovery, disaster recovery, data archives, and analytics. And I’m a big fan of MinIO, primarily because of the S3 compatibility and simplicity of deployment. Like it or not, S3 is the way forward in terms of a standard for object storage for cloud native (and a large number of enterprise) workloads. I’ve written before about other vendors being focussed on this compatibility, and I think it’s great that MinIO has approached this challenge with just as much vigour. There are plenty of problems to be had deploying applications at the best of times, and being able to rely on the storage vendor sticking to the script in terms of S3 compatibility takes one more potential headache away.

The simplicity of deployment is a big part of what intrigues me about MinIO too. I’m old enough to remember some deployments of early generation on-premises object storage systems that involved a bunch of hardware and complicated software interactions for what ultimately wasn’t a great experience. Something like MinIO can be up and running on some pretty tiny footprints in no time at all. A colleague of mine shared some insights into that process here.

And that’s what makes this cool. It’s not that MinIO is trying to take a piece of the AWS pie. Rather, it’s positioning the solution as one that can operate everywhere that the hyperscalers aren’t. Putting object storage solutions in edge locations has historically been a real pain to do. That’s no longer the case. Part of this has to do with the fact that we’ve got access to really small computers and compact storage. But it also has a bit to do with lightweight code that can be up and running in a snap. Like some of the other on-premises object vendors, MinIO has done a great job of turning people on to the possibility of doing cool storage for cloud native workloads outside of the cloud. It seems a bit odd until you think about all of the use cases in enterprise that might work really well in cloud, but aren’t allowed to be hosted in the cloud. It’s my opinion that MinIO has done a great job of filling that gap (and exceeding expectations) when it comes to lightweight, easy to deploy object storage. I’m looking forward to seeing what’s next for them, particularly as the other vendors start to leverage the solution. For another perspective on MinIO’s growth, check out Ray’s article here.

Nasuni Puts Your Data Where You Need It

Disclaimer: I recently attended Storage Field Day 21.  My flights, accommodation and other expenses were paid for by Tech Field Day. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Nasuni recently presented at Storage Field Day 21. You can see videos of the presentation here, and download my rough notes from here.

 

Nasuni?

The functionality is in the product name. It’s NAS that offers a unified file system across cloud. The key feature is that it’s cloud-native, rather than built on any particular infrastructure solution.

[image courtesy of Nasuni]

The platform comprises five key components.

UniFS

  • Consolidates files and metadata in cloud storage – “Gold Copy”
  • Ensures durability by storing files as immutable, read-only objects
  • Stores an unlimited version history of every file
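
As a rough illustration of the “immutable objects plus version history” idea, here is a toy sketch in Python. It is not how UniFS is implemented; the class name VersionedStore and its methods are hypothetical, and a real system would persist objects to cloud storage rather than keep them in memory.

```python
import hashlib
from typing import Optional


class VersionedStore:
    """Toy append-only store: every write adds a new immutable version."""

    def __init__(self) -> None:
        self._objects = {}  # content hash -> bytes, never overwritten
        self._history = {}  # path -> list of content hashes, oldest first

    def write(self, path: str, data: bytes) -> int:
        """Store data as a new version of path and return its version number."""
        digest = hashlib.sha256(data).hexdigest()
        # Existing objects are never modified; identical content is reused.
        self._objects.setdefault(digest, bytes(data))
        self._history.setdefault(path, []).append(digest)
        return len(self._history[path])  # version numbers start at 1

    def read(self, path: str, version: Optional[int] = None) -> bytes:
        """Return the requested version of path (latest if version is None)."""
        versions = self._history.get(path)
        if not versions:
            raise KeyError(path)
        if version is None:
            version = len(versions)
        if not 1 <= version <= len(versions):
            raise IndexError(f"{path} has no version {version}")
        return self._objects[versions[version - 1]]


if __name__ == "__main__":
    store = VersionedStore()
    store.write("report.txt", b"first draft")
    store.write("report.txt", b"second draft")
    print(store.read("report.txt", version=1))
```

The point of the sketch is that an update never overwrites anything, so recovering an older version is just a lookup rather than a restore.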

Virtual Edge Appliances

  • Caches active files with 99% hit rate
  • 98% smaller footprint vs traditional file server / NAS
  • Scales across all sites, including VDI
  • Supports standard file sharing protocols
  • Built-in web server enables remote file access via web browser (HTTP)
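
To show what a cache hit rate actually measures, here is a minimal sketch of an edge cache with least-recently-used eviction. This is an assumption-laden toy, not Nasuni’s caching logic: the EdgeCache class, its fetch_from_cloud callback and the LRU policy are all chosen purely for illustration.

```python
from collections import OrderedDict
from typing import Callable


class EdgeCache:
    """Toy LRU cache that records hits and misses (hypothetical design)."""

    def __init__(self, capacity: int, fetch_from_cloud: Callable[[str], bytes]) -> None:
        self.capacity = capacity
        self.fetch_from_cloud = fetch_from_cloud  # called only on a miss
        self._files = OrderedDict()  # path -> data, least recently used first
        self.hits = 0
        self.misses = 0

    def get(self, path: str) -> bytes:
        if path in self._files:
            self.hits += 1
            self._files.move_to_end(path)  # mark as most recently used
            return self._files[path]
        self.misses += 1
        data = self.fetch_from_cloud(path)
        self._files[path] = data
        if len(self._files) > self.capacity:
            self._files.popitem(last=False)  # evict the least recently used file
        return data

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


if __name__ == "__main__":
    cache = EdgeCache(capacity=2, fetch_from_cloud=lambda p: p.encode())
    for name in ["a", "b", "a", "a", "b"]:
        cache.get(name)
    print(f"hit rate: {cache.hit_rate:.0%}")
```

The hit rate depends entirely on how well the working set fits in the cache, which is why a small appliance can serve most requests locally when users keep returning to the same active files.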

Management Console

  • Administers appliances, volumes, shares and file recovery
  • Automated through central GUI and REST API
  • Provides centralised monitoring, reporting, and alerting

Orchestration Center

  • Multi-site file sync keeps track of versions
  • Advanced version control with Nasuni Global File Lock
  • Multi-region cloud support to ensure performance

Analytics Connector

  • Translates file data into native object storage format
  • Leverages any public cloud service (AI, data analytics, search)
  • Multi-cloud support so you can run any cloud service against your data

 

Thoughts and Further Reading

I’m the first to admit I’ve had a bit of a blind spot for Nasuni for a little while now. Not because I think the company doesn’t do cool stuff – it really does. Rather, my former employer was an investor in the tech and was keen to see how we could use the platform at every opportunity, even when the opportunity wasn’t appropriate.

Distributed storage for file sharing has been a pain in the rear for enterprises ever since enterprises have been a thing. The real challenge has been doing something sensible about managing data across multiple locations in a cogent fashion. As local becomes global, this becomes even more of an issue, particularly when folks all across the world need to work on the same data. Email isn’t really great for this, and some of those sync and share solutions don’t cope well with the scale that is sometimes required. In the end, file serving is still a solution that can solve a problem for a lot of enterprise use cases.

The advent of public cloud has been great in terms of demonstrating that workloads can be distributed, and you don’t need to have a bunch of tin sitting in the office to get value from infrastructure. Nasuni recognised this over ten years ago, and it has put together a platform that seeks to solve that problem by taking advantage of the distributed nature of cloud, whilst acknowledging that virtualised resources can make for a useful local presence when it comes to having the right data in the right place. One of my favourite things about the solution is that you can also do stuff via the Analytics Connector to derive further value from your unstructured data. This is not a unique feature, but it’s certainly something that gives the impression that Nasuni isn’t just here to serve up your data.

The elegance of the Nasuni solution is in the fact that the complexity is well hidden from the end user. It’s a normal file access experience, but it’s hosted in the cloud. When you contrast that with what you get from the sync solutions of the world or the clumsy web-based document management systems so prevalent in the enterprise, this kind of simplicity is invaluable. It’s my opinion that there is very much a place for this kind of solution in the marketplace. The world is becoming increasingly global, but we still need solutions that can provide data where we need it. We also need those solutions to accommodate the performance and resilience needs of the enterprise.

If you’re after a great discussion on storage options for the distributed enterprise, check out Enrico’s article over at GigaOm.