Backblaze Has A (Pod) Birthday, Does Some Cool Stuff With B2

Backblaze has been on my mind a lot lately. And not just because of their recent expansion into Europe. The Storage Pod recently turned ten years old, and I was lucky enough to have the chance to chat with Yev Pusin and Andy Klein about that news and some of the stuff they’re doing with B2, Tiger Technology, and Veeam.


10 Years Is A Long Time

The Backblaze Storage Pod (currently version 6) recently turned 10 years old. That’s a long time for something to be around (and successful) in a market like cloud storage. I asked Yev and Andy about where they saw the Pod heading, and whether they thought there was room for Flash in the picture. Andy pointed out that, with around 900PB under management, Flash still didn’t look like the most economical medium for this kind of storage task. That said, they have seen the main HDD manufacturers starting to hit a wall in terms of the capacity per drive they can deliver. The challenge isn’t just performance, though; people also need more and more capacity to store their stuff, and it doesn’t look like the manufacturers can produce enough Flash to cope with that increase in requirements at this stage.

Version 7.0

We spoke briefly about what Pod 7.0 would look like, and it’s going to be a “little bit faster”, with the following enhancements planned:

  • Updating the motherboard
  • Upgrading the CPU (and considering a move to AMD)
  • Updating the power supply units, perhaps moving to a single unit
  • Upgrading from 10GBase-T to 10GbE SFP+ optical networking
  • Upgrading the SATA cards
  • Modifying the tool-less lid design

They’re looking to roll this out sometime in 2020.


Tiger Style?

So what’s all this about Veeam, Tiger Bridge, and Backblaze B2? Historically, if you’ve been using Veeam from the cheap seats, it’s been difficult to effectively leverage object storage as a repository for longer-term data storage. Backblaze and Tiger Technology have gotten together to develop an integration that allows you to copy your Veeam protection data to B2 storage in the Backblaze cloud. There’s a nice overview of the solution that you can read here, and some more comprehensive instructions here.
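
If you’re curious what talking to B2 looks like programmatically, here’s a minimal sketch using Backblaze’s b2sdk Python library to push a backup file into a bucket. To be clear, this isn’t the Tiger Bridge integration itself (that presents B2 to Veeam as a repository target); the bucket name, credentials, and file paths below are made up for illustration.

```python
from b2sdk.v2 import InMemoryAccountInfo, B2Api

# Authenticate against B2 (key ID and application key are placeholders).
info = InMemoryAccountInfo()
api = B2Api(info)
api.authorize_account("production", "YOUR_KEY_ID", "YOUR_APPLICATION_KEY")

# Hypothetical bucket and Veeam backup file, purely for illustration.
bucket = api.get_bucket_by_name("veeam-offsite-copies")
bucket.upload_local_file(
    local_file="/backups/job1/job1.vbk",
    file_name="veeam/job1/job1.vbk",
)
```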


Thoughts and Further Reading

I keep banging on about it, but ten years feels like a long time to be hanging around in tech. I haven’t managed to stay with one employer longer than 7 years (maybe I’m flighty?). Along with the durability of the solution, the fact that Backblaze made the design open source, and inspired a bunch of companies to do something similar, is a great story. It’s stuff like this that I find inspiring. It’s not always about selling black boxes to people. Sometimes it’s good to be a little transparent about what you’re doing, and to rely on a great product, competitive pricing, and strong support to keep customers happy. Backblaze have certainly done that on the consumer side of things, and the team assures me that they’re experiencing success with the B2 offering and their business-oriented data protection solution as well.

The Veeam integration is an interesting one. While B2 is an object storage play, it’s not S3-compliant, so they can’t easily leverage a lot of the built-in options delivered by the bigger data protection vendors. What you will see, though, is that they’re super responsive when it comes to making integrations available for things like NAS devices, and solutions like this one. If I get some time in the next month, I’ll look at setting this up in the lab and running through the process.

I’m not going to wax lyrical about how Backblaze is democratising data access for everyone, as they’re in business to make money. But they’re certainly delivering a range of products that is enabling a variety of customers to make good use of technology that has potentially been unavailable (in a simple-to-consume format) previously. And that’s a great thing. I glossed over the news when it was announced last year, but the “Rebel Alliance” formed between Backblaze, Packet and ServerCentral is pretty interesting, particularly if you’re looking for a more cost-effective solution for compute and object storage that isn’t reliant on hyperscalers. I’m looking forward to hearing about what Backblaze come up with in the future, and I recommend checking them out if you haven’t previously. You can read Ken’s take over at Gestalt IT here.

Pure Storage Expands Portfolio, Adds Capacity And Performance

Disclaimer: I recently attended Pure//Accelerate 2019.  My flights, accommodation, and conference pass were paid for by Pure Storage. There is no requirement for me to blog about any of the content presented and I am not compensated by Pure Storage for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Pure Storage announced two additions to its portfolio of products today: FlashArray//C and DirectMemory Cache. I had the opportunity to hear about these two products at the Storage Field Day Exclusive event at Pure//Accelerate 2019 and thought I’d share some thoughts here.


DirectMemory Cache

DirectMemory Cache is a high-speed caching system that reduces read latency for high-locality, performance-critical applications.

  • High speed: based on Intel Optane SCM drives
  • Caching system: repeated accesses to “hot data” are sped up automatically – no tiering = no configuration
  • Read latency: only reads are accelerated – write latency is unchanged
  • High-locality: only workloads that repeatedly access a dataset that fits in the cache will benefit
  • Performance-critical: high-throughput, latency-sensitive workloads

According to Pure, “DirectMemory Cache is the functionality within Purity that provides direct access to data and accelerates performance critical applications”. Note that this applies to read data only; write caching is still done via DRAM.
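
Pure hasn’t published the internals of DirectMemory Cache, but the behaviour described above (hot reads promoted automatically, writes untouched) is essentially a read-through LRU cache. Here’s a toy sketch of the concept, with the backend standing in for the array’s NAND flash:

```python
from collections import OrderedDict

class ReadCache:
    """Toy read-through LRU cache. Illustrative only -- not Pure's code."""

    def __init__(self, capacity, backend):
        self.capacity = capacity    # blocks the cache (think Optane SCM) holds
        self.backend = backend      # slower medium (think NAND flash)
        self.cache = OrderedDict()  # block_id -> data, in LRU order

    def read(self, block_id):
        if block_id in self.cache:              # "hot" data: served from cache
            self.cache.move_to_end(block_id)
            return self.cache[block_id]
        data = self.backend.read(block_id)      # cold read from flash...
        self.cache[block_id] = data             # ...then promoted automatically
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)      # evict least recently used
        return data

    def write(self, block_id, data):
        self.backend.write(block_id, data)      # writes still go via the
        self.cache.pop(block_id, None)          # array's DRAM; just invalidate
```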

How Can This Help?

Pure has used Pure1 Meta analysis to arrive at the following figures:

  • 80% of arrays can achieve 20% lower latency
  • 40% of arrays can achieve 30-50% lower latency (up to 2x boost)

So there’s some real potential to improve existing workloads via the use of this read cache.

DirectMemory Configurations

Pure Storage DirectMemory Modules plug directly into the chassis of the FlashArray//X70 and //X90, and are available in the following configurations:

  • 3TB (4x750GB) DirectMemory Modules
  • 6TB (8x750GB) DirectMemory Modules

Top of Rack Architecture

Pure are positioning the “top of rack” architecture as a way to compete with some of the architectures that have jammed a bunch of flash into DAS or into compute to gain increased performance. The idea is that you can:

  • Eliminate the need for data locality;
  • Bring storage and compute closer;
  • Provide storage services that are not possible with DAS;
  • Bring the efficiency of FlashArray to traditional DAS applications; and
  • Offload storage and networking load from application CPUs.


FlashArray//C

Typical challenges in Tier 2

Things can be tough in the tier 2 storage world. Pure outlined some of the challenges they were seeking to address by delivering a capacity-optimised product.

Management complexity

  • Complexity / management
  • Different platforms and APIs
  • Interoperability challenges

Inconsistent Performance

  • Variable app performance
  • Anchored by legacy disk
  • Undersized / underperforming

Not enterprise class

  • <99.9999% resiliency
  • Disruptive upgrades
  • Not evergreen

The C Stands For Capacity Optimised All-Flash Array

Flash performance at disk economics

  • QLC architecture enables tier 2 applications to benefit from the performance of all-flash: predictable 2-4ms latency, and 5.2PB (effective) in 9U, delivering 10x consolidation over racks and racks of disk.

Optimised end-to-end for QLC Flash

  • Deep integration from software to QLC NAND solves QLC wear concerns and delivers market-leading economics. It includes the same evergreen maintenance and wear replacement as every FlashArray.

“No Compromise” enterprise experience

  • Built for the same 99.9999%+ availability, Pure1 cloud management, API automation, and AI-driven predictive support of every FlashArray

Flash for every data workflow

  • Policy driven replication, snapshots, and migration between arrays and clouds – now use Flash for application tiering, DR, Test / Dev, Backup, and retention

Configuration Details

Configuration options include:

  • 366TB RAW – 1.3PB effective
  • 878TB RAW – 3.2PB effective
  • 1.39PB RAW – 5.2PB effective
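
A quick back-of-the-envelope check on those numbers shows the data reduction ratio Pure is assuming to get from raw to effective capacity:

```python
configs = [(366, 1300), (878, 3200), (1390, 5200)]  # (raw TB, effective TB)
for raw, effective in configs:
    print(f"{raw}TB raw -> {effective}TB effective ({effective / raw:.2f}:1)")
# 366TB raw -> 1300TB effective (3.55:1)
# 878TB raw -> 3200TB effective (3.64:1)
# 1390TB raw -> 5200TB effective (3.74:1)
```

So roughly 3.5 – 3.75:1 data reduction is baked into those effective figures.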

Use Cases

  • Policy based VM tiering between //X and //C
  • Multi-cloud data protection and DR – on-premises and multi-site
  • Multi-cloud test / dev – workload consolidation

*File support (NFS / SMB) coming in 2020 (across the entire FlashArray family, not just //C)


Thoughts

I’m a fan of companies that expand their portfolio based on customer requests. It’s a good way to make more money, and sometimes it’s simplest to give the people what they want. The market has been in Pure’s ear for some time about delivering some kind of capacity storage solution. I think it was simply a matter of time before the economics and the technology intersected at a point where it made sense for it to happen. If you’re an existing Pure customer, this is a good opportunity to deploy Pure across all of your tiers of storage, and you get the benefit of Pure1 keeping an eye on everything, and your “slow” arrays will still be relatively performance-focused thanks to NVMe throughout the box. Good times in IT aren’t just about speeds and feeds though, so I think this announcement is more important in terms of simplifying the story for existing Pure customers that may be using other vendors to deliver Tier 2 capabilities.

I’m also pretty excited about DirectMemory Cache, if only because it’s clear that Pure has done its homework (i.e. they’ve run the numbers on Pure1 Meta) and realised that they could improve the performance of existing arrays via a reasonably elegant solution. A lot of the cool kids do DAS, because that’s what they’ve been told will yield great performance. And that’s mostly true, but DAS can be a real pain in the rear when you want to move workloads around, or consolidate performance, or do useful things like data services (e.g. replication). Centralised storage arrays have been doing this stuff for years, and it’s about time they were also able to deliver the performance required in order for those companies not to have to compromise.

You can read the press release here, and the Tech Field Day videos can be viewed here.

VMware – VMworld 2019 – HBI2537PU – Cloud Provider CXO Panel with Cohesity, Cloudian and PhoenixNAP

Disclaimer: I recently attended VMworld 2019 – US.  My flights and accommodation were paid for by Digital Sense, and VMware provided me with a free pass to the conference and various bits of swag. There is no requirement for me to blog about any of the content presented and I am not compensated by VMware for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Here are my rough notes from “HBI2537PU – Cloud Provider CXO Panel with Cohesity, Cloudian and PhoenixNAP”, a panel-type presentation with the following people:

You can grab a PDF copy of my notes from here.

Introductions are done.

YR: William, given your breadth of experience, what are some of the emerging trends you’ve been seeing?

WB: Companies are struggling to keep up with the pace of information generation. Understanding the data, storing and retaining it, and protecting it. Multi-cloud adds a lot of complexity. We’ve heard studies that say 22% of data generated is actually usable. It’s just sitting there. Public cloud is still hot, but it’s settling down a little.

YR: William comes from a massive cloud provider. What are you guys using?

WB: We’ve standardised on vCloud Director (vCD) and vSphere. We came from a build-our-own approach, but it wasn’t providing the value that we hoped it would. Customers want a seamless way to manage multiple cloud resources.

YR: Are you guys familiar with VCPP?

AP: VCPP is the crown jewel of our partner program at VMware. 4000+ providers, 120+ countries, 10+ million VMs, 10000+ DCs. We help you save money, make money (things are services ready). We’re continuing to invest in vCD. Kubernetes, GPUs, etc. Lots of R&D.

YR: William, you mentioned you standardised on the VMware platform. Talk to us about your experience. Why vCD?

WB: It’s been a checkered past for vCD. We were one of the first five on the vCloud Express program in 2010 / 11. We didn’t like vCD in its 1.0 version. We thought we could do this better. And we did. We launched the first on-demand, pay by the hour public cloud for enterprise in 2011. But it didn’t really work out. In 2012 / 13 we started to see investments being made in vCD. 5.0 / 5.5 improved. Many people thought vCD was going to die. We now see a modern, flexible portal that can be customised. And we can take our devs and have them customise vCD, rather than build a customised portal. That’s where we can put our time and effort. We’ve always done things differently. Always been doing other things. How do we bring our work in visual cloud into that cloud provider portal with vCD?

YR: You have an extensive career at VMware.

RR: I was one of the first people to take vCD out to the world. But Enterprise wasn’t mature enough. When we focused on SPs, it was the right thing to do. DIY portals need a lot of investment. VMware allows a lot of extensibility now. For us, as Cohesity, we want to be able to plug in to that as well.

WB: At one point we had 45 devs working on a proprietary portal.

YR: We’ve been doing a lot on the extensibility side. What role are services playing in cloud providers?

AP: It takes away the complexities of deploying the stack.

JT: We’re specifically in object. A third of our customers are service providers. You guys know that object is built for scale, easy to manage, cost-effective. 20% of the data gets used. We hear that customers want to improve on that. People are moving away from tape. There’s a tremendous opportunity for services built on storage. Amazon has shown that. Data protection like Cohesity. Big data with Splunk. You can offer an industry standard, but differentiate based on other services.

YR: As we move towards a services-oriented world, William how do you see cloud management services evolving?

WB: It’s not good enough to provide some compute infrastructure any more. You have to do something more. We’re stubbornly focussed on different types of IaaS. We’re not doing generic x86 on top of vSphere. Backup, DR – those are in our wheelhouse. From a platform perspective, more and more customers want some kind of single pane of glass across their data. For some that’s on-premises, for some it’s public, for some it’s SaaS. You have to be able to provide value to the customer, or they will disappear. Object storage, backup with Cohesity. You need to keep pace with data movement. Any cloud, any data, anywhere.

AP: I’ve been at VMware long enough not to drink the Kool-Aid. Our whole cloud provider business is rooted in some humility. vCD can help other people doing better things to integrate. vCD has always been about reducing OPEX. Now we’re hitting the top line. Any cloud management platform today needs to be open and extensible, and not try to do everything.

YR: Is the crowd seeing pressure on pure IaaS?

Commentator: Coming from an SP to enterprise is different. Economics. Are you able to do showback with vCD 9 and vROps?

WB: We’re putting that in the hands of customers. Looking at CloudHealth. There’s a benefit to being in the business management space. You have the opportunity to give customers a better service. That, and more flexible business models. Moving into flexible billing models – gives more freedom to the enterprise customer. Unless you’re the largest of the large – enterprises have difficulty acting as a service provider. Citibank are an exception to this. Honeywell do it too. If you’re Discount Tire – it’s hard. You’re the guy providing the service, and you’re costing them money. There’s animosity – and there’s no choice.

Commentator: Other people have pushed to public cloud because chargeback is more effective than internal showback with private cloud.

WB: IT departments are poorly equipped to offer a breadth of services to their customers.

JT: People are moving workloads around. They want choice and flexibility. VMware with S3 compatible storage. A common underlying layer.

YR: Economics, chargeback. Is VMware (and VCPP) doing enough?

WB: The two guys to my right (RR and JT) have committed to building products that let me do that. I’ve been working on object storage use cases. I was talking to a customer. They’re using our IaaS and connected to Amazon S3. You’ve gone to Amazon. They didn’t know about it though. Experience and cost that can be the same or better. Egress in Amazon S3 is ridiculous. You don’t know what you don’t know. You can take that service and deliver it cost-effectively.

YR: RR talk to us about the evolution of data protection.

RR: Information has grown. Data is fragmented. Information placement is almost unmanageable. Services have now become available in a way that can be audited, secured, managed. At Cohesity, first thing we did was data protection, and I knew the rest was coming. Complexity’s a problem.

YR: JT. We know Cloudian’s a leader in object storage. Where do you see object going?

JT: It’s the underlying storage layer of the cloud. Brings down cost of your storage layer. It’s all about TCO. What’s going to help you build more revenue streams? Cloudian has been around since 2011. New solutions in backup, DR, etc, to help you build new revenue streams. S3 users on Amazon are looking for alternatives. Many of Cloudian’s customers are ex-Amazon customers. What are we doing? vCD integration. Search Cloudian and vCD on YouTube. Continuously working to drive down the cost of managing storage. 1.5PB in a 4RU box in collaboration with Seagate.

WB: Expanding service delivery, specifically around object storage, is important. You can do some really cool stuff – not just backup, it’s M&E, it’s analytics. Very few of our customers are using object just to store files and folders.

YR: We have a lot of providers in the room. JT can you talk more about these key use cases?

JT: It runs the gamut. You can break it down by verticals. M&E companies are offering editing suites via service providers. People are doing that for the legal profession. Accounting – storing financial records. Dental records and health care. The back end is the same thing – compute with S3 storage behind it. Cloudian provides multi-tenanted, scalable performance. Cost is driven down as you get larger.

YR: RR your key use cases?

RR: DRaaS is hot right now. When I was at VMware we did stuff with SRM. DR is hard. It’s so simple now. Now every SP can do it themselves. Use S3 to move data around from the same interface. And it’s very needed too. Everyone should have ubiquitous access to their data. We have that capability. We can now do vulnerability scans on the data we store on the platform. We can tell you if a VM is compromised. You can orchestrate the restoration of an environment – as a service.

YR: WB what are the other services you want us to deliver?

WB: We’re an odd duck. One of our major practices is information security. The idea that we have intelligent access to data residing in our infrastructure. Being able to detect vulnerabilities, taking action, sending an email to the customer, that’s the type of thing that cloud providers have. You might not be doing it yet – but you could.

YR: Security, threat protection. RR – do you see Cohesity as the driver to solve that problem?

RR: Cohesity will provide the platform. Data is insecure because it’s fragmented. Cohesity lets you run applications on the platform. Virus scanners, run books, all kinds of stuff you can offer as a service provider.

YR: William, where does the onus lie, how do you see it fitting together?

WB: The key for us is being open. E.g. Cohesity integration into vCD. If I don’t want to – I don’t have to. Freedom of choice to pick and choose where we want to deliver our own IP to the customer. I don’t have to use Cohesity for everything.

JT: That’s exactly what we’re into. Choice of hardware, management. That’s the point. Standards-based top end.

YR: Security

*They had 2 minutes to go but I ran out of time and had to get to another meeting. Informative session. 4 stars.

Formulus Black Announces Forsa 3.0

Formulus Black recently announced version 3.0 of its Forsa product. I had the opportunity to speak with Mark Iwanowski and Jing Xie about the announcement and wanted to share some thoughts here.


So What’s A Forsa Again?

It’s a software solution for running applications in memory without needing to re-tool your applications or hardware. You can present persistent storage (think Intel Optane) or non-persistent memory (think DRAM) as a block device to the host and run your applications on that. Here’s a look at the architecture.

[image courtesy of Formulus Black]

Is This Just a Linux Thing?

No, not entirely. There’s Ubuntu and CentOS support out of the box, and Red Hat support is imminent. If you don’t use those operating systems though, don’t stress. You can also run this using a KVM-based hypervisor. So anything supported by that can be supported by Forsa.

But What If My Memory Fails?

Formulus Black has a technology called “BLINK” which provides the ability to copy your data down to SSDs, or you can failover the data to another host.

Won’t I Need A Bunch Of RAM?

Formulus Black uses Bit Markers – a memory-efficient technology (like deduplication) – to make efficient use of the available memory. They call it “amplification” as opposed to deduplication, as it amplifies the available space.
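
Formulus Black hasn’t published how Bit Markers work under the hood, but the “amplification” idea is easy to illustrate with a generic content-addressed deduplication sketch: logical blocks that share content only consume physical memory once.

```python
import hashlib

class DedupStore:
    """Generic dedup sketch -- not Formulus Black's Bit Marker algorithm."""

    def __init__(self):
        self.blocks = {}  # fingerprint -> unique block contents
        self.refs = []    # logical block sequence -> fingerprints

    def write(self, data: bytes):
        fp = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(fp, data)  # store each unique block only once
        self.refs.append(fp)

    def amplification(self) -> float:
        # logical blocks stored vs physical blocks actually consuming memory
        return len(self.refs) / max(len(self.blocks), 1)

store = DedupStore()
for block in (b"A" * 4096, b"B" * 4096, b"A" * 4096, b"A" * 4096):
    store.write(block)
print(store.amplification())  # 2.0 -> four logical blocks in two blocks of RAM
```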

Is This Going To Cost Me?

A little, but not as much as you’d think (because nothing’s ever free). The software is licensed on a per-socket basis, so if you decide to add memory capacity you’re not up for additional licensing costs.


Thoughts and Further Reading

I don’t do as much work with folks requiring in-memory storage solutions as I’d like to, but I do appreciate the requirement for these kinds of solutions. The big appeal here is the lack of requirement to re-tool your applications to work in-memory. All you need is something that runs on Linux or KVM and you’re pretty much good to go. Sure, I’m over-simplifying things a little, but it looks like there’s a good story here in terms of the lack of integration required to get some serious performance improvements.

Formulus Black came out of stealth around 4 and a bit months ago and have already introduced a raft of improvements over version 2.0 of their offering. It’s great to see the speed with which they’ve been able to execute on new features in their offering. I’m curious to see what’s next, as there’s obviously been a great focus on performance and simplicity.

The cool kids are all talking about the benefits of NVMe-based, centralised storage solutions. And they’re right to do this, as most applications will do just fine with these kinds of storage platforms. But there are still going to be minuscule bottlenecks associated with these devices. If you absolutely need things to run screamingly fast, you’ll likely want to run them in-memory. And if that’s the case, Formulus Black’s Forsa solution might be just what you’re looking for. Plus, it’s a pretty cool name for a company, or possibly an aspiring wizard.

Burlywood Tech Announces TrueFlash Insight

Burlywood Tech came out of stealth a few years ago, and I wrote about their TrueFlash announcement here. I had another opportunity to speak to Mike Tomky recently about Burlywood’s TrueFlash Insight announcement and thought I’d share some thoughts here.


The Announcement

Burlywood’s “TrueFlash” product delivers what they describe as a “software-defined SSD”. Since they’ve been active in the market they’ve gained traction in what they call the Tier 2 service provider segments (not necessarily the “Big 7” hyperscalers).

They’ve announced TrueFlash Insight because, in a number of cases, customers don’t know what their workloads really look like. The idea behind TrueFlash Insight is that it can be run in a production environment for a period of time to collect metadata and drive telemetry. Engineers can also be sent on site if required to do the analysis. The data collected with TrueFlash Insight helps Burlywood with the process of designing and tuning the TrueFlash product for the desired workload.

How It Works

  • Insight is available only on Burlywood TrueFlash drives
  • Enabled upon execution of a SOW for Insight analysis services
  • Run your application as normal in a system with one or more Insight-enabled TrueFlash drives
  • Follow the instructions to download the telemetry files
  • Send telemetry data to Burlywood for analysis
  • Burlywood parses the telemetry, analyses data patterns, shares performance information, and identifies potential bottlenecks and trouble spots
  • This information can then be used to tune the TrueFlash SSDs for optimal performance
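
Burlywood hasn’t published the telemetry format, but to give a feel for the kind of analysis involved, here’s a hypothetical sketch that summarises a workload from a made-up CSV of I/O records (operation, LBA, length):

```python
import csv
from collections import Counter

def summarise_telemetry(path: str) -> None:
    """Toy workload summary over a hypothetical (op, lba, length) CSV --
    the real TrueFlash Insight telemetry format is not public."""
    ops = Counter()
    hot_regions = Counter()
    with open(path) as f:
        for row in csv.DictReader(f):
            ops[row["op"]] += 1
            hot_regions[int(row["lba"]) // 1_000_000] += 1  # 1M-LBA buckets

    total = sum(ops.values())
    print(f"read/write mix: {ops['read'] / total:.0%} reads")
    print("hottest LBA regions:", hot_regions.most_common(3))
```

The real analysis obviously goes a lot deeper (queue depths, transfer sizes, write amplification, and so on), but that’s the general shape of it: find the access patterns, then tune the drive for them.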


Thoughts and Further Reading

When I wrote about Burlywood previously I was fascinated by the scale that would be required for a company to consider deploying SSDs with workload-specific code sitting on them. And then I stopped and thought about my comrades in the enterprise space struggling to get the kind of visibility into their gear that’s required to make these kinds of decisions. But when your business relies so heavily on good performance, there’s a chance you have some idea of how to get information on the performance of your systems. The fact that Burlywood are making this offering available to customers indicates that even those customers that are on board with the idea of “Software-defined SSDs (SDSSDs?)” don’t always have the capabilities required to make an accurate assessment of their workloads.

But this solution isn’t just for existing Burlywood customers. The good news is it’s also available for customers considering using Burlywood’s product in their DC. It’s a reasonably simple process to get up and running, and my impression is that it will save a bit of angst down the track. Tomky made the comment that, with this kind of solution, you don’t need to “worry about masking problems at the drive level – [you can] work on your core value”. There’s a lot to be said for companies, even the ones with very complex technical requirements, not having to worry about the technical part of the business as much as the business part of the business. If Burlywood can make that process easier for current and future customers, I’m all for it.

StorONE Announces S1-as-a-Service

StorONE recently announced its StorONE-as-a-Service (S1aaS) offering. I had the opportunity to speak to Gal Naor about it and thought I’d share some thoughts here.


The Announcement

StorONE’s S1-as-a-Service (S1aaS) is a usage-based solution integrating StorONE’s S1 storage services with Dell Technologies and Mellanox hardware. The idea is they’ll ship you an appliance (available in a few different configurations) and you plug it in and away you go. There’s not a huge amount to say about it as it’s fairly straightforward. If you need more than the 18TB entry-level configuration, StorONE can get you up and running with 60TB thanks to overnight shipping.

Speedonomics

The as-a-Service bit is what most people are interested in, and S1aaS starts at $999 US per month for the 18TB all-flash array that delivers up to 150,000 IOPS. There are a couple of other configurations available as well, including 36TB at $1797 per month, and 54TB at $2497 per month. If, for some reason, you decide you don’t want the device any more, or you no longer have that particular requirement, you can cancel your service with 30 days’ notice.
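
It’s worth doing the per-TB arithmetic on those tiers, as there’s a modest economy of scale built in:

```python
tiers = {18: 999, 36: 1797, 54: 2497}  # capacity (TB) -> USD per month
for tb, price in tiers.items():
    print(f"{tb}TB: ${price / tb:.2f} per TB per month")
# 18TB: $55.50 per TB per month
# 36TB: $49.92 per TB per month
# 54TB: $46.24 per TB per month
```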


Thoughts and Further Reading

The idea of consuming storage from vendors on-premises via flexible finance plans isn’t a new one. But S1aaS isn’t a leasing plan. There’s no 60-month commitment and payback plan. If you want to use this for three months for a particular project and then cancel your service, you can. Just as you could with cable. From that perspective, it’s a reasonably interesting proposition. A number of the major storage vendors would struggle to put that much capacity and speed in such a small footprint on-premises for $999 per month. This is the major benefit of a software-based storage product that, by all accounts, can get a lot out of commodity server hardware.

I wrote about StorONE when they came out of stealth mode a few years ago, and noted the impressive numbers they were posting. Are numbers the most important thing when it comes to selecting storage products? No, not always. There’s plenty to be said for “good enough” solutions that are more affordable. But it strikes me that solutions that go really fast and don’t cost a small fortune to run are going to be awfully compelling. One of the biggest impediments to deploying on-premises storage solutions “as-a-Service” is that there’s usually a minimum spend required to make it worthwhile for the vendor or service provider. Most attempts previously have taken more than 2RU of rack space as a minimum footprint, and have required the customer to sign up for minimum terms of 36 – 60 months. That all changes (for the better) when you can run your storage on a server with NVMe-based drives and an efficient, software-based platform.

Sure, there are plenty of enterprises that are going to need more than 18TB of capacity. But are they going to need more than 54TB of capacity that goes at that speed? And can they build that themselves for the monthly cost that StorONE is asking for? Maybe. But maybe it’s just as easy for them to look at what their workloads are doing and decide whether they want everything on that one solution. And there’s nothing to stop them deploying multiple configurations either.

I was impressed with StorONE when they first launched. They seem to have a knack for getting good performance from commodity gear, and they’re willing to offer that solution to customers at a reasonable price. I’m looking forward to seeing how the market reacts to these kinds of competitive offerings. You can read more about S1aaS here.

Spectra Logic – BlackPearl Overview

I recently had the opportunity to take a briefing with Jeff Braunstein and Susan Merriman from Spectra Logic (one of those rare occasions where getting your badge scanned at a conference proves valuable), and thought I’d share some of my notes here.


BlackPearl Family

Spectra Logic sell a variety of products, but this briefing was focused primarily on the BlackPearl series. Braunstein described it as a “gateway” device, with both NAS and object front end interfaces, and backend capability that can move data to multiple types of archives.

[image courtesy of Spectra Logic]

It’s a hardware box, but at its core the value is in the software product. The idea is that the BlackPearl acts as a disk cache, and you configure policies to send the data to one or more storage targets. The cool thing is that it supports multiple retention policies, and these can be permanent too. By that I mean you could spool one copy to tape for long term storage, and have another copy of your data sit on disk for 90 days (or however long you wanted).
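
Spectra has its own interfaces for configuring this, but conceptually a BlackPearl-style policy is just a set of copies, each with its own target and retention. A rough sketch of the “permanent tape copy plus 90-day disk copy” example might look like this (entirely illustrative, not Spectra’s data model):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CopyRule:
    target: str                    # e.g. "tape", "disk", "s3"
    retention_days: Optional[int]  # None means keep the copy permanently

# The example from above: one permanent tape copy, one 90-day disk copy.
policy = [
    CopyRule(target="tape", retention_days=None),
    CopyRule(target="disk", retention_days=90),
]

def expired(rule: CopyRule, age_days: int) -> bool:
    """A copy only ages out if its rule actually has a retention period."""
    return rule.retention_days is not None and age_days > rule.retention_days
```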


Local vs Remote Storage

Local

There are a few different options for local storage, including BlackPearl Object Storage Disk, which functions as a “nearline archive”. This is configured with 107 enterprise-quality SATA drives (and they’re looking at introducing 16TB drives next month), providing roughly 1.8PB of raw capacity. The drives function as power-down archive drives (using the drive spin-down settings), and the solution delivers a level of resilience and reliability by using ZFS as the file system. There are also customer-configurable parity settings. Alternatively, you can pump data to Spectra Tape Libraries, for those of you who still want to use tape as a storage format.


Remote Storage Targets

In terms of remote storage targets, BlackPearl can leverage either public cloud or other BlackPearl devices as replication targets. Replication to BlackPearl can be one-way or bi-directional. Public cloud support is available via Amazon S3 (and S3-like products such as Cloudian and Wasabi), and MS Azure. There is a concept of data immutability in the product, and you can turn on versioning to prevent your data management applications (or users) from accidentally clobbering your data.

Braunstein also pointed out that tape generations evolve, and BlackPearl has auto-migration capabilities. You can potentially have data migrate transparently from tape to tape (think LTO-6 to LTO-7), tape to disk, and tape to cloud.


[image courtesy of Spectra Logic]

In terms of how you leverage BlackPearl, some of that is dependent on the workflows you have in place to move your data. This could be manual, semi-automated, or automated (or potentially purpose built into existing applications). There’s a Spectra S3 RESTful API, and there’s heaps of information on developer.spectralogic.com on how to integrate BlackPearl into your existing applications and media workflows.
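
As a taste of what that looks like, here’s a minimal sketch along the lines of the examples for Spectra’s ds3 Python SDK. Hedge: the helper names are from memory of the SDK’s published examples and may differ between versions, and the bucket name is made up.

```python
from ds3 import ds3

# Client configuration (endpoint, access key, secret) comes from
# environment variables, per the SDK's examples.
client = ds3.createClientFromEnv()

# Create a hypothetical bucket, then ask the BlackPearl what it knows about.
client.put_bucket(ds3.PutBucketRequest("media-archive"))
response = client.get_service(ds3.GetServiceRequest())
print(response.result)  # parsed service listing, including buckets
```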


Thoughts

If you’re listening to the next-generation data protection vendors and big box storage folks, you’d wonder why companies such as Spectra Logic still focus on tape. It’s not because they have a rich heritage and deep experience in the tape market (although they do). There are plenty of use cases where tape still makes sense in terms of its ability to economically store large amounts of data in a relatively secure (off-line if required) fashion. Walk into any reasonably sized film production house and you’ll still see tape in play. From a density perspective (and durability), there’s a lot to like about tape. But BlackPearl is also pretty adept at getting data from workflows that were traditionally file-based and putting them on public cloud environments (the kind of environments that heavily leverage object storage interfaces). Sure, you can pump the data up to AWS yourself if you’re so inclined, but the real benefit of the BlackPearl approach, in my opinion, is that it’s policy-driven and fully automated. There’s less chance that you’ll fat finger the transfer of critical data to another location. This gives you the ability to focus on your core business, and not have to worry about data management.

I’ve barely scratched the surface of what BlackPearl can do, and I recommend checking out their product site for more information.

Random Short Take #15

Here are a few links to some random news items and other content that I recently found interesting. You might find them interesting too. Episode 15 – it could become a regular thing. Maybe every other week? Fortnightly even.

Random Short Take #14

Here are a few links to some random news items and other content that I found interesting. You might find them interesting too. Episode 14 – giddy-up!

Random Short Take #13

Here are a few links to some random news items and other content that I found interesting. You might find them interesting too. Let’s dive in to lucky number 13.