Getting Started With The Pure Storage CLI

I used to write a lot about how to manage CLARiiON and VNX storage environments with EMC’s naviseccli tool. I’ve been doing some stuff with Pure Storage FlashArrays in our lab and thought it might be worth covering off some of the basics of their CLI. This will obviously be no replacement for the official administration guide, but I thought it might come in useful as a starting point.

 

Basics

Unlike EMC’s CLI, there’s no executable to install – it’s all on the controllers. If you’re using Windows, PuTTY is still a good choice as an ssh client. Otherwise the macOS ssh client does a reasonable job too. When you first set up your FlashArray, a virtual IP (VIP) was configured. It’s easiest to connect to the VIP, and Purity then directs your session to whichever controller is currently the primary. Note that you can also connect via a controller’s physical IP address if that’s how you want to do things.

The first step is to log in to the array as pureuser, with the password that you’ve definitely changed from the default one.

login as: pureuser
pureuser@10.xxx.xxx.30's password:
Last login: Fri Aug 10 09:36:05 2018 from 10.xxx.xxx.xxx

Mon Aug 13 10:01:52 2018
Welcome pureuser. This is Purity Version 4.10.4 on FlashArray purearray
http://www.purestorage.com/

“purehelp” is the command to run to list available commands.

pureuser@purearray> purehelp
Available commands:
-------------------
pureadmin
purealert
pureapp
purearray
purecert
pureconfig
puredns
puredrive
pureds
purehelp
purehgroup
purehost
purehw
purelog
pureman
puremessage
purenetwork
purepgroup
pureplugin
pureport
puresmis
puresnmp
puresubnet
puresw
purevol
exit
logout

If you want to get some additional help with a command, you can run “command -h” (or --help).

pureuser@purearray> purevol -h
usage: purevol [-h]
               {add,connect,copy,create,destroy,disconnect,eradicate,list,listobj,monitor,recover,remove,rename,setattr,snap,truncate}
               ...

positional arguments:
  {add,connect,copy,create,destroy,disconnect,eradicate,list,listobj,monitor,recover,remove,rename,setattr,snap,truncate}
    add                 add volumes to protection groups
    connect             connect one or more volumes to a host
    copy                copy a volume or snapshot to one or more volumes
    create              create one or more volumes
    destroy             destroy one or more volumes or snapshots
    disconnect          disconnect one or more volumes from a host
    eradicate           eradicate one or more volumes or snapshots
    list                display information about volumes or snapshots
    listobj             list objects associated with one or more volumes
    monitor             display I/O performance information
    recover             recover one or more destroyed volumes or snapshots
    remove              remove volumes from protection groups
    rename              rename a volume or snapshot
    setattr             set volume attributes (increase size)
    snap                take snapshots of one or more volumes
    truncate            truncate one or more volumes (reduce size)

optional arguments:
  -h, --help            show this help message and exit

There’s also a facility to access the man page for commands. Just run “pureman command” to access it.
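
For example, to read the manual page for the volume commands:

pureman purevol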

Want to see how much capacity there is on the array? Run “purearray list --space”.

pureuser@purearray> purearray list --space
Name        Capacity  Parity  Thin Provisioning  Data Reduction  Total Reduction  Volumes  Snapshots  Shared Space  System  Total
purearray  12.45T    100%    86%                2.4 to 1        17.3 to 1        350.66M  3.42G      3.01T         0.00    3.01T

Need to check the software version or the general availability of the controllers? Run “purearray list --controller”.

pureuser@purearray> purearray list --controller
Name  Mode       Model   Version  Status
CT0   secondary  FA-450  4.10.4   ready
CT1   primary    FA-450  4.10.4   ready
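
Want a quick look at I/O performance? The monitor subcommand (listed in the purevol help output above) displays I/O performance information for your volumes.

purevol monitor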

 

Connecting A Host

To connect a host to an array (assuming you’ve already zoned it to the array), you’d use the following commands.

purehost create hostname
purehost create --wwnlist WWNs hostname
purehost list
purevol connect --host [host] [volume]
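
As a worked example (the host name, volume name and WWNs below are made up for illustration), registering an ESXi host called esx01 and presenting a volume called vol01 to it might look like this.

purehost create --wwnlist 21:00:00:24:ff:00:00:01,21:00:00:24:ff:00:00:02 esx01
purehost list
purevol connect --host esx01 vol01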

 

Host Groups

You might need to create a Host Group if you’re running ESXi and want to have multiple hosts accessing the same volumes. Here’re the commands you’ll need. Firstly, create the Host Group.

purehgroup create [hostgroup]

Add the hosts to the Host Group (these hosts should already exist on the array).

purehgroup setattr --hostlist host1,host2,host3 [hostgroup]

You can then assign volumes to the Host Group

purehgroup connect --vol [volume] [hostgroup]
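
Putting those together for a hypothetical three-node ESXi cluster (the names are made up for illustration):

purehgroup create esx-cluster01
purehgroup setattr --hostlist esx01,esx02,esx03 esx-cluster01
purehgroup connect --vol vmfs-datastore01 esx-cluster01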

 

Other Volume Operations

Some other neat (and sometimes destructive) things you can do with volumes are listed below.

To resize a volume, use the following commands.

purevol setattr --size 500G [volume]
purevol truncate --size 20GB [volume]

Note that when you shrink a volume, Purity keeps a snapshot for 24 hours so you can roll back if required. This is good if you’ve shrunk a volume to be smaller than the data on it and have consequently munted the filesystem.
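
If you do need to roll back, something along these lines should do the trick. The snapshot name below is illustrative, and the exact copy syntax (including whether you need --overwrite to copy over an existing volume) is worth confirming via “pureman purevol” on your Purity release.

purevol list --snap [volume]
purevol copy --overwrite [volume].[snapshot] [volume]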

When you destroy a volume it immediately becomes unavailable to the host, but remains on the array for 24 hours. Note that you’ll need to disconnect the volume from any hosts connected to it first.

purevol disconnect [volume] --host [hostname]
purevol destroy [volume]
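
If you destroy the wrong volume, you can bring it back within that 24-hour window using the recover subcommand listed in the purevol help output above.

purevol recover [volume]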

If you’re running short of capacity, or are just curious about when a deleted volume will disappear, use the following command.

purevol list --pending

If you need the capacity back immediately, the deleted volume can be eradicated with the following command.

purevol eradicate [volume]

 

Further Reading

The Pure CLI is obviously not a new thing, and plenty of bright folks have already done a few articles about how you can use it as part of a provisioning workflow. This one from Chadd Kenney is a little old now but still demonstrates how you can bring it all together to do something pretty useful. You can obviously extend that to do some pretty interesting stuff, and there’s solid parity between the GUI and CLI in the Purity environment.

It seems like a small thing, but the fact that there’s no need to install an executable is a big thing in my book. Array vendors (and infrastructure vendors in general) insisting on installing some shell extension or command environment is a pain in the arse, and should be seen as an act of hostility akin to requiring Java to complete simple administration tasks. The sooner we get everyone working with either HTML5 or simple ssh access the better. In any case, I hope this was a useful introduction to the Purity CLI. Check out the Administration Guide for more information.

Nexsan Announces Assureon Cloud Transfer

Announcement

Nexsan announced Cloud Transfer for their Assureon product a little while ago. I recently had the chance to catch up with Gary Watson (Founder / CTO at Nexsan) and thought it would be worth covering the announcement here.

 

Assureon Refresher

Firstly, though, it might be helpful to look at what Assureon actually is. In short, it’s an on-premises storage archive that offers:

  • Long term archive storage for fixed content files;
  • Dependable file availability, with files being audited every 90 days;
  • Unparalleled file integrity; and
  • A “policy” system for protecting and stubbing files.

Notably, there is always a primary archive and a DR archive included in the price. No half-arsing it here – which is something that really appeals to me. Assureon also doesn’t have a “delete” key as such – files are only removed based on defined Retention Rules. This is great, assuming you set up your policies sensibly in the first place.

 

Assureon Cloud Transfer

Cloud Transfer provides the ability to move data between on-premises and cloud instances. The idea is that it will:

  • Provide reliable and efficient cloud mobility of archived data between cloud server instances and between cloud vendors; and
  • Optimise cloud storage and backup costs by offloading cold data to on-premises archive.

It’s being positioned as useful for clients who have a large unstructured data footprint on public cloud infrastructure and are looking to reduce their costs for storing data up there. There’s currently support for Amazon AWS and Microsoft Azure, with Google support coming in the near future.

[image courtesy of Nexsan]

There’s stub support for those applications that support it. There’s also an optional NFS / SMB interface that can be configured in the cloud as an Assureon archiving target that caches hot files and stubs cold files. This is useful for those non-Windows applications that have a lot of unstructured data that could be moved to an archive.

 

Thoughts and Further Reading

The concept of dedicated archiving hardware and software bundles, particularly ones that live on-premises, might seem a little odd to some folks who spend a lot of time failing fast in the cloud. There are plenty of enterprises, however, that would benefit from the level of rigour that Nexsan have wrapped around the Assureon product. It’s my strong opinion that too many people still don’t understand the difference between backup and recovery and archive data. The idea that you need to take archive data and make it immutable (and available) for a long time has great appeal, particularly for organisations getting slammed with a whole lot of compliance legislation. Vendors have been talking about reducing primary storage use for years, but there seems to have been some pushback from companies not wanting to invest in these solutions. It’s possible that this was also a result of some kludgy implementations that struggled to keep up with the demands of the users. I can’t speak for the performance of the Assureon product, but I like the fact that it’s sold as a pair, and with a lot of the decision-making around protection taken away from the end user. As someone who worked in an organisation that liked to cut corners on this type of thing, it’s nice to see that.

But why would you want to store stuff on-premises? Isn’t everyone moving everything to the cloud? No, they’re not. I don’t imagine that this type of product is being pitched at people running entirely in public cloud. It’s more likely that, if you’re looking at this type of solution, you’re probably running a hybrid setup, and still have a footprint in a colocation facility somewhere. The benefit of this is that you can retain control over where your archived data is placed. Some would say that’s a bit of a pain, and an unnecessary expense, but people familiar with compliance will understand that business is all about a whole lot of wasted expense in order to make people feel good. But I digress. Like most on-premises solutions, the Assureon offering compares well with a public cloud solution on a $/GB basis, assuming you’ve got a lot of sunk costs in place already with your data centre presence.

The immutability story is also a pretty good one when you start to think about organisations that have been hit by ransomware in the last few years. That stuff might roll through your organisation like a hot knife through butter, but it won’t be able to do anything with your archive data – that stuff isn’t going anywhere. Combine that with one of those fancy next generation data protection solutions and you’re in reasonable shape.

In any case, I like what the Assureon product offers, and am looking forward to seeing Nexsan move beyond the Windows-only platform support that it currently offers. You can read the Nexsan Assureon Cloud Transfer press release here. David Marshall covered the announcement over at VMblog and ComputerWeekly.com did an article as well.

NetApp Announces NetApp ONTAP AI

As a member of NetApp United, I had the opportunity to sit in on a briefing from Mike McNamara about NetApp‘s recently announced AI offering, the snappily named “NetApp ONTAP AI”. I thought I’d provide a brief overview here and share some thoughts.

 

The Announcement

So what is NetApp ONTAP AI? It’s a “proven” architecture delivered via NetApp’s channel partners. It’s comprised of compute, storage and networking. Storage is delivered over NFS. The idea is that you can start small and scale out as required.

Hardware

Software

  • NVIDIA GPU Cloud Deep Learning Stack
  • NetApp ONTAP 9
  • Trident, dynamic storage provisioner

Support

  • Single point of contact support
  • Proven support model

 

[image courtesy of NetApp]

 

Thoughts and Further Reading

I’ve written about NetApp’s Edge to Core to Cloud story before, and this offering certainly builds on the work they’ve done with big data and machine learning solutions. Artificial Intelligence (AI) and Machine Learning (ML) solutions are like big data from five years ago, or public cloud. You can’t go to any industry event, or take a briefing from an infrastructure vendor, without hearing all about how they’re delivering solutions focused on AI. What you do with the gear once you’ve bought one of these spectacularly ugly boxes is up to you, obviously, and I don’t want to get in to whether some of these solutions are really “AI” or not (hint: they’re usually not). While the vendors are gushing breathlessly about how AI will conquer the world, if you tone down the hyperbole a bit, there’re still some fascinating problems being solved with these kinds of solutions.

I don’t think that every business, right now, will benefit from an AI strategy. As much as the vendors would like to have you buy one of everything, these kinds of solutions are very good at doing particular tasks, most of which are probably not in your core remit. That’s not to say that you won’t benefit in the very near future from some of the research and development being done in this area. And it’s for this reason that I think architectures like this one, and those from NetApp’s competitors, are contributing something significant to the ongoing advancement of these fields.

I also like that this is delivered via channel partners. It indicates, at least at first glance, that AI-focused solutions aren’t simply something you can slap a SKU on and sell 100s of. Partners generally have a better breadth of experience across the various hardware, software and services elements and their respective constraints, and will often be in a better position to spend time understanding the problem at hand rather than treating everything as the same problem with one solution. There’s also less chance that the partner’s sales people will have performance accelerators tied to selling one particular line of products. This can be useful when trying to solve problems that are spread across multiple disciplines and business units.

The folks at NVIDIA have made a lot of noise in the AI / ML marketplace lately, and with good reason. They know how to put together blazingly fast systems. I’ll be interested to see how this architecture goes in the marketplace, and whether customers are primarily from the NetApp side of the fence, from the NVIDIA side, or perhaps both. You can grab a copy of the solution brief here, and there’s an AI white paper you can download from here. The real meat and potatoes though, is the reference architecture document itself, which you can find here.

Dell EMC Announces IDPA DP4400

Dell EMC announced the Integrated Data Protection Appliance (IDPA) at Dell EMC World in May 2017. They recently announced a new addition to the lineup, the IDPA DP4400. I had the opportunity to speak with Steve Reichwein about it and thought I’d share some of my thoughts here.

 

The Announcement

Overview

One of the key differences between this offering and previous IDPA products is the form factor. The DP4400 is a 2RU appliance (based on a PowerEdge server) with the following features:

  • Capacity starts at 24TB, growing in increments of 12TB, up to 96TB useable. The capacity increase is done via licensing, so there’s no additional hardware required (who doesn’t love the golden screwdriver?)
  • Search and reporting is built in to the appliance
  • There are Cloud Tier (ECS, AWS, Azure, Virtustream, etc) and Cloud DR options (S3 at this stage, but that will change in the future)
  • There’s the IDPA System Manager (Data Protection Central), along with Data Domain DD/VE (3.1) and Avamar (7.5.1)

[image courtesy of Dell EMC]

It’s hosted on vSphere 6.5, and the whole stack is referred to as IDPA 2.2. Note that you can’t upgrade the components individually.

 

Hardware Details

Storage Configuration

  • 18x 12TB 3.5″ SAS Drives (12 front, 2 rear, 4 mid-plane)
    • 12TB RAID1 (1+1) – VM Storage
    • 72TB RAID6 (6+2) – DDVE File System Spindle-group 1
    • 72TB RAID6 (6+2) – DDVE File System Spindle-group 2
  • 240GB BOSS Card
    • 240GB RAID1 (1+1 M.2) – ESXi 6.5 Boot Drive
  • 1.6TB NVMe Card
    • 960GB SSD – DDVE cache-tier

System Performance

  • 2x Intel Silver 4114 10-core 2.2GHz
  • Up to 40 vCPU system capacity
  • Memory of 256GB (8x 32GB RDIMMs, 2667MT/s)

Networking-wise, the appliance has 8x 10GbE ports using either SFP+ or Twinax. There’s a management port for initial configuration, along with an iDRAC port that’s disabled by default, but can be configured if required. If you’re using Avamar NDMP accelerator nodes in your environment, you can integrate an existing node with the DP4400. Note that it supports one accelerator node per appliance.

 

Put On Your Pointy Hat

One of the nice things about the appliance (particularly if you’ve ever had to build a data protection environment based on Data Domain and Avamar) is that you can setup everything you need to get started via a simple to use installation wizard.

[image courtesy of Dell EMC]

 

Thoughts and Further Reading

I talked to Steve about what he thought the key differentiators were for the DP4400. He talked about:

  • Ecosystem breadth;
  • Network bandwidth; and
  • Guaranteed dedupe ratio (55:1 vs 5:1?)

He also mentioned the capability of a product like Data Protection Central to manage an extremely large ROBO environment. He said these were some of the opportunities where he felt Dell EMC had an edge over the competition.

I can certainly attest to the breadth of ecosystem support being a big advantage for Dell EMC over some of its competitors. Avamar and DD/VE have also demonstrated some pretty decent chops when it comes to bandwidth-constrained environments in need of data protection. I think it’s great that Dell EMC are delivering these kinds of solutions to market. For every shop willing to go with relative newcomers like Cohesity or Rubrik, there are plenty who still want to buy data protection from Dell EMC, IBM or Commvault. Dell EMC are being fairly upfront about what they think this type of appliance will support in terms of workload, and they’ve clearly been keeping an eye on the competition with regards to usability and integration. People who’ve used Avamar in real life have been generally happy with the performance and feature set, and this is going to be a big selling point for people who aren’t fans of NetWorker.

I’m not going to tell you that one vendor is offering a better solution than the others. You shouldn’t be making strategic decisions based on technical specs and marketing brochures in any case. Some environments are going to like this solution because it fits well with their broader strategy of buying from Dell EMC. Some people will like it because it might be a change from their current approach of building their own solutions. And some people might like to buy it because they think Dell EMC’s post-sales support is great. These are all good reasons to look into the DP4400.

Preston did a write-up on the DP4400 that you can read here. The IDPA DP4400 landing page can be found here. There’s also a Wikibon CrowdChat on next generation data protection being held on August 15th (2am on the 16th in Australian time) that will be worth checking out.

Random Short Take #6

Welcome to the sixth edition of the Random Short Take. Here are a few links to a few things that I think might be useful, to someone.

Pure//Accelerate 2018 – Wrap-up and Link-o-rama

Disclaimer: I recently attended Pure//Accelerate 2018.  My flights, accommodation and conference pass were paid for by Pure Storage via the Analysts and Influencers program. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Here’s a quick post with links to the other posts I did covering Pure//Accelerate 2018, as well as links to other articles related to the event that I found interesting.

 

Gestalt IT Articles

I wrote a series of articles about Pure Storage for Gestalt IT.

Pure Storage – You’ve Come A Long Way

//X Gon Give it to Ya

Green is the New Black

The Case for Data Protection with FlashBlade

 

Event-Related

Here’re the posts I did during the show. These were mainly from the analyst sessions I attended.

Pure//Accelerate 2018 – Wednesday General Session – Rough Notes

Pure//Accelerate 2018 – Thursday General Session – Rough Notes

Pure//Accelerate 2018 – Wednesday – Chat With Charlie Giancarlo

Pure//Accelerate 2018 – (Fairly) Full Disclosure

 

Pure Storage Press Releases

Here are some of the press releases from Pure Storage covering the major product announcements and news.

The Future of Infrastructure Design: Data-Centric Architecture

Introducing the New FlashArray//X: Shared Accelerated Storage for Every Workload

Pure Storage Announces AIRI™ Mini: Complete, AI-Ready Infrastructure for Everyone

Pure Storage Delivers Pure Evergreen Storage Service (ES2) Along with Major Upgrade to Evergreen Program

Pure Storage Launches New Partner Program

 

Pure Storage Blog Posts

A New Era Of Storage With NVMe & NVMe-oF

New FlashArray//X Family: Shared Accelerated Storage For Every Workload

Building A Data-Centric Architecture To Power Digital Business

Pure’s Evergreen Delivers Right-Sized Storage, Again And Again And Again

Pure1 Expands AI Capabilities And Adds Full Stack Analytics

 

Conclusion

I had a busy but enjoyable week. I would have liked to get to more of the technical sessions, but being given access to some of the top executives and engineering talent in the company via the Analyst and Influencer Experience was invaluable. Thanks again to Pure Storage (particularly Armi Banaria and Terri McClure) for having me along to the show.

Panasas Overview

A good friend of mine is friends with someone who works at Panasas and suggested I might like to hear from them. I had the opportunity to speak to some of the team, and I thought I’d write a brief overview of what they do. Hopefully I’ll have the opportunity to cover them in the future as I think they’re doing some pretty neat stuff.

 

It’s HPC, But Not As You Know It

I don’t often like to include that slide where the vendor compares themselves to other players in the market. In this case, though, I thought Panasas’s positioning of themselves as “commercial” HPC versus the traditional HPC storage (and versus enterprise scale-out NAS) is an interesting one. We talked through this a little, and my impression is that they’re starting to deal more and more with the non-traditional HPC-like use cases, such as media and entertainment, oil and gas, genomics folks, and so forth. A number of these workloads fall outside HPC, in the sense that traditional HPC has lived almost exclusively in government and the academic sphere. The roots are clearly in HPC, but there are “enterprise” elements creeping in, such as ease of use (at scale) and improved management functionality.

[image courtesy of Panasas]

 

Technology

It’s Really Parallel

The real value in Panasas’s offering is the parallel access to the storage. The more nodes you add, the more performance improves. In a serial system, a client can access data via one node in the cluster, regardless of the number of nodes available. In a parallel system, such as this one, a client accesses data that is spread across multiple nodes.

 

What About The Hardware?

The current offering from Panasas is called ActiveStor. The platform is comprised of PanFS running on Director Blades and Storage Blades. Here’s a picture of the Director Blades (ASD-100) and the Storage Blades (ASH-100). The Director has been transitioned to a 2U4N form factor (it used to sit in the blade chassis).

[image courtesy of Panasas]

 

Director Nodes are the Control Plane of PanFS, and handle:

  • Metadata processing: directories, file names, access control checks, timestamps, etc.
  • Uses a transaction log to ensure atomicity and durability of structural changes
  • Coordination of client system actions to ensure single-system view and data-cache-coherence
  • “Realm” membership (Panasas’s name for the storage cluster), realm self-repair, etc.
  • Realm maintenance: file reconstruction, automatic capacity balancing, scrubbing, etc.

Storage Nodes are the Data Plane of PanFS, and deal with:

  • Storage of bulk user data for the realm, accessed in parallel by client systems
  • Also stores, but does not operate on, all the metadata of the system for the Director Nodes
  • API based upon the T10 SCSI “Object-Based Storage Device” that Panasas helped define

Storage nodes offer a variety of HDD (4TB, 6TB, 8TB, 10TB, or 12TB) and SSD capacities (480GB, 960GB, 1.9TB) depending on the type of workload you’re dealing with. The SSD is used for metadata and files smaller than 60KB. Everything else is stored on the larger drives.

 

DirectFlow Protocol

DirectFlow is a big part of what differentiates Panasas from your average scale-out NAS offering. It does some stuff that’s pretty cool, including:

  • Support for parallel delivery of data to / from Storage Nodes
  • Support for fully POSIX-compliant semantics, unlike NFS and SMB
  • Support for strong data cache-coherency across client systems

It’s a proprietary protocol between clients and ActiveStor components, and there’s an installable kernel module for each client system (Linux and macOS). They tell me that pNFS is based upon DirectFlow, and they had a hand in defining pNFS.

 

Resilience

Scale-out NAS is exciting, but us enterprise types want to know about resilience. It’s all fun and games until someone fat fingers a file, or a disk dies. Well, Panasas, as it happens, have a little heritage when it comes to disk resilience. They use N + 2 RAID 6 (10 wide + P & Q). You could have more disks working for you, but this number seems to work best for Panasas customers. In terms of realms, there are 3, 5 or 7 “rep sets” per realm. There’s a “realm president”, and every Director has a backup Director. There’s also:

  • Per-file erasure coding of striped files allows the whole cluster to help rebuild a file after a failure;
  • Only need to rebuild data protection on specific files instead of entire drive(s); and
  • The percentage of files in the cluster affected by any given failure approaches zero at scale.

 

Thoughts and Further Reading

I’m the first to admit that my storage experience to date has been firmly rooted in the enterprise space. But, much like my fascination with infrastructure associated with media and entertainment, I fancy myself as an HPC-friendly storage guy. This is for no other reason than I think HPC workloads are pretty cool, and they tend to scale beyond what I normally see in the enterprise space (keeping in mind that I work in a smallish market). You say genomics to someone, or AI, and they’re enthusiastic about the outcomes. You say SQL 2012 to someone and they’re not as interested.

Panasas are positioning themselves as being suitable, primarily, for commercial HPC storage requirements. They have a strong heritage with traditional HPC workloads, and they seem to have a few customers using their systems for more traditional, enterprise-like NAS deployments as well. This convergence of commercial HPC, traditional and enterprise NAS requirements has presented some interesting challenges, but it seems like Panasas have addressed those in the latest iteration of their hardware. Dealing with stonking great big amounts of data at scale is a challenge for plenty of storage vendors, but Panasas have demonstrated an ability to adapt to the evolving demands of their core market. I’m looking forward to seeing the next incarnation of their platform, and how they incorporate technologies such as InfiniBand into their offering.

There’s a good white paper available on the Panasas architecture that you can access here (registration required). El Reg also has some decent coverage of the current hardware offering here.

SwiftStack Announces 1space

SwiftStack recently announced 1space, and I was lucky enough to snaffle some time with Joe Arnold to talk more about what it all means. I thought it would be useful to do a brief post, as I really do like SwiftStack, and I feel like I don’t spend enough time talking about them.

 

The Announcement

So what exactly is 1space? It’s basically SwiftStack delivering access to their storage across both on-premises and public cloud. But what does that mean? Well, you get some cool features as a result, including:

  • Integrated multi-cloud access
  • Scale-out & high-throughput data movement
  • Highly reliable & available policy execution
  • Policies for lifecycle, data protection & migration
  • Optional, scale-out containers with AWS S3 support
  • Native access in public cloud (direct to S3, GCS, etc.)
  • Data created in public cloud accessible on-premises
  • Native format enabling cloud-native services

[image courtesy of SwiftStack]

According to Arnold, one of the really cool things about this is that it “provides universal access over both file protocols and object APIs to a single storage namespace, it is increasingly used for distributed workflows across multiple geographic regions and multiple clouds”.

 

Metadata Search

But wait …

One of the really nice things that SwiftStack has done is add integrated metadata search via a desktop client for Windows, macOS, and Linux. It’s called MetaSync.

 

Thoughts

This has been a somewhat brief post, but something I did want to focus on was the fact that this product has been open-sourced. SwiftStack have been pretty keen on open source as a concept, and I think that comes through when you have a look at some of their contributions to the community. These contributions shouldn’t be underestimated, and I think it’s important that we call out when vendors are contributing to the open source community. Let’s face it, a whole lot of startups are taking advantage of code generated by the open source community, and a number of them have the good sense to know that it’s most certainly a two-way street, and they can’t relentlessly pillage the community without it eventually falling apart.

But this announcement isn’t just me celebrating the contributions of neckbeards from within the vendor community and elsewhere. SwiftStack have delivered something that is really quite cool. In much the same way that storage types won’t shut up about NVMe over Fabrics, cloud folks are really quite enthusiastic about the concept of multi-cloud connectivity. There are a bunch of different use cases where it makes sense to leverage a universal namespace for your applications. If you’d like to see SwiftStack in action, check out this YouTube channel (there’s a good video about 1space here) and if you’d like to take SwiftStack for a spin, you can do that here.

Pavilion Data Systems Overview

I recently had the opportunity to hear about Pavilion Data Systems from VR Satish, CTO, and Jeff Sosa, VP of Products. I thought I’d put together a brief overview of their offering, as NVMe-based systems are currently the new hotness in the storage world.

 

It’s a Box!

And a pretty cool looking one at that. Here’s what it looks like from the front.

[image courtesy of Pavilion Data]

The storage platform is built from standard components, including x86 processors and U.2 NVMe SSDs. A big selling point, in Pavilion’s opinion, is that there are no custom ASICs and no FPGAs in the box. There are three different models available (the datasheet is here), with different connectivity and capacity options.

From a capacity perspective, you can start at 14TB and get all the way to 1PB in 4RU. The box starts at 18 NVMe drives and, growing in increments of 18, goes up to 72 drives. It runs RAID 6 and presents the drives as virtual volumes to the hosts. Here’s a look at the box from a top-down perspective.

[image courtesy of Pavilion Data]

There’s a list of supported NVMe SSDs that you can use with the box, if you wanted to source those elsewhere. On the right-hand side (the back of the box) are the IO controllers. You can start at 4 and go up to 20 in a box. There are also 2 management modules and 4 power supplies for resiliency.

[image courtesy of Pavilion Data]

You can see in the above diagram that connectivity is also a big part of the story, with each pair of controllers offering 4x 100GbE ports.

 

Software? 

Sure. It’s a box but it needs something to run it. Each controller runs a customised flavour of Linux and delivers a number of the features you’d expect from a storage array, including:

  • Active-active controller support
  • Space-efficient snapshots and clones
  • Thin provisioning.

There’re also plans afoot for encryption support in the near future. Pavilion have also focused on making operations simple, providing support for RESTful API orchestration, OpenStack Cinder, Kubernetes, DMTF RedFish and SNIA Swordfish. They’ve also gone to some lengths to ensure that standard NVMe/F drivers will work for host connectivity.
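
As an aside, host connectivity with the standard tooling generally looks something like the following on a Linux client. This is generic nvme-cli usage rather than anything Pavilion-specific, and the transport, addresses and NQN are placeholders.

nvme discover -t rdma -a 192.168.1.100 -s 4420
nvme connect -t rdma -n nqn.2014-08.com.example:array01 -a 192.168.1.100 -s 4420
nvme list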

 

Thoughts and Further Reading

Pavilion Data has been around since 2014 and the leadership group has some great heritage in the storage and networking industry. They tell me they wanted to move away from the traditional approach to storage arrays (the dual controller, server-based platform) to something that delivered great performance at scale. There are similarities more with high performance networking devices than high performance storage arrays, and this is by design. They tell me they really wanted to deliver a solution that wasn’t the bottleneck when it came to realising the performance capabilities of the NVMe architecture. The numbers being punted around are certainly impressive. And I’m a big fan of the approach, in terms of both throughput and footprint.

The webscale folks running apps like MySQL and Cassandra and MongoDB (and other products with similarly awful names) are doing a few things differently to the enterprise bods. Firstly, they’re more likely to wear jeans and sneakers to the office (something that drives me nuts) and they’re leveraging DAS heavily because it gives them high performance storage options for latency-sensitive situations. The advent of NVMe and NVMe over Fabrics takes away the requirement for DAS (although I’m not sure they’ll start to wear proper office attire any time soon) by delivering storage at the scale and performance they need. As a result of this, you can buy 1RU servers with compute instead of 2RU servers full of fast disk. There’s an added benefit as organisations tend to assign longer lifecycles to their storage systems, so systems like the one from Pavilion are going to have a place in the DC for five years, not 2.5 – 3 years. Suddenly lifecycling your hosts becomes simpler as well. This is good news for the jeans and t-shirt set and the beancounters alike.

NVMe (and NVMe over Fabrics) has been a hot topic for a little while now, and you’re only going to hear more about it. Those bright minds at Gartner are calling it “Shared Accelerated Storage” and you know if they’re talking about it then the enterprise folks will cotton on in a few years and suddenly it will be everywhere. In the meantime, check out Chris M. Evans’ article on NVMe over Fabrics and Chris Mellor also did an interesting piece at El Reg. The market is becoming more crowded each month and I’m interested to see how Pavilion fare.

Pure//Accelerate 2018 – (Fairly) Full Disclosure

Disclaimer: I recently attended Pure//Accelerate 2018.  My flights, accommodation and conference pass were paid for by Pure Storage via the Analysts and Influencers program. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Here are my notes on gifts, etc, that I received as a conference attendee at Pure//Accelerate 2018. This is by no stretch an interesting post from a technical perspective, but it’s a way for me to track and publicly disclose what I get and how it looks when I write about various things. I’m going to do this in chronological order, as that was the easiest way for me to take notes during the week. While everyone’s situation is different, I took 5 days of unpaid leave to attend this conference.

 

Saturday

My wife dropped me at the BNE domestic airport and I had some ham and cheese and a few coffees in the Qantas Club. I flew Qantas economy class to SFO via SYD. The flights were paid for by Pure Storage. Plane food was consumed on the flight. It was a generally good experience, and I got myself caught up with Season 3 of Mr. Robot. Pure paid for a car to pick me up at the airport. My driver was the new head coach of the San Francisco City Cats ABA team, so we talked basketball most of the trip. I stayed at a friend’s place until late Monday and then checked in to the Marriott Marquis in downtown San Francisco. The hotel costs were also covered by Pure.

 

Tuesday

When I picked up my conference badge I was given a Pure Storage and Rubrik co-branded backpack. On Tuesday afternoon we kicked off the Analyst and Influencer Experience with a welcome reception at the California Academy of Sciences. I helped myself to a Calicraft Coast Kolsch and 4 Aliciella Bitters. I also availed myself of the charcuterie selection, cheese balls and some fried shrimp. The most enjoyable part of these events is catching up with good folks I haven’t seen in a while, like Vaughn and Craig.

As we left we were each given a shot glass from the Academy of Sciences that was shaped like a small beaker. Pure also had a small box of Sweet 55 chocolate delivered to our hotel rooms. That’s some seriously good stuff. Sorry it didn’t make it home kids.

After the reception I went to dinner with Alastair Cooke, Chris Evans and Matt Leib at M.Y. China in downtown SF. I had the sweet and sour pork and rice and 2 Tsingtao beers. The food was okay. We split the bill 4 ways.

 

Wednesday

We were shuttled to the event venue early in the morning. I had a sausage and egg breakfast biscuit, fruit and coffee in the Analysts and Influencers area for breakfast. I need to remind myself that “biscuits” in their American form are just not really my thing. We were all given an Ember temperature control ceramic mug. I also grabbed 2 Pure-flavoured notepads and pens and a Pure Code t-shirt. Lunch in the A&I room consisted of chicken roulade, salmon, bread roll, pasta and Perrier sparkling spring water. I also grabbed a coffee in between sessions.

Christopher went down to the Solutions Expo and came back with a Quantum sticker (I am protecting data from the dark side) and Veeam 1800mAh keychain USB charger for me. I also grabbed some stickers from Justin Warren and some coffee during another break. No matter how hard I tried I couldn’t trick myself into believing the coffee was good.

There was an A&I function at International Smoke and I helped myself to cheese, charcuterie, shrimp cocktail, ribs, various other finger foods and 3 gin and tonics. I then skipped the conference entertainment (The Goo Goo Dolls) to go with Stephen Foskett and see Terra Lightfoot and The Posies play at The Independent. The car to and from the venue and the tickets were very kindly covered by Stephen. I had two 805 beers while I was there. It was a great gig. 5 stars.

 

Thursday

For breakfast I had fruit, a chocolate croissant and some coffee. Scott Lowe kindly gave me a printed copy of ActualTech’s latest Gorilla Guide to Converged Infrastructure. I also did a whip around the Solutions Expo and grabbed:

  • A Commvault glasses cleaner;
  • 2 plastic Zerto water bottles;
  • A pair of Rubrik socks;
  • A Cisco smart wallet and pen;
  • Veeam webcam cover, retractable charging cable and $5 Starbucks card; and
  • A Catalogic pen.

Lunch was boxed. I had the Carne Asada, consisting of Mexican style rice, flat iron steak, black beans, avocado, crispy tortilla and cilantro. We were all given 1GB USB drives with copies of the presentations from the A&I Experience on them as well. That was the end of the conference.

I had dinner at ThirstBear Brewing Co with Alastair, Matt Leib and Justin. I had the Thirstyburger, consisting of Richards Ranch grass-fed beef, mahón cheese, chorizo-andalouse sauce, arugula, housemade pickles, panorama bun, and hand-cut fried kennebec patatas. This was washed down with two glasses of The Admiral’s Blend.

 

Friday

As we didn’t fly out until Friday evening, Alastair and I spent some time visiting the Museum of Modern Art. vBrownBag covered my entry to the museum, and the Magritte exhibition was terrific. We then lunched in Chinatown at a place (Maggie’s Cafe) that reminded me a lot of the Chinese places in Brisbane. Before I went to the airport I had a few beers in the hotel bar. This was kindly paid for by Justin Warren. On Friday evening Pure paid for a car to take Justin and me to SFO for our flight back to Australia. Justin gets extra thanks for having me as his plus one in the fancier lounges that I normally don’t have access to.

Big thanks to Pure Storage for having me over for the week, and big thanks to everyone who spent time with me at the event (and after hours) – it’s a big part of why I keep coming back to these types of events.