Pure//Accelerate 2019 – Austin, USA

I will be attending Pure//Accelerate 2019 in Austin, TX as an influencer. The event runs from September 15th – 18th.

Pure Storage – Configuring ObjectEngine Bucket Security

This is a quick post as a reminder for me next time I need to do something with basic S3 bucket security. A little while ago I was testing Pure Storage’s ObjectEngine (OE) device with a number of data protection products. I’ve done a few articles previously on what it looked like from the Cohesity and Commvault perspectives, but thought it would be worthwhile to document what I did on the OE side of things.

The first step is to create the bucket in the OE dashboard.

You’ll need to call it something, and there are rules around the naming convention and length of the name.

In this example, I’m creating a bucket for Commvault to use, so I’ve called this one “commvault-test”.

Once the bucket has been created, you should add a security policy to the bucket.

Click on “Add” and you’ll be prompted to get started with the Bucket Policy Editor.

I’m pretty hopeless with this stuff, but fortunately there’s a policy generator on the AWS site you can use.

Once you’ve generated your policy, click on Save and you’ll be good to go. Keep in mind that any user you reference in the policy will need to exist in OE for the policy to work.

Here’s the policy I applied to this particular bucket. The user is commvault, and the bucket name is commvault-test.

{
  "Id": "Policy1563859773493",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1563859751962",
      "Action": "s3:*",
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::commvault-test",
      "Principal": {
        "AWS": [
          "arn:aws:iam::0:user/commvault"
        ]
      }
    },
    {
      "Sid": "Stmt1563859771357",
      "Action": "s3:*",
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::commvault-test/*",
      "Principal": {
        "AWS": [
          "arn:aws:iam::0:user/commvault"
        ]
      }
    }
  ]
}

You can read more about the policy elements here.
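If you’d rather script the policy than click through the AWS policy generator, the following sketch builds the same two-statement document as the example above. The function name and the Id/Sid values are mine (they’re arbitrary identifiers, not anything OE requires); the bucket and user ARN match the example.

```python
import json

def make_bucket_policy(bucket, user_arn):
    """Build an S3 bucket policy granting a single user full access
    to both the bucket itself and the objects within it."""
    def statement(sid, resource):
        return {
            "Sid": sid,
            "Action": "s3:*",
            "Effect": "Allow",
            "Resource": resource,
            "Principal": {"AWS": [user_arn]},
        }
    return {
        "Id": "Policy-commvault-test",  # arbitrary identifier
        "Version": "2012-10-17",
        "Statement": [
            # One statement for bucket-level actions, one for the objects in it.
            statement("BucketAccess", f"arn:aws:s3:::{bucket}"),
            statement("ObjectAccess", f"arn:aws:s3:::{bucket}/*"),
        ],
    }

policy = make_bucket_policy("commvault-test", "arn:aws:iam::0:user/commvault")
print(json.dumps(policy, indent=2))
```

The output can then be pasted straight into the Bucket Policy Editor. As noted above, the user referenced in the ARN still needs to exist in OE for the policy to work.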

Cohesity Basics – Configuring An External Target For Cloud Archive

I’ve been working in the lab with Pure Storage’s ObjectEngine and thought it might be nice to document the process to set it up as an external target for use with Cohesity’s Cloud Archive capability. I’ve written in the past about Cloud Tier and Cloud Archive, but in that article I focused more on the Cloud Tier capability. I don’t want to sound too pretentious, but I’ll quote myself from the other article: “With Cloud Archive you can send copies of snapshots up to the cloud to keep as a copy separate to the backup data you might have replicated to a secondary appliance. This is useful if you have some requirement to keep a monthly or six-monthly copy somewhere for compliance reasons.”

I would like to be clear that this process hasn’t been blessed or vetted by Pure Storage or Cohesity. I imagine they are working on delivering a validated solution at some stage, as they have with Veeam and Commvault. So don’t go out and jam this in production and complain to me when Pure or Cohesity tell you it’s wrong.

There are a couple of ways you can configure an external target via the Cohesity UI. In this example, I’ll do it from the dashboard, rather than during the protection job configuration. Click on Protection and select External Target.

You’ll then be presented with the New Target configuration dialogue.

In this example, I’m calling my external target PureOE, and setting its purpose as Archival (as opposed to Tiering).

The Type of target is “S3 Compatible”.

Once you select that, you’ll be asked for a bunch of S3-type information, including Bucket Name and Access Key ID. This assumes you’ve already created the bucket and configured appropriate security on the ObjectEngine side of things.

Enter the required information. I’ve de-selected compression and source-side deduplication, as I want the data reduction to be done by the ObjectEngine. I’ve also disabled encryption, as I’m guessing that would have an impact on the ObjectEngine as well. I need to confirm that with my friends at Pure. I’m using the fully qualified domain name of the ObjectEngine as the endpoint here as well.

Once you click on Register, you’ll be presented with a summary of the configuration.

You’re then right to use this as an external target for Archival parts of protection jobs within your Cohesity environment. Once you’ve run a few protection jobs, you should start to see files within the test bucket on the ObjectEngine. Don’t forget that, as far as I’m aware, it’s still very difficult (impossible?) to remove external targets from the Cohesity Data Platform, so don’t get too carried away with configuring a bunch of different test targets thinking that you can remove them later.

Pure Storage – ObjectEngine and Commvault Integration

I’ve been working with Pure Storage’s ObjectEngine in our lab recently, and wanted to share a few screenshots from the Commvault configuration bit, as it had me stumped for a little while. This is a quick one, but hopefully it will help those of you out there who are trying to get it working. I’m assuming you’ve already created your bucket and user in the ObjectEngine environment, and you have the details of your OE environment at hand.

The first step is to add a Cloud Storage Library to your Libraries configuration.

You’ll need to provide a name, and select the type as Amazon S3. You’ll see in this example that I’m using the fully qualified domain name as the Service Host.

At this point you should be able to click on Detect to detect the bucket you’ll use to store data in. For some reason though, I kept getting an error when I did this.

The trick is to put http:// in front of the FQDN. Note that this doesn’t work with https://.
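If you’re scripting this configuration, the same gotcha applies: the Service Host value needs an explicit http:// scheme. A tiny sketch of that normalisation (the function name is mine, not part of any Commvault API):

```python
def normalize_service_host(host):
    """Ensure the Service Host has an explicit http:// scheme.
    Bucket detection failed for me without it, and https://
    didn't work against the OE endpoint either."""
    if host.startswith("http://"):
        return host
    if host.startswith("https://"):
        # Swap the scheme rather than stacking http:// in front of it.
        return "http://" + host[len("https://"):]
    return "http://" + host
```

So `normalize_service_host("oe.example.com")` (a placeholder FQDN) returns `"http://oe.example.com"`, which is the form that worked for me.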

Now when you click on Detect, you’ll see the Bucket that you’ve configured on the OE environment (assuming you haven’t fat-fingered your credentials).

And that’s it. You can then go on and configure your storage polices and SubClient policies as required.

Random Short Take #15

Here are a few links to some random news items and other content that I recently found interesting. You might find them interesting too. Episode 15 – it could become a regular thing. Maybe every other week? Fortnightly even.

Using A Pure Storage FlashBlade As A Veeam Repository

I’ve been doing some testing in the lab recently. The focus of this testing has been primarily on Pure Storage’s ObjectEngine and its associated infrastructure. As part of that, I’ve been doing various things with Veeam Backup & Replication 9.5 Update 4, including setting up a FlashBlade NFS repository. I’ve documented the process in a document here. One thing that I thought worthy of noting separately was the firewall requirements. For my Linux Mount Server, I used a CentOS 7 VM, configured with 8 vCPUs and 16GB of RAM. I know, I normally use Debian, but for some reason (that I didn’t have time to investigate) it kept dying every time I kicked off a backup job.

In any case, I set everything up as per Pure’s instructions, but kept getting timeout errors on the job. The error I got was “5/17/2019 10:03:47 AM :: Processing HOST-01 Error: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond NFSMOUNTHOST:2500”. It felt like it was probably a firewall issue of some sort. I tried to make an exception on the Windows VM hosting the Veeam Backup server, but that didn’t help. The problem was with the Linux VM’s firewall. I used the instructions I found here to add in some custom rules. According to the Veeam documentation, Backup Repository access uses TCP ports 2500 – 5000. Your SecOps people will no doubt have a conniption, but here’s how to open those ports on CentOS.
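Before touching the firewall, it’s worth confirming whether the mount server actually answers on the repository port range at all. A quick sketch using Python’s socket module (the host name and port are placeholders matching my lab):

```python
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within
    the timeout. A connection that times out or is refused (as in the
    Veeam error above) returns False."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Veeam Backup Repository access uses TCP 2500 - 5000; spot-check the first port.
# print(port_open("nfsmounthost", 2500))
```

If this comes back False while the service is definitely running, a host-based firewall is the prime suspect, which is exactly what the steps below confirmed.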

Firstly, is the firewall running?

[danf@nfsmounthost ~]$ sudo firewall-cmd --state
[sudo] password for danf:
running

Yes it is. So let’s stop it to see if this line of troubleshooting is worth pursuing.

[danf@nfsmounthost ~]$ sudo systemctl stop firewalld

The backup job worked after that. Okay, so let’s start it up again and open up some ports to test.

[danf@nfsmounthost ~]$ sudo systemctl start firewalld
[danf@nfsmounthost ~]$ sudo firewall-cmd --add-port=2500-5000/tcp
success

That worked, so I wanted to make it a more permanent arrangement.

[danf@nfsmounthost ~]$ sudo firewall-cmd --permanent --add-port=2500-5000/tcp
success
[danf@nfsmounthost ~]$ sudo firewall-cmd --permanent --list-ports
2500-5000/tcp

Remember, it’s never the storage. It’s always the firewall. Also, keep in mind that this article is about the how. I’m not offering my opinion about whether it’s really a good idea to configure your host-based firewalls with more holes than Swiss cheese. Or whatever things have lots of holes in them.

Random Short Take #13

Here are a few links to some random news items and other content that I found interesting. You might find them interesting too. Let’s dive in to lucky number 13.

Pure Storage Goes All In On Hybrid … Cloud

I recently had the opportunity to hear from Chadd Kenney about Pure Storage’s Cloud Data Services announcement and thought it worthwhile covering here. But before I get into that, Pure have done a little re-branding recently. You’ll now hear them referring to Cloud Data Infrastructure (their on-premises instances of FlashArray, FlashBlade, FlashStack) and Cloud Data Management (being their Pure1 instances).

 

The Announcement

So what is “Cloud Data Services”? It’s comprised of:

According to Kenney, “[t]he right strategy is and not or, but the enterprise is not very cloudy, and the cloud is not very enterprise-y”. If you’ve spent time in any IT organisation, you’ll see that there is, indeed, a “Cloud divide” in play. What we’ve seen in the last 5 – 10 years is a marked difference in application architectures, consumption and management, and even storage offerings.

[image courtesy of Pure Storage]

 

Cloud Block Store

The first part of the puzzle is probably the most interesting for those of us struggling to move traditional application stacks to a public cloud solution.

[image courtesy of Pure Storage]

According to Pure, Cloud Block Store offers:

  • High reliability, efficiency, and performance;
  • Hybrid mobility and protection; and
  • Seamless APIs on-premises and cloud.

Kenney likens building a Purity solution on AWS to the approach Pure took in the early days of their existence, when they took off the shelf components and used optimised software to make them enterprise-ready. Now they’re doing the same thing with AWS, and addressing a number of the shortcomings of the underlying infrastructure through the application of the Purity architecture.

Features

So why would you want to run virtual Pure controllers on AWS? The idea is that Cloud Block Store:

  • Aggregates performance and reliability across many cloud stores;
  • Can be deployed HA across two availability zones (using active cluster);
  • Is always thin, deduplicated, and compressed;
  • Delivers instant space-saving snapshots; and
  • Is always encrypted.

Management and Orchestration

If you have previous experience with Purity, you’ll appreciate the management and orchestration experience remains the same.

  • Same management, with Pure1 managing on-premises instances and instances in the cloud
  • Consistent APIs on-premises and in cloud
  • Plugins to AWS and VMware automation
  • Open, full-stack orchestration

Use Cases

Pure say that you can use this kind of solution in a number of different scenarios, including DR, backup, and migration in and between clouds. If you want to use ActiveCluster between AWS regions, you might have some trouble with latency, but in those cases other replication options are available.

[image courtesy of Pure Storage]

Note that Cloud Block Store is available in a few different deployment configurations:

  • Test/Dev – using a single controller instance (EBS can’t be attached to more than one EC2 instance)
  • Production – ActiveCluster (2 controllers, either within or across availability zones)

 

CloudSnap

Pure tell us that we’ve moved away from “disk to disk to tape” as a data protection philosophy and we now should be looking at “Flash to Flash to Cloud”. CloudSnap allows FlashArray snapshots to be easily sent to Amazon S3. Note that you don’t necessarily need FlashBlade in your environment to make this work.

[image courtesy of Pure Storage]

For the moment, this is only certified on AWS.

 

StorReduce for AWS

Pure acquired StorReduce a few months ago and now they’re doing something with it. If you’re not familiar with them, “StorReduce is an object storage deduplication engine, designed to enable simple backup, rapid recovery, cost-effective retention, and powerful data re-use in the Amazon cloud”. You can leverage any array, or existing backup software – it doesn’t need to be a Pure FlashArray.

Features

According to Pure, you get a lot of benefits with StorReduce, including:

  • Object fabric – secure, enterprise ready, highly durable cloud object storage;
  • Efficient – Reduces storage and bandwidth costs by up to 97%, enabling cloud storage to cost-effectively replace disk & tape;
  • Fast – Fastest Deduplication engine on the market. 10s of GiB/s or more sustained 24/7;
  • Cloud Native – Native S3 interface enabling openness, integration, and data portability. All Data & Metadata stored in object store;
  • Single namespace – Stores in a single data hub across your data centre to enable fast local performance and global data protection; and
  • Scalability – Software nodes scale linearly to deliver 100s of PBs and 10s of GBs bandwidth.

 

Thoughts and Further Reading

The title of this post was a little misleading, as Pure have been doing various cloud things for some time. But sometimes I give in to my baser instincts and like to try and be creative. It’s fine. In my mind the Cloud Block Store for AWS piece of the Cloud Data Services announcement is possibly the most interesting one. It seems like a lot of companies are announcing these kinds of virtualised versions of their hardware-based appliances that can run on public cloud infrastructure. Some of them are just encapsulated instances of the original code, modified to deal with a VM-like environment, whilst others take better advantage of the public cloud architecture.

So why are so many of the “traditional” vendors producing these kinds of solutions? Well, the folks at AWS are pretty smart, but it’s a generally well understood fact that the enterprise moves at enterprise pace. To that end, they may not be terribly well positioned to spend a lot of time and effort to refactor their applications to a more cloud-friendly architecture. But that doesn’t mean that the CxOs haven’t already been convinced that they don’t need their own infrastructure anymore. So the operations folks are being pushed to migrate out of their DCs and into public cloud provider infrastructure. The problem is that, if you’ve spent a few minutes looking at what the likes of AWS and GCP offer, you’ll see that they’re not really doing things in the same way that their on-premises comrades are. AWS expects you to replicate your data at an application level, for example, because those EC2 instances will sometimes just up and disappear.

So how do you get around the problem of forcing workloads into public cloud without a lot of the safeguards associated with on-premises deployments? You leverage something like Pure’s Cloud Block Store. It overcomes a lot of the issues associated with just running EC2 on EBS, and has the additional benefit of giving your operations folks a consistent management and orchestration experience. Additionally, you can still do things like run ActiveCluster between and within Availability Zones, so your mission critical internal kitchen roster application can stay up and running when an EC2 instance goes bye bye. You’ll pay a bit more or less than you would with normal EBS, but you’ll get some other features too.

I’ve argued before that if enterprises are really serious about getting into public cloud, they should be looking to work towards refactoring their applications. But I also understand that the reality of enterprise application development means that this type of approach is not always possible. After all, enterprises are (generally) in the business of making money. If you come to them and can’t show exactly how they’d save money by moving to public cloud (and let’s face it, it’s not always an easy argument), then you’ll find it even harder to convince them to undertake significant software engineering efforts simply because the public cloud folks like to do things a certain way. I’m rambling a bit, but my point is that these types of solutions solve a problem that we all wish didn’t exist, but does.

Justin did a great write-up here that I recommend reading. Note that both Cloud Block Store and StorReduce are in Beta with planned general availability in 2019.

Getting Started With The Pure Storage CLI

I used to write a lot about how to manage CLARiiON and VNX storage environments with EMC’s naviseccli tool. I’ve been doing some stuff with Pure Storage FlashArrays in our lab and thought it might be worth covering off some of the basics of their CLI. This will obviously be no replacement for the official administration guide, but I thought it might come in useful as a starting point.

 

Basics

Unlike EMC’s CLI, there’s no executable to install – it’s all on the controllers. If you’re using Windows, PuTTY is still a good choice as an ssh client. Otherwise the macOS ssh client does a reasonable job too. When you first set up your FlashArray, a virtual IP (VIP) is configured. It’s easiest to connect to the VIP, and Purity then directs your session to whichever controller is the current primary controller. Note that you can also connect via the physical IP address if that’s how you want to do things.

The first step is to login to the array as pureuser, with the password that you’ve definitely changed from the default one.

login as: pureuser
pureuser@10.xxx.xxx.30's password:
Last login: Fri Aug 10 09:36:05 2018 from 10.xxx.xxx.xxx

Mon Aug 13 10:01:52 2018
Welcome pureuser. This is Purity Version 4.10.4 on FlashArray purearray
http://www.purestorage.com/

“purehelp” is the command to run to list available commands.

pureuser@purearray> purehelp
Available commands:
-------------------
pureadmin
purealert
pureapp
purearray
purecert
pureconfig
puredns
puredrive
pureds
purehelp
purehgroup
purehost
purehw
purelog
pureman
puremessage
purenetwork
purepgroup
pureplugin
pureport
puresmis
puresnmp
puresubnet
puresw
purevol
exit
logout

If you want to get some additional help with a command, you can run “command -h” (or “command --help”).

pureuser@purearray> purevol -h
usage: purevol [-h]
               {add,connect,copy,create,destroy,disconnect,eradicate,list,listobj,monitor,recover,remove,rename,setattr,snap,truncate}
               ...

positional arguments:
  {add,connect,copy,create,destroy,disconnect,eradicate,list,listobj,monitor,recover,remove,rename,setattr,snap,truncate}
    add                 add volumes to protection groups
    connect             connect one or more volumes to a host
    copy                copy a volume or snapshot to one or more volumes
    create              create one or more volumes
    destroy             destroy one or more volumes or snapshots
    disconnect          disconnect one or more volumes from a host
    eradicate           eradicate one or more volumes or snapshots
    list                display information about volumes or snapshots
    listobj             list objects associated with one or more volumes
    monitor             display I/O performance information
    recover             recover one or more destroyed volumes or snapshots
    remove              remove volumes from protection groups
    rename              rename a volume or snapshot
    setattr             set volume attributes (increase size)
    snap                take snapshots of one or more volumes
    truncate            truncate one or more volumes (reduce size)

optional arguments:
  -h, --help            show this help message and exit

There’s also a facility to access the man page for commands. Just run “pureman command” to access it.

Want to see how much capacity there is on the array? Run “purearray list --space”.

pureuser@purearray> purearray list --space
Name        Capacity  Parity  Thin Provisioning  Data Reduction  Total Reduction  Volumes  Snapshots  Shared Space  System  Total
purearray  12.45T    100%    86%                2.4 to 1        17.3 to 1        350.66M  3.42G      3.01T         0.00    3.01T
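If you want to consume that output in a script, the layout is whitespace-separated, but the “x to 1” reduction ratios span three tokens each, so a naive split needs a little help. A rough sketch (the field names are mine, derived from the column headers above):

```python
import re

def parse_space_line(line):
    """Parse one data row of `purearray list --space` output.
    Ratios like '2.4 to 1' are three whitespace-separated tokens,
    so glue them back together before pairing with column names."""
    line = re.sub(r"(\S+) to (\S+)", r"\1-to-\2", line)
    fields = ["name", "capacity", "parity", "thin_provisioning",
              "data_reduction", "total_reduction", "volumes",
              "snapshots", "shared_space", "system", "total"]
    return dict(zip(fields, line.split()))

row = parse_space_line(
    "purearray  12.45T    100%    86%                2.4 to 1        "
    "17.3 to 1        350.66M  3.42G      3.01T         0.00    3.01T")
# row["data_reduction"] == "2.4-to-1", row["total"] == "3.01T"
```

This sort of thing is handy for quick capacity reporting, though for anything serious you’d want to look at Pure’s REST API rather than screen-scraping the CLI.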

Need to check the software version or availability of the controllers? Run “purearray list --controller”.

pureuser@purearray> purearray list --controller
Name  Mode       Model   Version  Status
CT0   secondary  FA-450  4.10.4   ready
CT1   primary    FA-450  4.10.4   ready

 

Connecting A Host

To connect a host to an array (assuming you’ve already zoned it to the array), you’d use the following commands.

purehost create hostname
purehost create --wwnlist WWNs hostname
purehost list
purevol connect --host [host] [volume]

 

Host Groups

You might need to create a Host Group if you’re running ESXi and want to have multiple hosts accessing the same volumes. Here’re the commands you’ll need. Firstly, create the Host Group.

purehgroup create [hostgroup]

Add the hosts to the Host Group (these hosts should already exist on the array)

purehgroup setattr --hostlist host1,host2,host3 [hostgroup]

You can then assign volumes to the Host Group

purehgroup connect --vol [volume] [hostgroup]

 

Other Volume Operations

Some other neat (and sometimes destructive) things you can do with volumes are listed below.

To resize a volume, use the following commands.

purevol setattr --size 500G [volume]
purevol truncate --size 20GB [volume]

Note that a snapshot is available for 24 hours to roll back if required. This is good if you’ve shrunk a volume to be smaller than the data on it and have consequently munted the filesystem.

When you destroy a volume it immediately becomes unavailable to hosts, but remains on the array for 24 hours. Note that you’ll need to disconnect the volume from any hosts connected to it first.

purevol disconnect [volume] --host [hostname]
purevol destroy [volume]

If you’re running short of capacity, or are just curious about when a deleted volume will disappear, use the following command.

purevol list --pending

If you need the capacity back immediately, the deleted volume can be eradicated with the following command.

purevol eradicate [volume]

 

Further Reading

The Pure CLI is obviously not a new thing, and plenty of bright folks have already done a few articles about how you can use it as part of a provisioning workflow. This one from Chadd Kenney is a little old now but still demonstrates how you can bring it all together to do something pretty useful. You can obviously extend that to do some pretty interesting stuff, and there’s solid parity between the GUI and CLI in the Purity environment.

It seems like a small thing, but the fact that there’s no need to install an executable is a big thing in my book. Array vendors (and infrastructure vendors in general) insisting on installing some shell extension or command environment is a pain in the arse, and should be seen as an act of hostility akin to requiring Java to complete simple administration tasks. The sooner we get everyone working with either HTML5 or simple ssh access the better. In any case, I hope this was a useful introduction to the Purity CLI. Check out the Administration Guide for more information.

Pure//Accelerate 2018 – Wrap-up and Link-o-rama

Disclaimer: I recently attended Pure//Accelerate 2018.  My flights, accommodation and conference pass were paid for by Pure Storage via the Analysts and Influencers program. There is no requirement for me to blog about any of the content presented and I am not compensated in any way for my time at the event.  Some materials presented were discussed under NDA and don’t form part of my blog posts, but could influence future discussions.

Here’s a quick post with links to the other posts I did covering Pure//Accelerate 2018, as well as links to other articles related to the event that I found interesting.

 

Gestalt IT Articles

I wrote a series of articles about Pure Storage for Gestalt IT.

Pure Storage – You’ve Come A Long Way

//X Gon Give it to Ya

Green is the New Black

The Case for Data Protection with FlashBlade

 

Event-Related

Here’re the posts I did during the show. These were mainly from the analyst sessions I attended.

Pure//Accelerate 2018 – Wednesday General Session – Rough Notes

Pure//Accelerate 2018 – Thursday General Session – Rough Notes

Pure//Accelerate 2018 – Wednesday – Chat With Charlie Giancarlo

Pure//Accelerate 2018 – (Fairly) Full Disclosure

 

Pure Storage Press Releases

Here are some of the press releases from Pure Storage covering the major product announcements and news.

The Future of Infrastructure Design: Data-Centric Architecture

Introducing the New FlashArray//X: Shared Accelerated Storage for Every Workload

Pure Storage Announces AIRI™ Mini: Complete, AI-Ready Infrastructure for Everyone

Pure Storage Delivers Pure Evergreen Storage Service (ES2) Along with Major Upgrade to Evergreen Program

Pure Storage Launches New Partner Program

 

Pure Storage Blog Posts

A New Era Of Storage With NVMe & NVMe-oF

New FlashArray//X Family: Shared Accelerated Storage For Every Workload

Building A Data-Centric Architecture To Power Digital Business

Pure’s Evergreen Delivers Right-Sized Storage, Again And Again And Again

Pure1 Expands AI Capabilities And Adds Full Stack Analytics

 

Conclusion

I had a busy but enjoyable week. I would have liked to get to more of the technical sessions, but being given access to some of the top executives and engineering talent in the company via the Analyst and Influencer Experience was invaluable. Thanks again to Pure Storage (particularly Armi Banaria and Terri McClure) for having me along to the show.