Pure Storage – Configuring ObjectEngine Bucket Security

This is a quick post as a reminder for me next time I need to do something with basic S3 bucket security. A little while ago I was testing Pure Storage’s ObjectEngine (OE) device with a number of data protection products. I’ve previously written a few articles on what it looked like from the Cohesity and Commvault perspectives, but thought it would be worthwhile to document what I did on the OE side of things.

The first step is to create the bucket in the OE dashboard.

You’ll need to call it something, and there are rules around the naming convention and length of the name.

In this example, I’m creating a bucket for Commvault to use, so I’ve called this one “commvault-test”.

Once the bucket has been created, you should add a security policy to the bucket.

Click on “Add” and you’ll be prompted to get started with the Bucket Policy Editor.

I’m pretty hopeless with this stuff, but fortunately there’s a policy generator on the AWS site you can use.

Once you’ve generated your policy, click on Save and you’ll be good to go. Keep in mind that any user you reference in the policy will need to exist in OE for the policy to work.

Here’s the policy I applied to this particular bucket. The user is commvault, and the bucket name is commvault-test.

{
  "Id": "Policy1563859773493",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1563859751962",
      "Action": "s3:*",
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::commvault-test",
      "Principal": {
        "AWS": [
          "arn:aws:iam::0:user/commvault"
        ]
      }
    },
    {
      "Sid": "Stmt1563859771357",
      "Action": "s3:*",
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::commvault-test/*",
      "Principal": {
        "AWS": [
          "arn:aws:iam::0:user/commvault"
        ]
      }
    }
  ]
}

You can read more about the policy elements here.
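
If you want to check the policy is actually doing what you expect, the AWS CLI works fine against the OE endpoint. Here’s a quick sanity check I’d run. Note that the endpoint URL is a placeholder, and this assumes you’ve configured an AWS CLI profile (I’ve called it “commvault”) with the OE user’s access key and secret.

# List the bucket contents as the commvault user
aws s3 ls s3://commvault-test --endpoint-url https://oe.example.com --profile commvault
# Upload a small test object to confirm the policy allows writes
echo "test" > test.txt
aws s3 cp test.txt s3://commvault-test/ --endpoint-url https://oe.example.com --profile commvault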

Updated Articles Page

I recently had the opportunity to deploy a Nasuni NMC and filer in the lab and thought I’d run through the basics. There’s a new document outlining the process on the articles page.

Cohesity Basics – Configuring An External Target For Cloud Archive

I’ve been working in the lab with Pure Storage’s ObjectEngine and thought it might be nice to document the process to set it up as an external target for use with Cohesity’s Cloud Archive capability. I’ve written in the past about Cloud Tier and Cloud Archive, but in that article I focused more on the Cloud Tier capability. I don’t want to sound too pretentious, but I’ll quote myself from the other article: “With Cloud Archive you can send copies of snapshots up to the cloud to keep as a copy separate to the backup data you might have replicated to a secondary appliance. This is useful if you have some requirement to keep a monthly or six-monthly copy somewhere for compliance reasons.”

I would like to be clear that this process hasn’t been blessed or vetted by Pure Storage or Cohesity. I imagine they are working on delivering a validated solution at some stage, as they have with Veeam and Commvault. So don’t go out and jam this in production and complain to me when Pure or Cohesity tell you it’s wrong.

There are a couple of ways you can configure an external target via the Cohesity UI. In this example, I’ll do it from the dashboard, rather than during the protection job configuration. Click on Protection and select External Target.

You’ll then be presented with the New Target configuration dialogue.

In this example, I’m calling my external target PureOE, and setting its purpose as Archival (as opposed to Tiering).

The Type of target is “S3 Compatible”.

Once you select that, you’ll be asked for a bunch of S3-type information, including Bucket Name and Access Key ID. This assumes you’ve already created the bucket and configured appropriate security on the ObjectEngine side of things.

Enter the required information. I’ve de-selected compression and source side deduplication, as I want the data reduction to be done by the ObjectEngine. I’ve also disabled encryption, as I’m guessing this will have an impact on the ObjectEngine as well. I need to confirm that with my friends at Pure. I’m using the fully qualified domain name of the ObjectEngine as the endpoint here as well.

Once you click on Register, you’ll be presented with a summary of the configuration.

You’re then right to use this as an external target for Archival parts of protection jobs within your Cohesity environment. Once you’ve run a few protection jobs, you should start to see files within the test bucket on the ObjectEngine. Don’t forget that, as far as I’m aware, it’s still very difficult (impossible?) to remove external targets from the Cohesity Data Platform, so don’t get too carried away with configuring a bunch of different test targets thinking that you can remove them later.
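
If you’re curious about what’s actually landing in the bucket, you can point the AWS CLI at the ObjectEngine endpoint and have a look. The bucket name, endpoint and profile below are placeholders, so substitute whatever you’ve configured in your environment.

# List everything Cohesity has written to the archive bucket so far
aws s3 ls s3://cohesity-archive-test/ --recursive --human-readable --summarize \
  --endpoint-url https://oe.example.com --profile cohesity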

Veeam Basics – Configuring A Scale-Out Backup Repository

I’ve been doing some integration testing with Pure Storage and Veeam in the lab recently, and thought I’d write an article on configuring a scale-out backup repository (SOBR). To learn more about SOBR configurations, you can read the Veeam documentation here. This post from Rick Vanover also covers the what and the why of SOBR. In this example, I’m using a couple of FlashBlade-based NFS repositories that I’ve configured as per these instructions. Each NFS repository is mounted on a separate Linux virtual machine. I’m using a Windows-based Veeam Backup & Replication server running version 9.5 Update 4.
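
As an aside, getting the FlashBlade NFS filesystems mounted on the Linux repository VMs is straightforward. Something like the following does the job; the data VIP, export name and mount point are placeholders for your own environment.

# Mount the FlashBlade NFS export on the Linux repository VM
sudo mkdir -p /mnt/veeamrepo1
sudo mount -t nfs 10.0.0.100:/veeamrepo1 /mnt/veeamrepo1
# Add an entry like this to /etc/fstab so the mount survives a reboot
# 10.0.0.100:/veeamrepo1  /mnt/veeamrepo1  nfs  defaults,_netdev  0 0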

 

Process

Start by going to Backup Infrastructure -> Scale-out Repositories and click on Add Scale-out Repository.

Give it a name, maybe something snappy like “Scale-out Backup Repository 1”?

Click on Add to add the backup repositories.

When you click on Add, you’ll have the option to select the backup repositories you want to use. You can select them all, but for the purpose of this exercise, we won’t.

In this example, Backup Repository 1 and 2 are the NFS locations I configured previously. Select those two and click on OK.

You’ll now see the repositories listed as Extents.

Click on Advanced to check the advanced settings are what you expect them to be. Click on OK.

Click Next to continue. You’ll see the following message.

You then choose the placement policy. It’s strongly recommended that you stick with Data locality as the placement policy.

You can also pick object storage to use as a Capacity Tier.

You’ll also have an option to configure the age of the files to be moved, and when they can be moved. And you might want to encrypt the data uploaded to your object storage environment, depending on where that object storage lives.

Once you’re happy, click on Apply. You’ll be presented with a summary of the configuration (and hopefully there won’t be any errors).

 

Thoughts

The SOBR feature, in my opinion, is pretty cool. I particularly like the ability to put extents in maintenance mode. And the option to use object storage as a capacity tier is a very useful feature. You get some granular control in terms of where you put your backup data, and what kind of performance you can throw at the environment. And as you can see, it’s not overly difficult to configure the environment. There are a few things to keep in mind though. Make sure your extents are stored on resilient hardware. If you keep your backup sets together with the data locality option, you’ll be a sad panda if that extent goes bye bye. And the same goes for the performance option. You’ll also need Enterprise or Enterprise Plus editions of Veeam Backup & Replication for this feature to work. And you can’t use this feature for these types of jobs:

  • Configuration backup job;
  • Replication jobs (including replica seeding);
  • VM copy jobs; and
  • Veeam Agent backup jobs created by Veeam Agent for Microsoft Windows 1.5 or earlier and Veeam Agent for Linux 1.0 Update 1 or earlier.

There are any number of reasons why a scale-out backup repository can be a handy feature to use in your data protection environment. I’ve had the misfortune in the past of working with products that were difficult to manage from a data mobility perspective. Too many times I’ve been stuck going through all kinds of mental gymnastics working out how to migrate data sets from one storage platform to the next. With this it’s a simple matter of a few clicks and you’re on your way with a new bucket. The tiering to object feature is also useful, particularly if you need to keep backup sets around for compliance reasons. There’s no need to spend money on these living on performance disk if you can comfortably have them sitting on capacity storage after a period of time. And if you can control this movement through a policy-driven approach, then that’s even better. If you’re new to Veeam, it’s worth checking out a feature like this, particularly if you’re struggling with media migration challenges in your current environment. And if you’re an existing Enterprise or Enterprise Plus customer, this might be something you can take advantage of.

OpenMediaVault – Good Times With mdadm

Happy 2019. I’ve been on holidays for three full weeks and it was amazing. I’ll get back to writing about boring stuff soon, but I thought I’d post a quick summary of some issues I’ve had with my home-built NAS recently and what I did to fix it.

Where Are The Disks Gone?

I got an email one evening with the following message.

I do enjoy the “Faithfully yours, etc” and the post script is the most enlightening bit. See where it says [UU____UU]? Yeah, that’s not good. There are 8 disks that make up that device (/dev/md0), so it should look more like [UUUUUUUU]. But why would 4 out of 8 disks just up and disappear? I thought it was a little odd myself. I had a look at the ITX board everything was attached to and realised that those 4 drives were plugged into a PCI SATA-II card. It seems that either the slot on the board or the card is now failing intermittently. I say “seems” because that’s all I can think of, as the S.M.A.R.T. status of the drives is fine.
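
For what it’s worth, checking the S.M.A.R.T. status from the shell is easy enough if you have smartmontools installed. The device name below is just an example.

# Quick health verdict for one of the member drives
sudo smartctl -H /dev/sda
# Full attribute dump if you want the gory details
sudo smartctl -a /dev/sda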

Resolution, Baby

The short-term fix to get the filesystem back on line and useable was the classic “assemble” switch with mdadm. Long time readers of this blog may have witnessed me doing something similar with my QNAP devices from time to time. After panic rebooting the box a number of times (a silly thing to do, really), it finally responded to pings. Checking out /proc/mdstat wasn’t good though.

dan@openmediavault:~$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
unused devices: <none>

Notice the lack of, erm, devices there? That’s non-optimal. The fix requires a forced assembly of the devices comprising /dev/md0.

dan@openmediavault:~$ sudo mdadm --assemble --force --verbose /dev/md0 /dev/sd[abcdefhi]
[sudo] password for dan:
mdadm: looking for devices for /dev/md0
mdadm: /dev/sda is identified as a member of /dev/md0, slot 0.
mdadm: /dev/sdb is identified as a member of /dev/md0, slot 1.
mdadm: /dev/sdc is identified as a member of /dev/md0, slot 3.
mdadm: /dev/sdd is identified as a member of /dev/md0, slot 2.
mdadm: /dev/sde is identified as a member of /dev/md0, slot 5.
mdadm: /dev/sdf is identified as a member of /dev/md0, slot 4.
mdadm: /dev/sdh is identified as a member of /dev/md0, slot 7.
mdadm: /dev/sdi is identified as a member of /dev/md0, slot 6.
mdadm: forcing event count in /dev/sdd(2) from 40639 upto 40647
mdadm: forcing event count in /dev/sdc(3) from 40639 upto 40647
mdadm: forcing event count in /dev/sdf(4) from 40639 upto 40647
mdadm: forcing event count in /dev/sde(5) from 40639 upto 40647
mdadm: clearing FAULTY flag for device 3 in /dev/md0 for /dev/sdd
mdadm: clearing FAULTY flag for device 2 in /dev/md0 for /dev/sdc
mdadm: clearing FAULTY flag for device 5 in /dev/md0 for /dev/sdf
mdadm: clearing FAULTY flag for device 4 in /dev/md0 for /dev/sde
mdadm: Marking array /dev/md0 as 'clean'
mdadm: added /dev/sdb to /dev/md0 as 1
mdadm: added /dev/sdd to /dev/md0 as 2
mdadm: added /dev/sdc to /dev/md0 as 3
mdadm: added /dev/sdf to /dev/md0 as 4
mdadm: added /dev/sde to /dev/md0 as 5
mdadm: added /dev/sdi to /dev/md0 as 6
mdadm: added /dev/sdh to /dev/md0 as 7
mdadm: added /dev/sda to /dev/md0 as 0
mdadm: /dev/md0 has been started with 8 drives.

In this example you’ll see that /dev/sdg isn’t included in my command. That device is the SSD I use to boot the system. Sometimes Linux device conventions confuse me too. If you’re in this situation and you think this is just a one-off thing, then you should be okay to unmount the filesystem, run fsck over it, and re-mount it. In my case, this has happened twice already, so I’m in the process of moving data off the NAS onto some scratch space and have procured a cheap little QNAP box to fill its role.
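
For completeness, the unmount / check / re-mount dance looks something like the below. I’m assuming an ext4 filesystem here, and the mount point is a placeholder, so adjust for your own environment.

sudo umount /srv/dev-disk-by-label-data
# Force a check even if the filesystem claims to be clean
sudo fsck.ext4 -f /dev/md0
sudo mount /dev/md0 /srv/dev-disk-by-label-data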

 

Conclusion

My rush to replace the homebrew device with a QNAP isn’t a knock on the OpenMediaVault project by any stretch. OMV itself has been very reliable and has done everything I needed it to do. Rather, my ability to build semi-resilient devices on a budget has simply proven quite poor. I’ve seen some nasty stuff happen with QNAP devices too, but at least any issues will be covered by some kind of manufacturer’s support team and warranty. My NAS is only covered by me, and I’m just not that interested in working out what could be going wrong here. If I’d built something decent I’d get some alerting back from the box telling me what’s happened to the card that keeps failing. But then I would have spent a lot more on this box than I would have wanted to.

I’ve been lucky thus far in that I haven’t lost any data of real import (the NAS devices are used to store media that I have on DVD or Blu-Ray – the important documents are backed up using Time Machine and Backblaze). It is nice, however, that a tool like mdadm can bring you back from the brink of disaster in a pretty efficient fashion.

Incidentally, if you’re a macOS user, you might have a bunch of .DS_Store files on your filesystem. Or stuff like .@Thumb or some such. These things are fine, but macOS doesn’t seem to like them when you’re trying to move folders around. This post provides some handy guidance on how to get rid of those files in a jiffy.
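
If you just want them gone, a quick find from the top of the share does the trick. This removes .DS_Store files and the AppleDouble “._” files that macOS scatters across non-Mac filesystems. Test it somewhere unimportant first.

# Remove Finder metadata files under the current directory
find . -type f -name '.DS_Store' -delete
find . -type f -name '._*' -delete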

As always, if the data you’re storing on your NAS device (be it home-built or off the shelf) is important, please make sure you back it up. Preferably in a number of places. Don’t get yourself in a position where this blog post is your only hope of getting your one copy of your firstborn’s pictures from the first day of school back.

Updated Articles Page

I recently had the opportunity to upgrade my Cohesity lab environment using Helios and thought I’d run through the basics. There’s a new document outlining the process on the articles page.

Cohesity – Cohesity Cluster Virtual Edition ESXi – A Few Notes

I’ve covered the Cohesity appliance deployment in a howto article previously. I’ve also made use of the VMware-compatible Virtual Edition in our lab to test things like cluster to cluster replication and cloud tiering. The benefits of virtual appliances are numerous. They’re generally easy to deploy, don’t need dedicated hardware, can be re-deployed quickly when you break something, and can be a quick and easy way to validate a particular process or idea. They can also be a problem with regards to performance, and are at the mercy of the platform administrator to a point. But aren’t we all? With 6.1, Cohesity have made available a clustered virtual edition (the snappily titled Cohesity Cluster Virtual Edition ESXi). If you have access to the documentation section of the Cohesity support site, there’s a PDF you can download that explains everything. I won’t go into too much detail but there are a few things to consider before you get started.

 

Specifications

Base Appliance 

Just like the non-clustered virtual edition, there’s a small and large configuration you can choose from. The small configuration supports up to 8TB for the Data disk, while the large configuration supports up to 16TB for the Data disk. The small config supports 4 vCPUs and 16GB of memory, while the large configuration supports 8 vCPUs and 32GB of memory.

Disk Configuration

Once you’ve deployed the appliance, you’ll need to add the Metadata disk and Data disk to each VM. The Metadata disk should be between 512GB and 1TB. For the large configuration, you can also apparently configure 2x 512GB disks, but I haven’t tried this. The Data disk needs to be between 512GB and 8TB for the small configuration and up to 16TB for the large configuration (with support for 2x 8TB disks). Cohesity recommends that these are formatted as Thick Provision Lazy Zeroed and deployed in Independent – Persistent mode. Each disk should be attached to its own SCSI controller as well, so you’ll have the system disk on SCSI 0:0, the Metadata disk on SCSI 1:0, and so on.

I did discover a weird issue when deploying the appliance on a Pure Storage FA-450 array in the lab. In vSphere this particular array’s datastore type is identified by vCenter as “Flash”. For my testing I had a 512GB Metadata disk and 3TB Data disk configured on the same datastore, with the three nodes living on three different datastores on the FlashArray. This caused errors with the cluster configuration, with the configuration wizard complaining that my SSD volumes were too big.

I moved the Data disk (with storage vMotion) to an all flash Nimble array (that for some reason was identified by vSphere as “HDD”) and the problem disappeared. Interestingly I didn’t have this problem with the single node configuration of 6.0.1 deployed with the same configuration. I raised a ticket with Cohesity support and they got back to me stating that this was expected behaviour in 6.1.0a. They tell me, however, that they’ve modified the behaviour of the configuration routine in an upcoming version so fools like me can run virtualised secondary storage on primary storage.

Erasure Coding

You can configure the appliance for increased resiliency at the Storage Domain level as well. If you go to Platform – Cluster – Storage Domains you can modify the DefaultStorageDomain (and other ones that you may have created). Depending on the size of the cluster you’ve deployed, you can choose the number of failures to tolerate and whether or not you want erasure coding enabled.

You can also decide whether you want EC to be a post-process activity or something that happens inline.

 

Process

Once you’ve deployed a minimum of 3 copies of the Clustered VE, you’ll need to manually add Metadata and Data disks to each VM. The specifications for these are listed above. Fire up the VMs and go to the IP of one of the nodes. You’ll need to log in as the admin user with the appropriate password and you can then start the cluster configuration.

This bit is pretty much the same as any Cohesity cluster deployment, and you’ll need to specify things like a hostname for the cluster partition. As always, it’s a good idea to ensure your DNS records are up to date. You can get away with using IP addresses but, frankly, people will talk about you behind your back if you do.
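
A quick way to confirm the records are in place before you kick off the cluster creation is something like the below. The hostname and address are placeholders for your own cluster VIP and node records.

# Forward lookup for the cluster VIP
dig +short cohesity-cluster.lab.local
# Reverse lookup to make sure the PTR record exists too
dig +short -x 10.0.0.50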

At this point you can also decide to enable encryption at the cluster level. If you decide not to enable it you can do this on a per Domain basis later.

Click on Create Cluster and you should see something like the following screen.

Once the cluster is created, you can hit the virtual IP you’ve configured, or any one of the attached nodes, to log in to the cluster. Once you log in, you’ll need to agree to the EULA and enter a license key.

 

Thoughts

The availability of virtual appliance versions for storage and data protection solutions isn’t a new idea, but it’s certainly one I’m a big fan of. These things give me an opportunity to test new code releases in a controlled environment before pushing updates into my production environment. It can help with validating different replication topologies quickly, and validating other configuration ideas before putting them into the wild (or in front of customers). Of course, the performance may not be up to scratch for some larger environments, but for smaller deployments and edge or remote office solutions, you’re only limited by the available host resources (which can be substantial in a lot of cases). The addition of a clustered version of the virtual edition for ESXi and Hyper-V is a welcome sight for those of us still deploying on-premises Cohesity solutions (I think the Azure version has been clustered for a few revisions now). It gets around the main issue of resiliency by having multiple copies running, and can also address some of the performance concerns associated with running virtual versions of the appliance. There are a number of reasons why it may not be the right solution for you, and you should work with your Cohesity team to size any solution to fit your environment. But if you’re running Cohesity in your environment already, talk to your account team about how you can leverage the virtual edition. It really is pretty neat. I’ll be looking into the resiliency of the solution in the near future and will hopefully be able to post my findings in the next few weeks.

Updated Articles Page

I recently had the opportunity to configure multi-tenancy in my Cohesity lab environment and thought I’d run through the basics. There’s a new document outlining the process on the articles page.

Updated Articles Page

I recently had the opportunity to deploy a Vembu BDR 3.9.1 Update 1 appliance and thought I’d run through the basics of getting started. There’s a new document outlining the process on the articles page.

Getting Started With The Pure Storage CLI

I used to write a lot about how to manage CLARiiON and VNX storage environments with EMC’s naviseccli tool. I’ve been doing some stuff with Pure Storage FlashArrays in our lab and thought it might be worth covering off some of the basics of their CLI. This will obviously be no replacement for the official administration guide, but I thought it might come in useful as a starting point.

 

Basics

Unlike EMC’s CLI, there’s no executable to install – it’s all on the controllers. If you’re using Windows, PuTTY is still a good choice as an ssh client. Otherwise the macOS ssh client does a reasonable job too. When you first set up your FlashArray, a virtual IP (VIP) was configured. It’s easiest to connect to the VIP, and Purity then directs your session to whichever controller is the current primary controller. Note that you can also connect via the physical IP address if that’s how you want to do things.

The first step is to login to the array as pureuser, with the password that you’ve definitely changed from the default one.

login as: pureuser
pureuser@10.xxx.xxx.30's password:
Last login: Fri Aug 10 09:36:05 2018 from 10.xxx.xxx.xxx

Mon Aug 13 10:01:52 2018
Welcome pureuser. This is Purity Version 4.10.4 on FlashArray purearray
http://www.purestorage.com/

“purehelp” is the command to run to list available commands.

pureuser@purearray> purehelp
Available commands:
-------------------
pureadmin
purealert
pureapp
purearray
purecert
pureconfig
puredns
puredrive
pureds
purehelp
purehgroup
purehost
purehw
purelog
pureman
puremessage
purenetwork
purepgroup
pureplugin
pureport
puresmis
puresnmp
puresubnet
puresw
purevol
exit
logout

If you want to get some additional help with a command, you can run “command -h” (or “command --help”).

pureuser@purearray> purevol -h
usage: purevol [-h]
               {add,connect,copy,create,destroy,disconnect,eradicate,list,listobj,monitor,recover,remove,rename,setattr,snap,truncate}
               ...

positional arguments:
  {add,connect,copy,create,destroy,disconnect,eradicate,list,listobj,monitor,recover,remove,rename,setattr,snap,truncate}
    add                 add volumes to protection groups
    connect             connect one or more volumes to a host
    copy                copy a volume or snapshot to one or more volumes
    create              create one or more volumes
    destroy             destroy one or more volumes or snapshots
    disconnect          disconnect one or more volumes from a host
    eradicate           eradicate one or more volumes or snapshots
    list                display information about volumes or snapshots
    listobj             list objects associated with one or more volumes
    monitor             display I/O performance information
    recover             recover one or more destroyed volumes or snapshots
    remove              remove volumes from protection groups
    rename              rename a volume or snapshot
    setattr             set volume attributes (increase size)
    snap                take snapshots of one or more volumes
    truncate            truncate one or more volumes (reduce size)

optional arguments:
  -h, --help            show this help message and exit

There’s also a facility to access the man page for commands. Just run “pureman command” to access it.

Want to see how much capacity there is on the array? Run “purearray list --space”.

pureuser@purearray> purearray list --space
Name        Capacity  Parity  Thin Provisioning  Data Reduction  Total Reduction  Volumes  Snapshots  Shared Space  System  Total
purearray  12.45T    100%    86%                2.4 to 1        17.3 to 1        350.66M  3.42G      3.01T         0.00    3.01T

Need to check the software version or the status of the controllers? Run “purearray list --controller”.

pureuser@purearray> purearray list --controller
Name  Mode       Model   Version  Status
CT0   secondary  FA-450  4.10.4   ready
CT1   primary    FA-450  4.10.4   ready

 

Connecting A Host

To connect a host to an array (assuming you’ve already zoned it to the array), you’d use the following commands.

purehost create hostname
purehost create -wwnlist WWNs hostname
purehost list
purevol connect --host [host] [volume]
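
Here’s what that looks like end to end with some made-up names: a host with two WWNs, a new 2TB volume, and the volume connected to the host.

# Create the host and register its WWNs
purehost create -wwnlist 21:00:00:24:ff:aa:bb:01,21:00:00:24:ff:aa:bb:02 esx-host-01
# Create a 2TB volume and connect it to the new host
purevol create --size 2T esx-ds-01
purevol connect --host esx-host-01 esx-ds-01
# Confirm the host exists
purehost list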

 

Host Groups

You might need to create a Host Group if you’re running ESXi and want to have multiple hosts accessing the same volumes. Here’re the commands you’ll need. Firstly, create the Host Group.

purehgroup create [hostgroup]

Add the hosts to the Host Group (these hosts should already exist on the array).

purehgroup setattr --hostlist host1,host2,host3 [hostgroup]

You can then assign volumes to the Host Group.

purehgroup connect --vol [volume] [hostgroup]
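
Putting that together with some made-up names (the same hypothetical host and volume from earlier), it looks like this.

# Create the host group and add the existing hosts to it
purehgroup create esx-cluster-01
purehgroup setattr --hostlist esx-host-01,esx-host-02 esx-cluster-01
# Connect the volume to the host group rather than to individual hosts
purehgroup connect --vol esx-ds-01 esx-cluster-01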

 

Other Volume Operations

Some other neat (and sometimes destructive) things you can do with volumes are listed below.

To resize a volume, use the following commands.

purevol setattr --size 500G [volume]
purevol truncate --size 20GB [volume]

Note that a snapshot is available for 24 hours to roll back if required. This is good if you’ve shrunk a volume to be smaller than the data on it and have consequently munted the filesystem.

When you destroy a volume it immediately becomes unavailable to hosts, but remains on the array for 24 hours. Note that you’ll need to disconnect the volume from any hosts connected to it first.

purevol disconnect [volume] --host [hostname]
purevol destroy [volume]

If you’re running short of capacity, or are just curious about when a deleted volume will disappear, use the following command.

purevol list --pending

If you need the capacity back immediately, the deleted volume can be eradicated with the following command.

purevol eradicate [volume]

 

Further Reading

The Pure CLI is obviously not a new thing, and plenty of bright folks have already done a few articles about how you can use it as part of a provisioning workflow. This one from Chadd Kenney is a little old now but still demonstrates how you can bring it all together to do something pretty useful. You can obviously extend that to do some pretty interesting stuff, and there’s solid parity between the GUI and CLI in the Purity environment.

It seems like a small thing, but the fact that there’s no need to install an executable is a big thing in my book. Array vendors (and infrastructure vendors in general) insisting on installing some shell extension or command environment is a pain in the arse, and should be seen as an act of hostility akin to requiring Java to complete simple administration tasks. The sooner we get everyone working with either HTML5 or simple ssh access the better. In any case, I hope this was a useful introduction to the Purity CLI. Check out the Administration Guide for more information.