EMC – CX POST output

Following on from my two previous posts on the CLARiiON POST and troubleshooting – here’s what a CX POST looks like when viewed from SP B.

CX_POST

Note that the chassis and disk WWN seeds match – this is useful if you’ve moved multiple arrays (during a data centre move, for example) and didn’t take note of the SP and DPE pairings (tsk, tsk). Note also that it’s looking at disks 1 and 3 – these are the boot disks for SP B (SP A uses disks 0 and 2).

CX_POST2

Note that you never get to a point where you can log in via the serial console – it’s really only used for early diagnostics and monitoring.

CX_POST3

EMC – CX Troubleshooting – Part 2

In my previous post I talked about how error codes on a CX POST actually mean something, and can assist in your troubleshooting activities. But what does that stupid alphabet soup mean when you see a CLARiiON boot up?

CX_POST

Turns out all of those crazy letters mean something too, and you can refer to this PDF file I put together to work out at what point the array is failing its tests. Please note that I haven’t verified if these apply to the CX3, CX4 or VNX.
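If you want to see at a glance how far the array got, here’s a minimal Python sketch that splits the progress string into groups. It assumes each capital letter marks a test stage and the lowercase letters after it are sub-steps of that stage. That’s my reading of the output rather than anything documented here, so check the result against the PDF. The sample string is the one from the Part 1 post, and the function name is made up for illustration.

```python
import re

# Progress string as printed on the serial console (sample from the Part 1 post).
POST_PROGRESS = (
    "AabcdefgBCDEabFabcdGHabIabcJabcKabcLabcMabcNabOabPabQabRabSabTab"
    "UabVabWabXY"
)


def split_post_progress(progress):
    """Split a POST progress string into groups.

    Assumption: one capital letter starts a stage and any lowercase
    letters that follow are sub-steps of that stage.
    """
    return re.findall(r"[A-Z][a-z]*", progress)


groups = split_post_progress(POST_PROGRESS)
print("Stages seen:", len(groups))
print("Last stage reached:", groups[-1])
```

The last group printed is the point to look up in the PDF when the array stops partway through its tests.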

EMC – CX Troubleshooting – Part 1

Despite the fact that I’ve written over 270 posts in the past 5 years on this blog, one of my most popular posts has been my article on CLARiiON CX700 FLARE Recovery. I’ve been assisting someone via e-mail over the past month or so who was having problems getting a CX700 he’d acquired to boot. He’s a smart guy, but hasn’t used a CLARiiON before. And I was working from muscle memory and unable to eyeball the console for myself. So it was an interesting challenge, combined with varying time zones.

Anyway, I thought it would be interesting to do one or two posts on some basic CX stuff that may or may not assist people who are doing this for the first time. This isn’t going to be a comprehensive series but rather a few notes and examples as I think of them.

In this instance, my correspondent had a terminal connection to the array, and was seeing the following output:

AabcdefgBCDEabFabcdGHabIabcJabcKabcLabcMabcNabOabPabQabRabSabTabUabVabWabXY
EndTime: 07/29/2013 04:11:59
.... Storage System Failure - Contact your Service Representative ...
ErrorCode: 0x00000142
ErrorDesc:
Device: LCC 0 UART
FRU: STORAGE PROCESSOR
Description: LCC slot indicator Error!
Error detected when handling LCC READ command
EndError:
ErrorTime: 07/29/2013 04:11:38


Basically, the key thing was that error code ending in 142. According to this list of CX error codes I dug up, it indicates some sort of problem with the LCC. What wasn’t clear until much later, unfortunately, was which SP my correspondent was connected to. It turned out that SP A was faulty and needed to be replaced. There’s also an LCC 0 UART Sub-Menu available from the Diagnostics section of the Utility Partition. You can perform LCC diagnostics at this point to verify the POST errors you’re seeing. In short, pulling SP A allowed the system to boot, and error codes mean something to someone. Please note that I haven’t verified if these apply to the CX3, CX4 or VNX.
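If you’re capturing the serial console to a file, a few lines of Python will pull the useful fields out of an error block like the one above. This is a rough sketch that assumes the block always uses the “Key: value” layout seen in this one sample, with any line that lacks a key treated as a continuation of the previous field. The function name is hypothetical, and other arrays or firmware versions may format the block differently.

```python
import re

# Error block copied from the console output shown above.
SAMPLE = """\
ErrorCode: 0x00000142
ErrorDesc:
Device: LCC 0 UART
FRU: STORAGE PROCESSOR
Description: LCC slot indicator Error!
Error detected when handling LCC READ command
EndError:
ErrorTime: 07/29/2013 04:11:38
"""


def parse_post_error(text):
    """Return a dict of the fields in a POST error block.

    Assumption: fields look like 'Key: value'; a line without a key
    is appended to the previous field.
    """
    fields = {}
    last_key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = re.match(r"^(\w+):\s*(.*)$", line)
        if match:
            last_key = match.group(1)
            fields[last_key] = match.group(2)
        elif last_key is not None:
            fields[last_key] = (fields[last_key] + " " + line).strip()
    return fields


fields = parse_post_error(SAMPLE)
code = int(fields["ErrorCode"], 16)
print(f"{fields['Device']}: error 0x{code:03X} on FRU {fields['FRU']}")
```

The short form of the code (142 here) is what you look up in the error code list.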

EMC – DIY Heatmaps

My friend Mat has developed a pretty cool script that can make pretty pictures out of Analyzer files from CLARiiON and VNX arrays. He’s decided to release it to the world, so you can download it here. I’ve also added the following instructions to a pdf document available here. Here’s a sample output file from the script. He’s after feedback as well, so send it through and I’ll make sure it gets to him.

Purpose:

The heatmap script allows you to generate heatmaps of various metrics over time from NAR files generated from EMC Clariion/VNX arrays.

Requirements/Notes:

  • This script was developed and tested using Strawberry Perl (v5.12.3). There is no reason it won’t work with other flavours or versions of Perl, but Perl modules other than the one listed below may need to be installed.
  • It requires the Text::CSV module for Perl, available here http://search.cpan.org/CPAN/authors/id/M/MA/MAKAMAKA/Text-CSV-1.21.tar.gz . If you can get cpan working (run “cpan Text::CSV”) it will be easy to install; otherwise you’ll have to do it manually …

Unzip the files into, e.g., C:\Text-CSV-1.21 and run the following commands:

C:\Text-CSV-1.21>perl Makefile.PL

C:\Text-CSV-1.21>dmake

C:\Text-CSV-1.21>dmake test

C:\Text-CSV-1.21>dmake install

  • The script uses the Microsoft tool logparser.exe, available here http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=24659 , to manipulate various CSV files.
  • The script uses naviseccli to convert the NAR files into CSV files.
  • At the moment the script will only run on a Windows server/PC due to its reliance on the logparser tool. If people are interested I will investigate other tools available in the Linux/Unix worlds.
  • The default min/max figures that I am using may be unrealistic. I am by no means an expert in analysing the performance of an array, so any feedback is welcome.
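Before running the script, it can save time to confirm that the two external binaries are where you expect them. Here’s a small Python sketch of that check. The paths are the defaults printed in the script’s help text further down, and the function name is made up for this example. Adjust the paths if you installed the tools elsewhere.

```python
from pathlib import Path

# Default locations taken from the script's --help output; change to suit.
DEFAULT_TOOLS = {
    "logparser": "c:/Program Files/Log Parser 2.2/LogParser.exe",
    "naviseccli": "c:/Program Files/EMC/Navisphere CLI/NaviSECCli.exe",
}


def missing_tools(tools):
    """Return the names of tools whose path does not point at a file."""
    return [name for name, path in tools.items() if not Path(path).is_file()]


missing = missing_tools(DEFAULT_TOOLS)
if missing:
    print("Not found at the default path:", ", ".join(missing))
else:
    print("Both tools found.")
```

If either tool is reported missing, pass its real location to the script with the –logparser or –naviseccli option.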

Usage:

The following are the command line options available for the script

–logparser <path\filename>

–naviseccli <path\filename>

These options allow you to set the location of the various 3rd party binaries that the script relies on

–avg_interval <seconds>

The interval that the analyser stats are averaged down to, i.e. 1800 seconds will set the average interval to 30 minutes
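To make the idea of “averaging down” concrete, here’s a minimal Python sketch that groups samples into fixed-width time buckets and averages each bucket. It illustrates the concept only. It is not taken from the script, which may well do this differently (through logparser, for example), and the function name and sample data are invented.

```python
from collections import defaultdict


def average_by_interval(samples, interval_seconds=1800):
    """Average (timestamp_seconds, value) samples into fixed-width buckets.

    Each sample lands in the bucket that starts at the largest multiple
    of interval_seconds not exceeding its timestamp.
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        bucket_start = (timestamp // interval_seconds) * interval_seconds
        buckets[bucket_start].append(value)
    return {
        start: sum(values) / len(values)
        for start, values in sorted(buckets.items())
    }


# Five made-up samples taken 10 minutes apart, starting at t=0.
samples = [(0, 10.0), (600, 20.0), (1200, 30.0), (1800, 40.0), (2400, 60.0)]
averaged = average_by_interval(samples)
print(averaged)
```

With the default 1800 seconds, the first three samples fall into one 30 minute bucket and the last two into the next. A larger interval gives a smoother but coarser heatmap.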

–nar_file <filename>

The input NAR file(s). More than one NAR file can be specified.

–out <filename>

The filename of the generated HTML output file. The default is HEATMAP.HTML

–array_name <name>

The name of the array, this is for reporting purposes only

–summary

Generate a summary for each attribute type, displaying minimum, average and maximum figures for each metric

Metric                         Min        Max        Avg
Disk Utilization (%)           0.27       97.38      8.16
Disk Total Throughput (IO/s)   0.72       309.06     23.03
SP Utilization (%)             38.03      61.11      49.78
SP Total Throughput (IO/s)     5588.70    25693.24   10088.56
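For reference, the summary figures above are simply the minimum, maximum and mean of each metric. Here’s a Python sketch of that calculation using made-up sample values. The function name is hypothetical and this is not the script’s own code.

```python
def summarise(values):
    """Return (min, max, avg) for a non-empty list of numbers."""
    return min(values), max(values), sum(values) / len(values)


# Made-up utilisation samples (%), purely for illustration.
utilisation = [10.0, 20.0, 60.0]
low, high, avg = summarise(utilisation)
print(f"{'Disk Utilization (%)':<30}{low:>10.2f}{high:>10.2f}{avg:>10.2f}")
```

The columns are printed in the same Min, Max, Avg order as the summary table.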

–mash

–mashonly

These options generate a mash-up table for each object type (currently only Storage Processor and Disk), averaging the metrics for that type down into a single table so that you can view multiple metrics together.

The –mash option will display the mash-up table alongside the selected metrics, and the –mashonly option will only display the mash-up tables.

–get_drive_type

–array_ip <ip address>

This option allows you to query the array (as it is currently configured) to determine drive types (currently only SATA II, FIBRE CHANNEL, and SATA II SSD). At the moment this option only affects the IOPS calculations for drives.

–user_id <user id>

–pwd <password>

–scope <0|1>

These options are only used in conjunction with the –get_drive_type option. If you have cached credentials for the array configured then these options should not be necessary.

–display_drive_type

This option is only used in conjunction with the –get_drive_type option. It will display another drive table, and allow you to view the different drive type, drive size, pool and RAID Group layouts.

–disk_highlight

This option is only used in conjunction with the –get_drive_type and –display_drive_type options. It will highlight the corresponding disks in the other heatmaps, and will also allow you to select all of the drives that have the same attributes as displayed by the –display_drive_type option.

–config_file <filename>

–generate_config <filename>

These options allow you to generate and use a configuration file. The –generate_config option generates a config file with all of the default attributes, and the –config_file option tells the script to use a configuration file.

–help

This option will display the following help information.

 

Heatmap Generator

 

Usage: heatmap.3.010.pl <options>

Where options can be the following:

–logparser <path\filename> – The path to the logpaser executable (c:/Program Files/Log Parser 2.2/LogParser.exe)

–naviseccli <path\filename> – The path to the naviseccli executable (c:/Program Files/EMC/Navisphere CLI/NaviSECCli.exe)

–avg_interval <seconds>     – The interval in seconds that the stats are averaged at (1800 seconds)

–nar_file <filename>        – NAR input file(s) this option can be specified multiple times

–out <filename>             – The output filename (heatmap.html)

–array_name <name>          – Set the name of the array in the report

–summary                    – Displays a summary of each metric

–mash                       – Creates a mash-up of each metric per object type, and displays alongside other metrics

–mashonly                   – Creates a mash-up of each metric per object type, and only displays the mash-ups

–get_drive_type             – Query Array to get drive type information

–array_ip <ip addresS>      – Set the IP address of the array

–user_id <user id>          – Set the user ID to log into the array

–pwd <password>             – Set the user password of the array

–scope <0|1>                – Set the Scope of the array account

–display_drive_type         – Display the drive types in the charts

–disk_highlight             – Highlight drives on mouse over

–config_file <filename>     – Use a configuration file to set attribute min/maxes

–generate_config <filename> – Generate a configuration file using the defined defaults

–help                       – This help

–attrib <attribute>         – Set the attribute to graph, where attribute can be the following

d_utilization – display stats based on disk utilization (%)

d_iops        – display stats based on disk Total Throughput (IOPS)

d_r_iops      – display stats based on disk read IOPS

d_w_iops      – display stats based on disk write IOPS

d_queue       – display stats based on disk Queue Length

d_b_queue     – display stats based on disk Average Busy Queue Length

d_response    – display stats based on disk Response Time

d_service     – display stats based on disk Service Time

d_bandwidth   – display stats based on disk Total Bandwidth

d_r_bandwidth – display stats based on disk Read Bandwidth

d_w_bandwidth – display stats based on disk Write Bandwidth

d_r_size      – display stats based on disk Read Size

d_w_size      – display stats based on disk Write Size

d_seek        – display stats based on disk Average Seek Distance

 

s_utilization – display stats based on SP utilization

s_response    – display stats based on SP Response Time

s_bandwidth   – display stats based on SP Total Bandwidth

s_iops        – display stats based on SP Total Throughput (IOPs)

s_b_queue     – display stats based on SP Average Busy Queue Length

s_service     – display stats based on SP Service Time

s_c_dirty     – display stats based on SP Cache Dirty Pages (%)

s_c_flush     – display stats based on SP Cache Flush Ratio

s_c_flush_mb  – display stats based on SP Cache MBs Flushed (MB/s)

s_c_hw_flush  – display stats based on SP Cache High Water Flush On

s_c_i_flush   – display stats based on SP Cache Idle Flush On

s_c_lw_flush  – display stats based on SP Cache Low Water Flush Off

s_wc_flush    – display stats based on SP Write Cache Flushes/s

s_fc_dirty    – display stats based on FAST Cache Dirty Pages (%)

s_fc_flush_mb – display stats based on FAST Cache MBs Flushed (MB/s)

 

Running the script without any –nar_file options will result in the script prompting the user to supply NAR file(s)

–attrib <attribute>

Allows you to select which attributes you want to display on the heatmap.
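If you want to drive the script from another tool, here’s a Python sketch that assembles a command line from the options described above. It assumes the dashes in this post stand for a double hyphen on a real command line (blog formatting tends to mangle them), so check the script’s own help output before relying on that. The function name and the NAR file names are invented. The script name is the one shown in the usage text.

```python
def build_heatmap_command(nar_files, attrib, out="heatmap.html",
                          avg_interval=1800, summary=False,
                          script="heatmap.3.010.pl"):
    """Build an argv list for the heatmap script.

    Assumption: options are spelled with a leading double hyphen.
    """
    cmd = ["perl", script]
    for nar in nar_files:
        # The help text says nar_file can be given multiple times.
        cmd += ["--nar_file", nar]
    cmd += ["--attrib", attrib]
    cmd += ["--avg_interval", str(avg_interval)]
    cmd += ["--out", out]
    if summary:
        cmd.append("--summary")
    return cmd


cmd = build_heatmap_command(["spa.nar", "spb.nar"], "d_utilization",
                            summary=True)
print(" ".join(cmd))
```

The list can be handed to subprocess.run(cmd, check=True) on the Windows machine where Perl, logparser and naviseccli are installed.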

EMC – naviseccli – Basics – Part 1

For those of you unfamiliar with the EMC tool naviseccli, I thought I’d do a few posts on some useful commands that can be run to save you some time if you’ve been doing a lot of repetitive tasks. The naviseccli tool is very powerful and can be unforgiving, so as always I recommend you read the manual and be sure that you know what you’re doing before you do it. The tool is ostensibly used to send status or configuration requests to a storage system (specifically the VNX or CX4) via the command line and can be installed on a Windows, Linux, Solaris or whatever platform.

Typing naviseccli at the prompt will provide the following:

[-address IPAddress|NetworkName|-h IPAddress|NetworkName]
[-AddUserSecurity]
[-f filename]
[-m]
[-nopoll|-np]
[-parse|-p]
[-password password]
[-port port]
[-q]
[-RemoveUserSecurity]
[-scope 0|1|2]
[-timeout |-t timeout]
[-user username]
[-v]
[-xml]
CMD [optional_command_switches]

There’s a (metric) tonne of other stuff it can do via the CMD switch, but I thought for an introduction we’ll start with the basics. I’m going to try and avoid regurgitating the user manual, and instead focus on real world examples where it’s come in useful for me.  Okay, so maybe I’ll regurgitate a portion of the manual. So let’s get down to it.

-f filename. Stores the data in the specified file. If you’re working inside a script you may find this more useful than the other output options available.
-m. Suppresses output except for values. This option is most useful when used as part of a script. Note that this is only supported for commands that originated in Classic CLI.
-nopoll|-np. Directs the feature provider not to issue a poll request. This switch significantly increases performance when dealing with large or multiple storage systems. The feature provider automatically polls unless this switch is specified. Note that when the -nopoll switch is set, get commands may return stale data and set commands may erase previously changed settings. Use caution when the -nopoll switch is set.

-user. If you don’t want to create a security file on the machine you’re working on (I’ll cover this in a future post), you’ll need to specify a username that works on the system you’re addressing. At this point you should also provide the scope (0|1|2), with 0 being global, 1 being local to the SP you’re addressing, and 2 being LDAP credentials.

-port portnumber. Sets the port number (type) of the storage system. The default is 443. If you choose to change the default port number, management port 2163 will be supported; however, you will need to specify the -port switch and number 2163 in every subsequent command you issue. That doesn’t sound like fun does it?
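To tie the switches together, here’s a Python sketch that builds a naviseccli command line from the options above. The address and credentials are placeholders, the function name is made up, and getagent is used only as a harmless example of a CMD. Nothing is executed here: the function just returns the argument list so you can inspect it first.

```python
def build_naviseccli(address, command, user=None, password=None,
                     scope=0, port=None, nopoll=False):
    """Build an argv list for naviseccli.

    Only switches listed in the usage output above are used. If user is
    omitted, a security file is assumed to exist on this host.
    """
    argv = ["naviseccli", "-h", address]
    if user is not None:
        argv += ["-user", user, "-password", password, "-scope", str(scope)]
    if port is not None:
        argv += ["-port", str(port)]
    if nopoll:
        argv.append("-np")
    return argv + list(command)


# Placeholder address and credentials, for illustration only.
argv = build_naviseccli("192.0.2.10", ["getagent"],
                        user="admin", password="secret", scope=0, port=2163)
print(" ".join(argv))
```

Wrapping the switches in one function also means the -port 2163 chore only has to be written once.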

New Article – Removing an array from a CLARiiON domain

A new article has been added to the Articles page. Something to do with a little problem I had removing an erased CX700 from a domain after the fact. Hopefully it’s useful.

New Article – Adding capacity to the Reserved LUN Pool

Another simple one that I thought was worth documenting for the cosmetic changes in Unisphere. You can find it here. Check out the rest of my equally exciting guides here.

EMC MirrorView – New Article

I wrote a brief article on configuring EMC MirrorView from scratch through to being ready for VMware SRM usage. It’s not really from scratch, because I don’t go through the steps required to load the MirrorView enabler on the frame. But I figured that’s more a CE-type activity in any case. I hope to follow it up with a brief article on the SRM side of things. You can find my other articles here. Enjoy!

EMC Unisphere – things I’ve liked so far – Part 1

We upgraded our CX4-960s to FLARE 30 a few days ago, and while Unisphere is old hat to a few people, this was the first time I’ve had a chance to use it beyond a few tech meetings at our local EMC office. I’m not terribly good at software reviews – tending more towards “this sucks!” or “this rocks!” – but I thought maybe a few articles on what’s working for me, along with a few “where do I set this in Unisphere?” posts might be useful.

I’m a big fan of the dashboard that Unisphere first presents – particularly the capacity overview. The only drawback is that you can only monitor FLARE 30 systems with it.

It’s not a major issue – but if you’ve got some “legacy” CLARiiON infrastructure – yes, you over there still running FC-4700s – you won’t be able to see them in the capacity overview. In our case, we have some CX3-20s and a CX700 – all of which are heading for midrange array heaven in the next 6 months. It’s a lot better than having to log in to ControlCenter …

Latest FLARE 26, CX700s, NST and processor utilisation

Has anyone else tried loading the latest FLARE 26 on a CX700? I loaded CX700-Bundle-02.26.700.5.031.pbu on two CX700s in our lab today and got this error both times.

I’m using NST NaviServiceTaskbar-Win-32-x86-loc-6.29.2.0.62-6.exe on both Windows 7 and Windows 2003 hosts. I tried the steps in emc211113 (the snappily titled “What are my options if my NDU rules check fails the CPU Utilization check?”), but didn’t have any luck. Well, I did, in the sense that I enabled Engineering Mode in the NST (CTRL-SHIFT-F12, insert password), but I get the feeling it wasn’t meant to go that way. Everything seems fine on the arrays now, and no hosts lost access. But I don’t recall ever having to bypass this check. I’m assuming that it has a lot to do with the age of the arrays, and I’ll be honest I haven’t done a lot of log checking to verify what’s been going on. We’re holding out on upgrading our CX4-960s until FLARE 30 is released, so we’ll see what happens then …