short definition
navicli is a command line interface for managing CLARiiON storage. The name is short for Navisphere CLI.
background
host
The system I'm referring to while writing this file is a 5 board e10k domain with 3 CLARiiON 5700s attached to it.
each CLARiiON cabinet has:
naviagent runs on a host that is connected to the cabinets. navicli or
the Navisphere GUI communicates with naviagent.
Each LUN[2] on a cabinet has a default SP and a current SP.
If both SPs are functional, each LUN will be on its default SP.
If one of the SPs fails, you may (depending on whether
or not you have ATF[3] installed) have to manually fail over a LUN to the
functional SP, which will become the LUN's current SP or current owner.
Any traffic to a LUN will go over the connection to the current SP or current owner.
Basically you "bind" (create) LUNs, which appear to Solaris (or whatever
operating system you use) as regular disks.
People usually put these into vxvm in case they want to move the cabinets
around (deport the diskgroup(s) in the cabinet).
organization of objects or whatever
Raid groups contain LUNs.
Multiple LUNs can exist in a single raid group, but I usually
use a separate raid group for each LUN.
Unbinding a LUN sometimes doesn't remove the raid group that contains it. It's good practice, when unbinding a LUN, to first note the
raid group that contains it and
then make sure that you have removed the raid group as well (obviously don't
do that if you have multiple LUNs per raid group, heh).
starting and stopping naviagent / config files
Bouncing the agent isn't a big deal, as it is only used for management.
starting the agent
# /etc/init.d/agent start
stopping the agent
# /etc/init.d/agent stop
/etc/Navisphere/agent.config is the config file for naviagent.
# more /etc/Navisphere/agent.config
clarDescr Navisphere Agent
clarContact John Smith, 800-555-1212
device c2t0d4s2 DPE-A_SPA "DPE-A_SPA"
device c4t1d5s2 DPE-A_SPB "DPE-A_SPB"
device c3t4d2s2 DPE-B_SPA "DPE-B_SPA"
device c5t5d3s2 DPE-B_SPB "DPE-B_SPB"
device c6t3d0s2 DPE-C_SPA "DPE-C_SPA"
device c7t2d1s2 DPE-C_SPB "DPE-C_SPB"
user root@127.0.0.1
user guiweakling@10.0.0.25
array 1234567 Cabinet-1
array 7654321 Cabinet-2
array 3214435 Cabinet-3
#
The only meaningful things in this file are the device and user
lines. A CLARiiON cabinet can be managed via navicli if you
know the device of any disk on that cabinet.
It's easier to figure that stuff out once and then
put it in your agent.config. Once that is done, you can run
navicli getagent to quickly determine which disks you need to use
to manage a given cabinet.
The user lines define which users can manage the storage.
user root@127.0.0.1
indicates that the local root user is allowed to manage the
storage. People usually don't have navicli installed on another
Sun box to manage remote storage, so lines like
user guiweakling@10.0.0.25 generally
indicate that someone is using the NT Navisphere GUI at that host.
device c2t0d4s2 DPE-A_SPA "DPE-A_SPA"
DPE-A_SPA and "DPE-A_SPA" are arbitrary strings used by the user to
identify which cabinet corresponds with which device.
array 1234567 Cabinet-1
Cabinet-1 is also an arbitrary string, and the number before it is the serial number
of the cabinet. I'm pretty sure the array lines only get used by the
Navisphere GUI (and who wants to use a GUI).
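If you just want the device lines back out of agent.config, a rough sketch like the one below would do. The function name agent_devices is made up for this example, and it assumes the "device <disk> <label> ..." layout shown in the sample file above and an awk that behaves like nawk.

```shell
# agent_devices FILE
# Hypothetical helper: print "<disk> <label>" for every device line in an
# agent.config style file. Assumes the field layout shown in the sample above.
agent_devices() {
    awk '$1 == "device" { print $2, $3 }' "$1"
}
```

For example, agent_devices /etc/Navisphere/agent.config would list one disk and its label per line.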
navicli
navicli is actually a bunch of commands all rolled into one. If you run it with
no arguments, it will barf out a list of them:
# navicli
navicli [-p] [-v|q] [-m] [-np] [-t timeout] [-h hostname]
[-d device] [-help] CMD <optional-args>
Possible commands are: accesscontrol arrayname bind chglun
chgrg clearlog clearstats createrg failback fairness
firmware getagent getatf getcache getcontrol getcrus
getdisk getlog getloop getlun getrg getsniffer
port
r3wrbuff rebootSP removerg register setcache setloop
setraid5 setsniffer setspstime setstats storagegroup systemtype
trespass unbind SC_OFF
#
The commands focused on here are
bind, getagent, getdisk, getlog, getlun,
removerg, trespass, and unbind.
Commands run without any parameters will usually give you all the information available.
Adding extra parameters makes a command return only the information
you are looking for. For example:
# navicli -d c2t0d4s2 getlun 0
will return a lot of information, but say you only want to know the size of the
LUN:
# navicli -d c2t0d4s2 getlun 0 -capacity
Lun Capacity: 134903
#
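If you want that number in a script, the output is easy to pick apart. This is only a sketch: the function names are invented here, it assumes the "Lun Capacity: N" line looks exactly like the output above with N in megabytes, and it needs a POSIX shell plus an awk that accepts a regex field separator (nawk on Solaris).

```shell
# lun_capacity_mb
# Hypothetical helper: read "getlun <LUN> -capacity" output on stdin and
# print just the number (megabytes, going by the output shown above).
lun_capacity_mb() {
    awk -F': *' '$1 == "Lun Capacity" { print $2 }'
}

# lun_capacity_gb
# Same input, but print whole gigabytes (integer division by 1024).
lun_capacity_gb() {
    mb=$(lun_capacity_mb)
    [ -n "$mb" ] || return 1
    echo $(( mb / 1024 ))
}
```

You would feed it the real command, something like: navicli -d c2t0d4s2 getlun 0 -capacity | lun_capacity_gb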
getagent
Running "navicli getagent" with no arguments
returns a lot of info. Usually you know which
controllers[4] your stuff is connected to (c2 and c4 are cabinet 1, c3 and c5
are cabinet 2, c6 and c7 are cabinet 3), so instead of looking at
format or
something, you can do
# navicli getagent -node
Node: c2t0d4s2
Node: c4t1d5s2
Node: c3t4d2s2
Node: c5t5d3s2
Node: c6t3d0s2
Node: c7t2d1s2
#
Every other command takes a -d parameter to specify a device, e.g.:
# navicli -d c2t0d4s2 getlun 0
It doesn't matter which SP you use. If I ran that same command with a -d of
c4t1d5s2, it would return the same result, because disks on controllers 2 and
4 are cabinet 1 in this setup.
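Since the controller number is what tells you which cabinet a node belongs to, you can filter the node list by controller. A minimal sketch, assuming the "Node: cXtYdZsN" output format shown above; nodes_on_controller is a made-up name, and the awk needs to support -v (nawk on Solaris).

```shell
# nodes_on_controller N
# Hypothetical helper: read "getagent -node" output on stdin and print only
# the nodes that live on controller cN.
nodes_on_controller() {
    # Match on "cNt" at the start so that c2 does not also match c20.
    awk -v c="c$1" '$1 == "Node:" && index($2, c "t") == 1 { print $2 }'
}
```

Something like navicli getagent -node | nodes_on_controller 4 would then give you a device to pass to -d for the cabinet behind c4.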
getlun
getlun tells you a lot of information about a LUN. Most importantly it tells you which physical disks are involved in the LUN,
the LUN capacity, the default SP, the current SP, and the raid type.
syntax: navicli -d [device] getlun [LUN] <options>
# navicli -d c2t0d4s2 getlun 0
...
RAID Type: RAID5
RAIDGroup ID: 0
State: Bound
...
Current owner: SP B
...
Default Owner: SP B
...
Prct Rebuilt: 100
Prct Bound: 100
Lun Capacity: 134903
...
Enclosure 0 Disk 0 Enabled
...
Enclosure 0 Disk 1 Enabled
...
Enclosure 0 Disk 2 Enabled
...
Enclosure 0 Disk 3 Enabled
...
Enclosure 0 Disk 4 Enabled
...
#
Disks 0-4 in enclosure 0 are involved in a 134903 meg raid5 set.
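The two owner lines in that output are also a quick way to tell whether a LUN is sitting on its default SP or has been failed over. Here is a sketch of that check; lun_owner_status is an invented name, and it assumes the "Current owner:" and "Default Owner:" labels are spelled exactly as in the output above.

```shell
# lun_owner_status
# Hypothetical helper: read full getlun output on stdin and print
#   home        if the current owner matches the default owner
#   trespassed  if they differ
#   unknown     if either line is missing
lun_owner_status() {
    awk '
        /Current owner:/ { sub(/.*Current owner:[ \t]*/, ""); cur = $0 }
        /Default Owner:/ { sub(/.*Default Owner:[ \t]*/, ""); def = $0 }
        END {
            if (cur == "" || def == "") {
                print "unknown"
            } else if (cur == def) {
                print "home"
            } else {
                print "trespassed"
            }
        }'
}
```

Usage would be along the lines of: navicli -d c2t0d4s2 getlun 0 | lun_owner_status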
bind
bind creates a LUN. Binding a LUN is a destructive action:
all data on the disks is zeroed out
(which is why it takes so long to bind a LUN).
syntax: navicli -d [device] bind [raid type] [LUN] [ ...] <options>
CLARiiONs support the following raid types (from navicli man page):
The LUN you specify must not already exist (duh). You want to alternate
binding LUNs
with default SPs of sp-a and sp-b to split the i/o load. Let's bind a
10 disk raid 5 (r5)
set on sp-b (let's pretend that LUN 9 doesn't exist and that none of the
disks on enclosure 8 are being used):
# navicli -d c2t0d4s2 bind r5 9 8_0 8_1 8_2 8_3 8_4 8_5 8_6 8_7 8_8 8_9 -sp b
#
The disk format (as you have noticed) is enclosure_disk[5]. The command returns
fast; it merely tells the SP to bind that LUN. To check the binding
status:
# navicli -d c2t0d4s2 getlun 9 -state
State: Binding
#
OK, so it's binding... how much longer until it finishes?
# navicli -d c2t0d4s2 getlun 9 -bind
Prct Bound: 10
#
When finished, the state of the LUN will be Bound (percent bound will be
100).
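Rather than rerunning that by hand, you could poll it. The loop below is a sketch, not something from the navicli docs: wait_bound is a made-up name, it only uses the getlun -bind call shown above, it assumes a POSIX shell (ksh or bash, not the old Solaris /bin/sh), and it will spin forever if the LUN never reaches 100, so interrupt it if something looks wrong.

```shell
# wait_bound DEVICE LUN [INTERVAL]
# Hypothetical helper: poll "getlun LUN -bind" until "Prct Bound" hits 100.
# INTERVAL is the number of seconds between polls (default 60).
wait_bound() {
    dev=$1
    lun=$2
    interval=${3:-60}
    while :; do
        pct=$(navicli -d "$dev" getlun "$lun" -bind |
            awk -F': *' '$1 == "Prct Bound" { print $2 }')
        echo "LUN $lun: ${pct:-?}% bound"
        [ "$pct" = "100" ] && return 0
        sleep "$interval"
    done
}
```

For the bind above that would be: wait_bound c2t0d4s2 9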
unbind
unbind destroys a LUN.
syntax: navicli -d [device] unbind [LUN] <-o> <options>
The -o option prevents navicli from prompting you "do you really want to
unbind? blah blah". Sometimes unbind doesn't remove the raid group
(perhaps I was imagining that, but let's be safe, heh), so note the raid group first:
# navicli -d c2t0d4s2 getlun 9 -rg
RAIDGroup ID: 9
#
Note: the raid group ID will sometimes differ from the LUN number. Don't assume
anything.
# navicli -d c2t0d4s2 unbind 9 -o
# navicli -d c2t0d4s2 removerg 9 -o
#
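Because the raid group ID has to be looked up before the LUN is gone, the order of those steps matters. The sketch below looks up the raid group and then only prints the two destructive commands so you can eyeball them before running anything. unbind_plan is an invented name, and it does not check whether other LUNs share the raid group; that part is still on you.

```shell
# unbind_plan DEVICE LUN
# Hypothetical helper: look up the LUN's raid group with "getlun LUN -rg",
# then print (NOT run) the unbind and removerg commands for review.
# It does not know whether other LUNs live in the same raid group.
unbind_plan() {
    dev=$1
    lun=$2
    rg=$(navicli -d "$dev" getlun "$lun" -rg |
        awk -F': *' '$1 == "RAIDGroup ID" { print $2 }')
    if [ -z "$rg" ]; then
        echo "could not determine raid group for LUN $lun" >&2
        return 1
    fi
    echo "navicli -d $dev unbind $lun -o"
    echo "navicli -d $dev removerg $rg -o"
}
```

Printing instead of executing is deliberate here: both commands destroy things, so a human gets the last look.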
getdisk
getdisk gets information about a disk (gee, that's a surprise): capacity and a bunch
of other stuff.
syntax: navicli -d [device] getdisk [disk] <options>
# navicli -d c2t0d4s2 getdisk 0_0
Enclosure 0 Disk 0
Vendor Id: SEAGATE
Product Id: ST136403 CLAR36
Product Revision: 3844
Lun: 0
Type: 0: RAID5
State: Enabled
Hot Spare: 0: NO
Prct Rebuilt: 0: 100
Prct Bound: 0: 100
Serial Number: LT321123
Sectors: 69070464 (33725)
Capacity: 35458
Private: 0: 184320
Bind Signature: 0x13ac, 0, 0
Hard Read Errors: 0
Hard Write Errors: 0
Soft Read Errors: 0
Soft Write Errors: 0
Read Retries: 2714
Write Retries: 51
Remapped Sectors: 0
Number of Reads: 5270941
Number of Writes: 2810246
Number of Luns: 1
Raid Group ID: 0
#
This disk is a 36 gig disk (35458 meg from the Capacity line), and it's involved in a
raid 5
set on LUN 0, raid group 0, etc, etc.
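The error counters in that output are the part worth scripting. As a sketch, the function below adds up the two hard error counters so a nonzero result can flag a disk for a closer look. disk_hard_errors is a made-up name, and it assumes the "Hard Read Errors" and "Hard Write Errors" labels appear exactly as in the output above.

```shell
# disk_hard_errors
# Hypothetical helper: read getdisk output on stdin and print the sum of the
# "Hard Read Errors" and "Hard Write Errors" counters (0 if neither is found).
disk_hard_errors() {
    awk -F': *' '
        $1 == "Hard Read Errors" || $1 == "Hard Write Errors" { total += $2 }
        END { print total + 0 }'
}
```

Usage would look like: navicli -d c2t0d4s2 getdisk 0_0 | disk_hard_errors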
trespass
trespass fails over all the LUNs on a failed SP to the functional SP.
syntax: navicli -d [device] trespass all
You don't have to use trespass all; you can specify individual LUNs instead. Here's an
example: say sp-a fails on cabinet 1 and we can no longer see the LUNs on it
in Solaris. Run:
# navicli -d c4t1d5s2 trespass all
#
There we are. Heh, another important thing: notice how I used c4, because
sp-a failed and we could no longer see those disks in Solaris? If I had used
-d c2t0d4s2, it wouldn't have worked. Along the same lines, if you happen to
unbind a LUN that you have specified as a device in /etc/Navisphere/agent.config,
you need to look at format, find another disk that's on that controller, and change the
entry for that SP to that disk. It doesn't matter which disk, as long as it's on the same controller.
getlog
getlog gets the SP log, which is useful for troubleshooting (and for sending to
EMC when you get really scary errors and/or failures).
syntax: navicli -d [device] getlog
# navicli -d c2t0d4s2 getlog
lots of output, heh
#
references
http://www.cuddletech.com/veritas/raidtheory.html
http://www.emc.com/products/systems/clariion.jsp?openfolder=storage_systems
[1] Storage Processor. The component that handles all of the raid.
[2] Logical Unit Number. I swear there used to be a LUN node.
The SP has the interfaces on it.
[3] Application Transparent Failover. A piece of software
that sits between the fibre channel drivers and the operating system that allows
for automatic failover of a LUN from one controller to another without interrupting
any applications accessing filesystems that exist on said LUN.
[4] Meaning disk controller in Solaris. Related to which HBA(s) the SPs on the cabinet are attached to.
[5] Enclosure 0 is the bottom enclosure. Disk 0 is at the left of the
enclosure.