Tuesday, July 29, 2008

AoE

Today I'll show how to check the status of a RAID (Coraid) over ATA over Ethernet (AoE).

For this we will use the Coraid Ethernet Console client (cec).

Once it is unpacked and compiled, we can run it to see the status:


datos:~/cec-8# ./cec -s 1 eth1
Probing for shelves ... shelf 1 found.
connecting ... done.
Escape is Ctrl-\

SR shelf 1> show -l
1.0 500.108GB up
1.1 500.108GB up
1.2 500.108GB up
1.3 500.108GB up
1.4 500.108GB up
1.5 500.108GB up
1.6 500.108GB up
1.7 500.108GB up
1.8 0.000GB down
1.9 0.000GB down
1.10 0.000GB down
1.11 0.000GB down
1.12 0.000GB down
1.13 0.000GB down
1.14 0.000GB down
SR shelf 1> list
1 2000.431GB online
2 1000.216GB online
SR shelf 1> list -l
1 2000.431GB online
1.0 2000.431GB raid5 normal
1.0.0 normal 500.108GB 1.0
1.0.1 normal 500.108GB 1.1
1.0.2 normal 500.108GB 1.2
1.0.3 normal 500.108GB 1.3
1.0.4 normal 500.108GB 1.4
2 1000.216GB online
2.0 1000.216GB raid5 normal
2.0.0 normal 500.108GB 1.5
2.0.1 normal 500.108GB 1.6
2.0.2 normal 500.108GB 1.7


I wrote a small expect script to check whether the RAID state is "online". It launches cec, runs list and exits.

#!/usr/bin/expect -f
set send_slow {1 .1}
proc send {ignore arg} {
sleep .3
exp_send -s -- $arg
}

set timeout -1
spawn /root/cec-8/cec -s 1 eth1
match_max 100000
expect -exact "Probing for shelves ... shelf 1 found.\r
connecting ... done.\r
Escape is Ctrl-\\\r
"
send -- "\r"
expect -exact "\r\r
SR shelf 1> "
send -- "list\r"
send -- "\r"
expect -exact "\r\r
SR shelf 1> "
# Send the cec escape character Ctrl-\ (octal 034) to reach the ">>> " prompt
send -- "\034"
expect -exact ">>> "
send -- "q\r"
expect eof
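
The script above only drives cec; the "online" check still has to be made on what it prints. Here is a minimal sketch of that step. It assumes the LUN lines look like the ones in the list output shown earlier ("1 2000.431GB online"); the function name and the script path in the usage comment are hypothetical.

```shell
# Reads a captured cec session on stdin and prints every LUN line
# whose state is not "online". Exit status is 0 only if none is found.
# Assumption: LUN lines have exactly three fields, "<lun> <size>GB <state>".
cec_offline_luns() {
    tr -d '\r' | awk '
        $1 ~ /^[0-9]+$/ && $2 ~ /GB$/ && NF == 3 {
            if ($3 != "online") { print; bad = 1 }
        }
        END { exit (bad ? 1 : 0) }
    '
}

# Hypothetical usage (the script path is an assumption):
#   /root/check_cec.exp | cec_offline_luns || echo "RAID problem"
```

The carriage returns are stripped first because the session comes from a terminal and each line ends in \r.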

Saturday, July 12, 2008

RAID

Today I'll go over the utilities for checking the status of several RAID controllers.
Not all of them, which would be impossible, but the ones I monitor daily and have found tools for.

3ware

Its tool is tw_cli:


//backup> /c5 show all
/c5 Driver Version = 2.26.02.008
/c5 Model = 9550SXU-8LP
/c5 Available Memory = 112MB
/c5 Firmware Version = FE9X 3.04.00.005
/c5 Bios Version = BE9X 3.04.00.002
...
/c5 Number of Ports = 8
/c5 Number of Drives = 4
/c5 Number of Units = 1
/c5 Total Optimal Units = 1
/c5 Not Optimal Units = 0
...
/c5 Controller Bus Speed = 66 Mhz

Unit UnitType Status %RCmpl %V/I/M Stripe Size(GB) Cache AVrfy
------------------------------------------------------------------------------
u0 RAID-5 OK - - 64K 2095.44 ON OFF

Port Status Unit Size Blocks Serial
---------------------------------------------------------------
p0 OK u0 698.63 GB 1465149168 5QD0D6RW
p1 OK u0 698.63 GB 1465149168 5QD0D8B5
p2 OK u0 698.63 GB 1465149168 5QD0D6TG
p3 OK u0 698.63 GB 1465149168 5QD0D6TK
p4 NOT-PRESENT - - - -
..
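
For daily monitoring, the unit table is the part that matters. This is a rough sketch that filters it, assuming the column order shown above (unit name, type, status); the function name is my own.

```shell
# Reads "show" output from tw_cli on stdin and prints the units
# (lines whose first field is u<N>) whose Status column is not "OK".
# Exit status is 0 only if every unit is OK.
tw_bad_units() {
    awk '
        $1 ~ /^u[0-9]+$/ && $3 != "OK" { print $1, $3; bad = 1 }
        END { exit (bad ? 1 : 0) }
    '
}

# Hypothetical usage, assuming tw_cli accepts the command as arguments:
#   tw_cli /c5 show | tw_bad_units || echo "3ware unit not OK"
```

Port lines (p0, p1, ...) are skipped because their first field does not start with "u".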


ADAPTEC

We have two tools, depending on the controller's firmware version. If it has old firmware, we will use afacli.

First we must create /dev/afa0. To do that, look up the major number with:
grep aac /proc/devices

and then run mknod /dev/afa0 c MAJOR 0, where MAJOR is the number that grep reports.
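
The two steps can be chained so the major number does not have to be copied by hand. This is only a sketch: the function name is hypothetical, and it assumes the driver registers itself as "aac" in /proc/devices, as the grep above implies.

```shell
# Prints the major number registered for the "aac" character device.
# Input is text in /proc/devices format: "<major> <name>" per line.
aac_major() {
    awk '$2 == "aac" { print $1; exit }'
}

# Hypothetical usage (run as root):
#   major=$(aac_major < /proc/devices)
#   [ -n "$major" ] && mknod /dev/afa0 c "$major" 0
```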


:# afacli
CLI > open afa0
Executing: open "afa0"

AFA0> container list

Executing: container list
Num Total Oth Chunk Scsi Partition
Label Type Size Ctr Size Usage B:ID:L Offset:Size
----- ------ ------ --- ------ ------- ------ -------------
0 RAID-5 1.09TB 64KB Valid 0:00:0 64.0KB: 279GB
/dev/sda RAID5_5x300 0:01:0 64.0KB: 279GB
0:05:0 64.0KB: 279GB
0:03:0 64.0KB: 279GB
0:04:0 64.0KB: 279GB

1 Volume 279GB Valid 0:02:0 64.0KB: 279GB
/dev/sdb VOLUME_1X300



For controllers with newer firmware (you will know because afacli complains that the DLL it has is too old), you have to use arcconf, a utility from StorMan.

~# /usr/src/prueba/usr/StorMan/arcconf GETCONFIG 1 AD
sh: /bin/sort: No existe el fichero o el directorio
Controllers found: 1
----------------------------------------------------------------------
Controller information
----------------------------------------------------------------------
Controller Status : Optimal
Channel description : SCSI
Controller Model : Adaptec 2020ZCR
Controller Serial Number : BAD0
Physical Slot : 2
Installed memory : 64 MB
Copyback : Disabled
Background consistency check : Disabled
Automatic Failover : Enabled
Defunct disk drive count : 0
Logical devices/Failed/Degraded : 2/0/0
--------------------------------------------------------
Controller Version Information
--------------------------------------------------------
BIOS : 5.1-0 (8458)
Firmware : 5.1-0 (8458)
Driver : 1.1-5 (2409)
Boot Flash : 0.0-0 (0)
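
Two lines of this report are enough for a health check: "Controller Status" and the "Logical devices/Failed/Degraded" counters. Below is a minimal sketch that reads them, assuming the "label : value" layout shown above; the function name is made up for the example.

```shell
# Reads "arcconf GETCONFIG <n> AD" output on stdin.
# Succeeds only if Controller Status is Optimal and the
# "Logical devices/Failed/Degraded" line shows 0 failed and 0 degraded.
arcconf_ok() {
    awk -F' *: *' '
        /Controller Status/ { status = $2; sub(/[ \t\r]+$/, "", status) }
        /Logical devices\/Failed\/Degraded/ {
            split($2, n, "/")
            failed = n[2] + 0; degraded = n[3] + 0; seen = 1
        }
        END {
            ok = (status == "Optimal" && seen && failed == 0 && degraded == 0)
            exit !ok
        }
    '
}

# Hypothetical usage:
#   arcconf GETCONFIG 1 AD | arcconf_ok || echo "Adaptec controller needs attention"
```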


Compaq Computer Corporation Smart Array

We will use arrayprobe:

correo:~# /usr/bin/arrayprobe -r|tail -3
Logical drive 0 on controller /dev/cciss/c0d0 has state 0
Logical drive 1 on controller /dev/cciss/c0d0 has state 0
OK Arrayprobe All controllers ok
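
In the sample above the report ends with a one-line summary that starts with "OK". A small sketch that relies only on that last line; I am assuming any other prefix means trouble, and the function name is hypothetical.

```shell
# Reads arrayprobe output on stdin; succeeds only if the final
# summary line starts with "OK", as in the sample above.
arrayprobe_ok() {
    tail -n 1 | grep -q '^OK'
}

# Hypothetical usage:
#   /usr/bin/arrayprobe -r | arrayprobe_ok || echo "Smart Array problem"
```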


MegaRAID

We will use the MegaCli tool:


#:/opt/MegaRAID/MegaCli # /opt/MegaRAID/MegaCli/MegaCli -LDInfo -LALL -aALL


Adapter 0 -- Virtual Drive Information:
Virtual Disk: 0 (target id: 0)
Name:sistema
RAID Level: Primary-1, Secondary-0, RAID Level Qualifier-0
Size:714880MB
State: Optimal
Stripe Size: 64kB
Number Of Drives:2
Span Depth:1
Default Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Access Policy: Read/Write
Disk Cache Policy: Disk's Default
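
The "State:" line of each virtual disk is what I look at in this output. Here is a sketch that extracts it, assuming each disk block starts with a "Virtual Disk: <n>" line as shown above; the function name is my own.

```shell
# Reads "MegaCli -LDInfo -LALL -aALL" output on stdin and prints each
# virtual disk whose State is not "Optimal".
# Exit status is 0 only if every State line says Optimal.
megacli_bad_vds() {
    awk '
        /^[ \t]*Virtual Disk:/ { vd = $3 }
        /^[ \t]*State:/ {
            state = $0
            sub(/^[ \t]*State:[ \t]*/, "", state)
            sub(/[ \t\r]+$/, "", state)
            if (state != "Optimal") { print "VD " vd ": " state; bad = 1 }
        }
        END { exit (bad ? 1 : 0) }
    '
}

# Hypothetical usage:
#   /opt/MegaRAID/MegaCli/MegaCli -LDInfo -LALL -aALL | megacli_bad_vds
```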


#:/opt/MegaRAID/MegaCli # /opt/MegaRAID/MegaCli/MegaCli -PDList -aALL

Adapter #0

Enclosure Device ID: N/A
Slot Number: 0
Device Id: 0
Sequence Number: 2
...
Last Predictive Failure Event Seq Number: 0
Raw Size: 715404MB [0x575466f0 Sectors]
...
Inquiry Data: ATA ST3750640NS E 3QD0XYL2

Enclosure Device ID: N/A
Slot Number: 1
Device Id: 1
Sequence Number: 2
...
Last Predictive Failure Event Seq Number: 0
Raw Size: 715404MB [0x575466f0 Sectors]
...
Connected Port Number: 1
Inquiry Data: ATA ST3750640NS E 3QD0ZKAX



Some ADAPTEC, ADAPTEC DPT and SmartRaid V controllers require dpt-i2o-raidutils (raidutils); I have no example for these.

And finally, for software RAID:

~# mdadm --detail /dev/md0
/dev/md0:
Version : 00.90.01
Creation Time : Fri Jul 11 11:46:30 2008
Raid Level : raid1
Array Size : 995904 (972.73 MiB 1019.81 MB)
Device Size : 995904 (972.73 MiB 1019.81 MB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 0
Persistence : Superblock is persistent

Update Time : Sat Jul 12 06:25:46 2008
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0

UUID : cc3a522e:25fd72ba:1f0687a3:f9661a16
Events : 0.34

Number Major Minor RaidDevice State
0 8 17 0 active sync /dev/sdb1
1 8 33 1 active sync /dev/sdc1
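
For a script, the "Failed Devices" field of this report is the easiest one to test. A minimal sketch, assuming the "label : value" layout shown above; the function name is hypothetical.

```shell
# Reads "mdadm --detail /dev/mdX" output on stdin and prints the value
# of the "Failed Devices" field. Exit status is 0 only when the field
# is present and its value is 0.
md_failed_devices() {
    awk -F' *: *' '
        /Failed Devices/ { n = $2 + 0; found = 1; print n }
        END { exit ((found && n == 0) ? 0 : 1) }
    '
}

# Hypothetical usage:
#   mdadm --detail /dev/md0 | md_failed_devices > /dev/null || echo "md0 has failed devices"
```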


and another command to count the arrays that are not optimal:

cat /proc/mdstat | egrep '(U_|_U)' | wc -l
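
The count says how many arrays are degraded but not which ones. This variant prints their names. It is a sketch that assumes the usual /proc/mdstat layout, where the "mdN : ..." line comes before the line carrying the [UU] status brackets; the function name is made up.

```shell
# Reads /proc/mdstat-format text on stdin and prints the name of every
# md array whose status brackets (e.g. [U_]) contain an underscore,
# i.e. at least one member is missing.
md_degraded_arrays() {
    awk '
        /^md[0-9]+ *:/ { name = $1 }
        /\[[U_]*_[U_]*\]/ { print name }
    '
}

# Hypothetical usage:
#   md_degraded_arrays < /proc/mdstat
```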