R720xd with PERC H310 Mini, 18 drives.
16 of the drives are non-RAID.
2 drives are RAID 1 (they are the last two drives on the bus, bays 16 and 17).
OS is Red Hat 6.7.
OMSA has started populating the messages file with entries like this:
Server_Administrator: 4530 2095 - Storage Service Severity: Informational, Category: Storage, MessageID: STOR0210, Message: SCSI sense data (Sense key: B Sense code: 4B Sense qualifier: 3) received from Physical Disk 0:1:12 on Controller 0 at Connector 0.
Which disk reports the error changes.
On a totally idle system I wrote a loop to exercise the drives.
Against each drive on the system it executes:
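# List every sd* device that lvmdiskscan reports, space separated
drives=$( lvmdiskscan | grep sd | awk '{printf "%s ",$1}' )
for dx in $drives
do
    # Mark the log so OMSA events can be matched to the drive being read
    echo "Exercising $dx" | tee -a /var/log/messages
    lsscsi 2>&1 | grep "$( echo "$dx" | cut -b1-8 )" | tee -a /var/log/messages
    # Read 1,000,000 blocks of 1024 bytes (about 1 GB) and discard them
    dd if="$dx" of=/dev/null bs=1024 count=1000000
    echo
done
echo "Exercising done" | tee -a /var/log/messages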
Synopsis: It reads from each disk and logs which disk it's working on to /var/log/messages. In parallel, the OMSA software logs any storage events to /var/log/messages.
Output looks like this:
Exercising /dev/sda1
[0:0:0:0] disk SEAGATE ST91000640SS AS0B /dev/sda
Feb 5 13:11:13 Server_Administrator: 4530 2095 - Storage Service Severity: Informational, Category: Storage, MessageID: STOR0210, Message: SCSI sense data (Sense key: B Sense code: 4B Sense qualifier: 3) received from Physical Disk 0:1:0 on Controller 0 at Connector 0.
Exercising /dev/sdq1
[0:2:0:0] disk DELL PERC H310 2.12 /dev/sdq
Exercising /dev/sdq2
[0:2:0:0] disk DELL PERC H310 2.12 /dev/sdq
Exercising /dev/sdb1
[0:0:1:0] disk SEAGATE ST91000640SS AS0B /dev/sdb
Exercising /dev/sdc1
[0:0:2:0] disk SEAGATE ST91000640SS AS0B /dev/sdc
Exercising /dev/sdd1
[0:0:3:0] disk SEAGATE ST91000640SS AS0B /dev/sdd
Exercising /dev/sde1
[0:0:4:0] disk SEAGATE ST91000640SS AS0B /dev/sde
Exercising /dev/sdf1
[0:0:5:0] disk SEAGATE ST91000640SS AS0B /dev/sdf
Exercising /dev/sdg1
[0:0:6:0] disk SEAGATE ST91000640SS AS0B /dev/sdg
Exercising /dev/sdh1
[0:0:7:0] disk SEAGATE ST91000640SS AS0B /dev/sdh
Feb 5 13:12:17 Server_Administrator: 4530 2095 - Storage Service Severity: Informational, Category: Storage, MessageID: STOR0210, Message: SCSI sense data (Sense key: B Sense code: 4B Sense qualifier: 3) received from Physical Disk 0:1:7 on Controller 0 at Connector 0.
Feb 5 13:12:21 Server_Administrator: 4530 2095 - Storage Service Severity: Informational, Category: Storage, MessageID: STOR0210, Message: SCSI sense data (Sense key: B Sense code: 4B Sense qualifier: 3) received from Physical Disk 0:1:7 on Controller 0 at Connector 0.
Exercising /dev/sdi1
[0:0:8:0] disk SEAGATE ST91000640SS AS0B /dev/sdi
Feb 5 13:12:31 Server_Administrator: 4530 2095 - Storage Service Severity: Informational, Category: Storage, MessageID: STOR0210, Message: SCSI sense data (Sense key: B Sense code: 4B Sense qualifier: 3) received from Physical Disk 0:1:8 on Controller 0 at Connector 0.
Exercising /dev/sdj1
[0:0:9:0] disk SEAGATE ST91000640SS AS0B /dev/sdj
Exercising /dev/sdk1
[0:0:10:0] disk SEAGATE ST91000640SS AS0B /dev/sdk
Exercising /dev/sdl1
[0:0:11:0] disk SEAGATE ST91000640SS AS0B /dev/sdl
Exercising /dev/sdm1
[0:0:12:0] disk SEAGATE ST91000640SS AS0B /dev/sdm
Exercising /dev/sdn1
[0:0:13:0] disk SEAGATE ST91000640SS AS0B /dev/sdn
Exercising /dev/sdo1
[0:0:14:0] disk SEAGATE ST91000640SS AS0B /dev/sdo
Exercising /dev/sdp1
[0:0:15:0] disk SEAGATE ST91000640SS AS0B /dev/sdp
In this particular run the drives in bays 0, 7 and 8 complained. On other runs it may be other drives, and the number of drives that flag the issue also varies.
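The pairing of each message with the drive being read can also be done mechanically rather than by eye. The following is a minimal sketch, assuming the log has the same layout as the output above (an unprefixed "Exercising /dev/..." line followed by any STOR0210 entries); the function name attribute_events is hypothetical.

```shell
# attribute_events [logfile]
# Print "<device being exercised> <physical disk in the STOR0210 message>"
# for every STOR0210 entry, using the most recent "Exercising" line as context.
attribute_events() {
    awk '
        BEGIN { current = "-" }                     # no drive exercised yet
        /^Exercising \/dev\// { current = $2; next }
        /MessageID: STOR0210/ {
            if (match($0, /Physical Disk [0-9:]+/)) {
                # Skip the 14 characters of "Physical Disk " to keep only the ID
                print current, substr($0, RSTART + 14, RLENGTH - 14)
            }
        }
    ' "${1:-/var/log/messages}"
}
```

Run as `attribute_events /var/log/messages`. A message whose disk ID does not match the drive being read at that moment would suggest the events are not tied to the load on that particular drive.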
I've updated all the firmware available via the Lifecycle Controller by pointing it at ftp.dell.com.
All the drives are Dell certified.
OMSA is at 8.2.
Tried downgrading to OMSA 7.4, but still received the same messages.
Sometimes the sense qualifier value is 2; other times it is 3.
In the month of January over 17,000 messages were generated to /var/log/messages.
Mixed in with these are also 83 messages in the format:
Storage Service Controller event log: Unexpected sense: Encl PD 20 Path 5e4ae0208df0bb00, CDB: 1c 01 a0 00 04 00, Sense: 5/24/00: Controller 0 (PERC H310 Mini)
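To see whether the STOR0210 messages cluster on particular bays over a longer period, they can be tallied per physical disk. This is a minimal sketch, assuming the message format shown above; the function name count_stor0210 is hypothetical.

```shell
# count_stor0210 [logfile]
# Count STOR0210 entries per physical disk ID, most frequent first.
count_stor0210() {
    log="${1:-/var/log/messages}"
    grep 'MessageID: STOR0210' "$log" |
        sed -n 's/.*received from Physical Disk \([0-9:]*\) on Controller.*/\1/p' |
        sort | uniq -c | sort -rn
}
```

Run as `count_stor0210 /var/log/messages`. A roughly even spread across all bays would point away from individual drives and toward something shared, such as the controller, backplane or cabling, though that is only a reading of the counts and not a diagnosis.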
The 5/24/0 messages do not occur often and, per Googling, are considered informational and not of real concern. I'm more concerned about the messages shown above. I have reports that when end users are allowed to use the box, it lags behind all the other boxes doing the same work. The other boxes have identical configs and do not generate these messages.
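For reference, the sense triples quoted in this post can be turned into their standard names. This is a small sketch that covers only the values seen here; the names are an assumption based on the T10 sense key and ASC/ASCQ assignments and should be checked against the current T10 table. The function name decode_sense is hypothetical.

```shell
# decode_sense KEY ASC ASCQ   (hex digits as printed in the log, e.g. B 4B 3)
# Only the triples quoted in this post are listed; the names are assumed to
# follow the T10 sense key and ASC/ASCQ tables.
decode_sense() {
    key=$(printf '%s' "$1" | tr 'a-f' 'A-F')
    asc=$(printf '%s' "$2" | tr 'a-f' 'A-F')
    ascq=$(printf '%s' "$3" | tr 'a-f' 'A-F')
    # Pad single digits so "3" and "03" compare equal
    case "$asc" in ?) asc="0$asc" ;; esac
    case "$ascq" in ?) ascq="0$ascq" ;; esac
    case "$key" in
        5) key_txt="ILLEGAL REQUEST" ;;
        B) key_txt="ABORTED COMMAND" ;;
        *) key_txt="sense key $key (not in this table)" ;;
    esac
    case "$asc/$ascq" in
        24/00) asc_txt="INVALID FIELD IN CDB" ;;
        4B/02) asc_txt="TOO MUCH WRITE DATA" ;;
        4B/03) asc_txt="ACK/NAK TIMEOUT" ;;
        *) asc_txt="ASC/ASCQ $asc/$ascq (not in this table)" ;;
    esac
    printf '%s: %s\n' "$key_txt" "$asc_txt"
}
```

For example, `decode_sense B 4B 3` decodes the STOR0210 triple and `decode_sense 5 24 00` decodes the one from the controller event log.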
Any thoughts/advice would be much appreciated.