Hi everyone,
we have a fresh install at a customer site:
3 HS22V in BladeCenter H
each with a QLogic 8 Gb mezzanine card (QMI 2582 according to the driver details in /proc/scsi/qla2xxx),
ESXi 5 U1 installed on a USB key,
using the driver from the base install (9.01),
because driver 9.11 causes the system to freeze (there is a VMware KB on this, so we cannot update),
connected via FC to Storwize V7000.
Installation went smoothly.
Zoning OK (checked with IBM Support)
LUN presentation OK (for now 2 LUNs, LUN 1 and LUN 2, 2 TB each)
Disks are seen correctly from the hosts
The problem we are facing is high latency when using the datastores.
Example: when we install a new Windows 2008 R2 VM (for test purposes),
datastore write latency randomly spikes to 2000-2500 ms
and read latency goes beyond 500-600 ms.
Checking vmkernel.log, we see messages like these:
2012-04-05T08:12:55.597Z cpu0:4120)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x28 (0x4124003cdf80, 16269) to dev "naa.600507680280830af000000000000001" on path "vmhba2:C0:T1:L1" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2012-04-05T08:12:55.597Z cpu0:4120)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.600507680280830af000000000000001" state in doubt; requested fast path state update...
2012-04-05T08:12:55.597Z cpu0:4120)ScsiDeviceIO: 2309: Cmd(0x4124003cdf80) 0x28, CmdSN 0x80000061 from world 16269 to dev "naa.600507680280830af000000000000001" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2012-04-05T08:13:00.760Z cpu2:4098)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.600507680280830af000000000000001" state in doubt; requested fast path state update...
2012-04-05T08:13:00.760Z cpu2:4098)ScsiDeviceIO: 2309: Cmd(0x412400d698c0) 0x28, CmdSN 0x8000001c from world 16269 to dev "naa.600507680280830af000000000000001" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2012-04-05T08:13:01.671Z cpu0:4120)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.600507680280830af000000000000001" state in doubt; requested fast path state update...
2012-04-05T08:13:01.671Z cpu0:4120)ScsiDeviceIO: 2309: Cmd(0x4124003cdf80) 0x28, CmdSN 0x80000003 from world 16269 to dev "naa.600507680280830af000000000000001" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2012-04-05T08:13:02.076Z cpu0:4120)ScsiDeviceIO: 2309: Cmd(0x4124003cdf80) 0x28, CmdSN 0x8000002d from world 16269 to dev "naa.600507680280830af000000000000001" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2012-04-05T08:13:02.582Z cpu0:4120)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.600507680280830af000000000000001" state in doubt; requested fast path state update...
2012-04-05T08:13:02.582Z cpu0:4120)ScsiDeviceIO: 2309: Cmd(0x4124003cdf80) 0x28, CmdSN 0x80000045 from world 16269 to dev "naa.600507680280830af000000000000001" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2012-04-05T08:13:09.112Z cpu16:4112)LinNet: map_skb_to_pkt:288: This message has repeated 458752 times: invalid vlan tag: 4095 dropped
2012-04-05T08:13:51.175Z cpu2:4098)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x28 (0x4124003cce80, 16269) to dev "naa.600507680280830af000000000000001" on path "vmhba2:C0:T1:L1" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2012-04-05T08:13:51.175Z cpu2:4098)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.600507680280830af000000000000001" state in doubt; requested fast path state update...
2012-04-05T08:13:51.175Z cpu2:4098)ScsiDeviceIO: 2309: Cmd(0x4124003cce80) 0x28, CmdSN 0x8000003d from world 16269 to dev "naa.600507680280830af000000000000001" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2012-04-05T08:13:53.908Z cpu1:4097)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.600507680280830af000000000000001" state in doubt; requested fast path state update...
2012-04-05T08:13:53.908Z cpu1:4097)ScsiDeviceIO: 2309: Cmd(0x4124003cdf80) 0x28, CmdSN 0x80000055 from world 16269 to dev "naa.600507680280830af000000000000001" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2012-04-05T08:13:54.921Z cpu2:4098)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.600507680280830af000000000000001" state in doubt; requested fast path state update...
2012-04-05T08:13:54.921Z cpu2:4098)ScsiDeviceIO: 2309: Cmd(0x4124003cb280) 0x28, CmdSN 0x80000057 from world 16269 to dev "naa.600507680280830af000000000000001" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2012-04-05T09:04:39.966Z cpu22:4118)WARNING: ScsiDeviceIO: 1218: Device naa.600507680280830af000000000000002 performance has deteriorated. I/O latency increased from average value of 9989 microseconds to 204862 microseconds.
2012-04-05T09:04:39.978Z cpu22:4118)WARNING: ScsiDeviceIO: 1218: Device naa.600507680280830af000000000000002 performance has deteriorated. I/O latency increased from average value of 11029 microseconds to 229335 microseconds.
2012-04-05T09:05:12.221Z cpu22:4118)ScsiDeviceIO: 1198: Device naa.600507680280830af000000000000002 performance has improved. I/O latency reduced from 229335 microseconds to 59624 microseconds.
2012-04-05T13:04:42.045Z cpu23:4119)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.600507680280830af000000000000002" state in doubt; requested fast path state update...
2012-04-05T13:04:42.045Z cpu23:4119)ScsiDeviceIO: 2309: Cmd(0x4124403ae500) 0x28, CmdSN 0x1dec3 from world 5937 to dev "naa.600507680280830af000000000000002" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2012-04-05T13:04:42.349Z cpu23:4119)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.600507680280830af000000000000002" state in doubt; requested fast path state update...
2012-04-05T13:04:43.563Z cpu16:4112)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.600507680280830af000000000000002" state in doubt; requested fast path state update...
2012-04-05T13:04:44.475Z cpu16:4112)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.600507680280830af000000000000002" state in doubt; requested fast path state update...
2012-04-05T13:04:45.082Z cpu16:4112)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x28 (0x4124403ae500, 5937) to dev "naa.600507680280830af000000000000002" on path "vmhba2:C0:T1:L2" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2012-04-05T13:04:45.386Z cpu17:4113)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.600507680280830af000000000000002" state in doubt; requested fast path state update...
2012-04-05T13:04:46.297Z cpu17:4113)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.600507680280830af000000000000002" state in doubt; requested fast path state update...
2012-04-05T13:04:47.512Z cpu17:4113)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.600507680280830af000000000000002" state in doubt; requested fast path state update...
2012-04-05T13:04:47.870Z cpu16:4112)ScsiDeviceIO: 2309: Cmd(0x412440ef69c0) 0x28, CmdSN 0x1dec9 from world 5937 to dev "naa.600507680280830af000000000000002" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2012-04-05T13:04:48.174Z cpu18:4114)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x28 (0x412440df3d80, 5937) to dev "naa.600507680280830af000000000000002" on path "vmhba1:C0:T0:L2" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL
2012-04-05T13:04:48.174Z cpu18:4114)ScsiDeviceIO: 2309: Cmd(0x412440df3d80) 0x28, CmdSN 0x1decc from world 5937 to dev "naa.600507680280830af000000000000002" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2012-04-05T13:04:48.478Z cpu18:4114)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.600507680280830af000000000000002" state in doubt; requested fast path state update...
2012-04-05T13:04:48.781Z cpu18:4114)ScsiDeviceIO: 2309: Cmd(0x412440ef5cc0) 0x28, CmdSN 0x1ded1 from world 5937 to dev "naa.600507680280830af000000000000002" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2012-04-05T13:04:49.389Z cpu18:4114)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.600507680280830af000000000000002" state in doubt; requested fast path state update...
2012-04-05T13:04:49.389Z cpu18:4114)ScsiDeviceIO: 2309: Cmd(0x412440e70d00) 0x28, CmdSN 0x1ded4 from world 5937 to dev "naa.600507680280830af000000000000002" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2012-04-05T13:04:49.693Z cpu18:4114)ScsiDeviceIO: 2309: Cmd(0x412440ef5bc0) 0x28, CmdSN 0x1ded6 from world 5937 to dev "naa.600507680280830af000000000000002" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2012-04-05T13:04:49.996Z cpu18:4114)ScsiDeviceIO: 2309: Cmd(0x412440ef56c0) 0x28, CmdSN 0x1ded8 from world 5937 to dev "naa.600507680280830af000000000000002" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
Occasionally we also see:
WARNING: ScsiDeviceIO: 1218: Device naa.600507680280830af000000000000001 performance has deteriorated. I/O latency increased from average value of 248 microseconds to 7629 microseconds
So storage performance is far from optimal.
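To put numbers on those warnings, here is a minimal Python sketch that pulls the before/after latency out of the "performance has deteriorated" lines. The regex is written only against the lines pasted above, and the function name latency_spikes is just an illustrative helper, not part of any VMware tool.

```python
import re

# Pattern written against the "performance has deteriorated" warnings
# pasted above; other ESXi builds may word the message differently.
DETERIORATED = re.compile(
    r"Device (?P<dev>naa\.[0-9a-f]+) performance has deteriorated\. "
    r"I/O latency increased from average value of (?P<avg>\d+) microseconds "
    r"to (?P<now>\d+) microseconds"
)

def latency_spikes(lines):
    """Return (device, average_ms, current_ms, ratio) for each warning line."""
    spikes = []
    for line in lines:
        m = DETERIORATED.search(line)
        if not m:
            continue
        avg_us = int(m.group("avg"))
        now_us = int(m.group("now"))
        # Guard against a zero average so the ratio never divides by zero.
        ratio = now_us / avg_us if avg_us else float("inf")
        spikes.append((m.group("dev"), avg_us / 1000, now_us / 1000, ratio))
    return spikes

# Two lines copied from the log excerpt in this post.
sample = [
    "2012-04-05T09:04:39.966Z cpu22:4118)WARNING: ScsiDeviceIO: 1218: Device "
    "naa.600507680280830af000000000000002 performance has deteriorated. I/O latency "
    "increased from average value of 9989 microseconds to 204862 microseconds.",
    "WARNING: ScsiDeviceIO: 1218: Device naa.600507680280830af000000000000001 "
    "performance has deteriorated. I/O latency increased from average value of "
    "248 microseconds to 7629 microseconds",
]

for dev, avg_ms, now_ms, ratio in latency_spikes(sample):
    print(f"{dev}: {avg_ms:.1f} ms -> {now_ms:.1f} ms ({ratio:.1f}x)")
```

Running it over the whole vmkernel.log (for example with open("vmkernel.log") as the lines argument) should show which LUN spikes most often and by how much.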
I looked up the message H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0
in the VMware KB. The host status 0x2 seems to mean Host Busy,
but busy with what?
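For anyone decoding these by hand, a rough Python sketch that parses the H:/D:/P: triple from the nmp_ThrottleLogForDevice lines and counts failures per path. The status tables are my assumption: they follow the usual SCSI host byte and opcode naming as I remember it (0x2 as BUS_BUSY, 0x28 as READ(10)), so please check them against the KB. The name failures_by_path is made up for this example.

```python
import re
from collections import Counter

# Assumed tables: usual SCSI host-byte and opcode names, partial on purpose.
HOST_STATUS = {0x0: "OK", 0x1: "NO_CONNECT", 0x2: "BUS_BUSY", 0x3: "TIMEOUT",
               0x5: "ABORT", 0x7: "ERROR", 0x8: "RESET"}
OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)"}

# Pattern written against the nmp_ThrottleLogForDevice lines in this post.
THROTTLE = re.compile(
    r'Cmd (?P<op>0x[0-9a-fA-F]+) \(.*?\) to dev "(?P<dev>naa\.[0-9a-f]+)" '
    r'on path "(?P<path>[^"]+)" Failed: '
    r'H:(?P<h>0x[0-9a-fA-F]+) D:(?P<d>0x[0-9a-fA-F]+) P:(?P<p>0x[0-9a-fA-F]+)'
)

def failures_by_path(lines):
    """Count failed commands per (path, opcode name, host status name)."""
    counts = Counter()
    for line in lines:
        m = THROTTLE.search(line)
        if not m:
            continue
        op = int(m.group("op"), 16)
        host = int(m.group("h"), 16)
        key = (m.group("path"),
               OPCODES.get(op, hex(op)),           # fall back to the raw hex value
               HOST_STATUS.get(host, hex(host)))
        counts[key] += 1
    return counts

# Three throttled lines copied from the excerpt above (timestamps trimmed).
sample = [
    'NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x28 (0x4124003cdf80, 16269) to dev "naa.600507680280830af000000000000001" on path "vmhba2:C0:T1:L1" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL',
    'NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x28 (0x4124403ae500, 5937) to dev "naa.600507680280830af000000000000002" on path "vmhba2:C0:T1:L2" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL',
    'NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x28 (0x412440df3d80, 5937) to dev "naa.600507680280830af000000000000002" on path "vmhba1:C0:T0:L2" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0. Act:EVAL',
]

for (path, op, host), n in sorted(failures_by_path(sample).items()):
    print(f"{path}: {n} x {op} failed with host status {host}")
```

In the excerpt above, the throttled lines name both vmhba1 and vmhba2, so the failures do not look tied to a single HBA port, and every failed command is opcode 0x28 (a read).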
We also tried installing ESXi 4.1 U2 (with driver 8.41): same issue.
We tried changing the qla2xxx queue depth from 32 to 64 as described in the KB: same issue.
A colleague has the same problem with a DS3500 storage system (also on Fibre Channel).
So maybe the issue is not the storage
but the QLogic adapter?
Any suggestions on how to fix this problem
would be much appreciated,
this issue is driving me crazy.
Edit:
We also tried reseating the blades;
the problem persists.