Hi,
I am getting random purple screens (PSOD) on a server. I tested memory and CPU stability with memtest86 and prime and found no errors. I also replaced the power supply and disabled passthrough.
The server is:
Core i5 3470, Asus P8H77-M, 32 GB RAM, LSI MegaRAID 9240-4i, ESXi 5.1.0 build 1021289
The core dump:
2013-06-10T14:43:22.805Z cpu3:4099)NMP: nmp_ThrottleLogForDevice:2319: Cmd 0x1a (0x4124007d9640, 0) to dev "mpx.vmhba36:C0:T0:L0" on path "vmhba36:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0. Act:NONE
2013-06-10T14:43:22.805Z cpu3:4099)ScsiDeviceIO: 2329: Cmd(0x4124007d9640) 0x1a, CmdSN 0xbd4b from world 0 to dev "mpx.vmhba36:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
2013-06-10T14:48:22.806Z cpu1:1406367)NMP: nmp_ThrottleLogForDevice:2319: Cmd 0x1a (0x4124007ca3c0, 0) to dev "mpx.vmhba36:C0:T0:L0" on path "vmhba36:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0. Act:NONE
2013-06-10T14:48:22.806Z cpu1:1406367)ScsiDeviceIO: 2329: Cmd(0x4124007ca3c0) 0x1a, CmdSN 0xbd4c from world 0 to dev "mpx.vmhba36:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
2013-06-10T14:53:22.805Z cpu2:4617)NMP: nmp_ThrottleLogForDevice:2319: Cmd 0x1a (0x4124007d36c0, 0) to dev "mpx.vmhba36:C0:T0:L0" on path "vmhba36:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0. Act:NONE
2013-06-10T14:53:22.805Z cpu2:4617)ScsiDeviceIO: 2329: Cmd(0x4124007d36c0) 0x1a, CmdSN 0xbd4d from world 0 to dev "mpx.vmhba36:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
2013-06-10T14:58:22.804Z cpu1:4097)NMP: nmp_ThrottleLogForDevice:2319: Cmd 0x1a (0x4124007b8cc0, 0) to dev "mpx.vmhba36:C0:T0:L0" on path "vmhba36:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0. Act:NONE
2013-06-10T14:58:22.804Z cpu1:4097)ScsiDeviceIO: 2329: Cmd(0x4124007b8cc0) 0x1a, CmdSN 0xbd50 from world 0 to dev "mpx.vmhba36:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
2013-06-10T14:59:02.540Z cpu1:4172)World: 8381: PRDA 0x418040400000 ss 0x0 ds 0x4018 es 0x4018 fs 0x4018 gs 0x4018
2013-06-10T14:59:02.540Z cpu1:4172)World: 8383: TR 0x4020 GDT 0x412201321000 (0x402f) IDT 0x418023112000 (0xfff)
2013-06-10T14:59:02.540Z cpu1:4172)World: 8384: CR0 0x80010033 CR3 0x2d40d4000 CR4 0x42668
2013-06-10T14:59:02.543Z cpu1:4172)Backtrace for current CPU #1, worldID=4172, ebp=0x41220131bc60
2013-06-10T14:59:02.544Z cpu1:4172)0x41220131bc60:[0x41802307ad2f]PanicvPanicInt@vmkernel#nover+0x56 stack: 0x3000000008, 0x41220131bd
2013-06-10T14:59:02.544Z cpu1:4172)0x41220131bd40:[0x41802307b5d7]Panic@vmkernel#nover+0xae stack: 0x0, 0x0, 0x2fb4aa, 0x41220131be3c,
2013-06-10T14:59:02.544Z cpu1:4172)0x41220131be70:[0x4180230a7b53]TLBDoInvalidate@vmkernel#nover+0x45a stack: 0xd, 0x0, 0x0, 0x0, 0xf4
2013-06-10T14:59:02.545Z cpu1:4172)0x41220131bec0:[0x41802348ae17]UserMem_CartelFlush@<None>#<None>+0xce stack: 0x412200000000, 0x4180
2013-06-10T14:59:02.545Z cpu1:4172)0x41220131bf00:[0x4180234d9f9d]UserMemTouchedEstimate@<None>#<None>+0x124 stack: 0x100000002, 0x418
2013-06-10T14:59:02.545Z cpu1:4172)0x41220131bf60:[0x4180230f8433]WorkInvokeInternalItem@vmkernel#nover+0x46 stack: 0x0, 0x4122011a700
2013-06-10T14:59:02.546Z cpu1:4172)0x41220131bff0:[0x418023048418]helpFunc@vmkernel#nover+0x517 stack: 0x0, 0x0, 0x0, 0x0, 0x0
2013-06-10T14:59:02.546Z cpu1:4172)0x41220131bff8:[0x0]<unknown> stack: 0x0, 0x0, 0x0, 0x0, 0x0
2013-06-10T14:59:02.546Z cpu1:4172)VMware ESXi 5.1.0 [Releasebuild-1021289 x86_64]
PCPU 2 locked up. Failed to ack TLB invalidate (total of 2 locked up, PCPU(s): 0,2).
2013-06-10T14:59:02.546Z cpu1:4172)cr0=0x8001003d cr2=0xd9b40000 cr3=0xcd899000 cr4=0x216c
2013-06-10T14:59:02.546Z cpu1:4172)pcpu:0 world:833498 name:"vmm1:server1" (V)
2013-06-10T14:59:02.546Z cpu1:4172)pcpu:1 world:4172 name:"helper24-0" (SH)
2013-06-10T14:59:02.546Z cpu1:4172)pcpu:2 world:1406315 name:"vmm2:server2" (V)
2013-06-10T14:59:02.546Z cpu1:4172)pcpu:3 world:1406312 name:"vmm0:server2" (V)
2013-06-10T14:59:02.546Z cpu1:4172)@BlueScreen: PCPU 2 locked up. Failed to ack TLB invalidate (total of 2 locked up, PCPU(s): 0,2).
2013-06-10T14:59:02.547Z cpu1:4172)Code start: 0x418023000000 VMK uptime: 26:01:30:39.692
2013-06-10T14:59:02.547Z cpu1:4172)0x41220131bc60:[0x41802307ad2f]PanicvPanicInt@vmkernel#nover+0x56 stack: 0x3000000008
2013-06-10T14:59:02.547Z cpu1:4172)0x41220131bd40:[0x41802307b5d7]Panic@vmkernel#nover+0xae stack: 0x0
2013-06-10T14:59:02.548Z cpu1:4172)0x41220131be70:[0x4180230a7b53]TLBDoInvalidate@vmkernel#nover+0x45a stack: 0xd
2013-06-10T14:59:02.548Z cpu1:4172)0x41220131bec0:[0x41802348ae17]UserMem_CartelFlush@<None>#<None>+0xce stack: 0x412200000000
2013-06-10T14:59:02.548Z cpu1:4172)0x41220131bf00:[0x4180234d9f9d]UserMemTouchedEstimate@<None>#<None>+0x124 stack: 0x100000002
2013-06-10T14:59:02.549Z cpu1:4172)0x41220131bf60:[0x4180230f8433]WorkInvokeInternalItem@vmkernel#nover+0x46 stack: 0x0
2013-06-10T14:59:02.549Z cpu1:4172)0x41220131bff0:[0x418023048418]helpFunc@vmkernel#nover+0x517 stack: 0x0
2013-06-10T14:59:02.549Z cpu1:4172)0x41220131bff8:[0x0]<unknown> stack: 0x0
2013-06-10T14:59:02.550Z cpu1:4172)base fs=0x0 gs=0x418040400000 Kgs=0x0
2013-06-10T14:59:02.550Z cpu1:4172)vmkernel 0x0 .data 0x0 .bss 0x0
2013-06-10T14:59:02.550Z cpu1:4172)chardevs 0x418023470000 .data 0x417fc0000000 .bss 0x417fc00008a0
2013-06-10T14:59:02.550Z cpu1:4172)user 0x418023475000 .data 0x417fc0400000 .bss 0x417fc0413240
2013-06-10T14:59:02.550Z cpu1:4172)vprobe 0x4180234fb000 .data 0x417fc0800000 .bss 0x417fc080cb40
2013-06-10T14:59:02.550Z cpu1:4172)procfs 0x418023532000 .data 0x417fc0c00000 .bss 0x417fc0c00220
2013-06-10T14:59:02.550Z cpu1:4172)procMisc 0x418023535000 .data 0x417fc1000000 .bss 0x417fc1000000
2013-06-10T14:59:02.550Z cpu1:4172)vfat 0x418023536000 .data 0x417fc1400000 .bss 0x417fc14021c0
2013-06-10T14:59:02.550Z cpu1:4172)vmci 0x418023540000 .data 0x417fc1800000 .bss 0x417fc1805120
2013-06-10T14:59:02.550Z cpu1:4172)vmkapi_socket 0x41802355e000 .data 0x417fc1c00000 .bss 0x417fc1c00700
2013-06-10T14:59:02.550Z cpu1:4172)vmkapi_v2_0_0_0_vmkernel_shim 0x418023561000 .data 0x417fc2000000 .bss 0x417fc200b218
2013-06-10T14:59:02.550Z cpu1:4172)vmkplexer 0x418023562000 .data 0x417fc2400000 .bss 0x417fc24004a0
2013-06-10T14:59:02.550Z cpu1:4172)vmklinux_9 0x418023566000 .data 0x417fc2800000 .bss 0x417fc280e480
2013-06-10T14:59:02.550Z cpu1:4172)vmklinux_9_2_0_0 0x4180235de000 .data 0x417fc2c00000 .bss 0x417fc2c0a7f8
2013-06-10T14:59:02.550Z cpu1:4172)vmklinux_9_2_1_0 0x4180235df000 .data 0x417fc3000000 .bss 0x417fc300a9a8
2013-06-10T14:59:02.550Z cpu1:4172)iscsi_trans 0x4180235e0000 .data 0x417fc3400000 .bss 0x417fc3401420
2013-06-10T14:59:02.550Z cpu1:4172)etherswitch 0x4180235eb000 .data 0x417fc3800000 .bss 0x417fc38114c0
2013-06-10T14:59:02.550Z cpu1:4172)netsched 0x418023616000 .data 0x417fc3c00000 .bss 0x417fc3c030c0
2013-06-10T14:59:02.550Z cpu1:4172)cnic_register 0x41802361c000 .data 0x417fc4000000 .bss 0x417fc40002a0
2013-06-10T14:59:02.550Z cpu1:4172)r8168 0x41802361e000 .data 0x417fc4400000 .bss 0x417fc4400740
2013-06-10T14:59:02.550Z cpu1:4172)e1000e 0x41802362d000 .data 0x417fc4800000 .bss 0x417fc48018e0
2013-06-10T14:59:02.550Z cpu1:4172)vmkapi_v2_0_0_0_iscsi_shim 0x41802364c000 .data 0x417fc4c00000 .bss 0x417fc4c00b00
2013-06-10T14:59:02.550Z cpu1:4172)random 0x41802364d000 .data 0x417fc5000000 .bss 0x417fc5000740
2013-06-10T14:59:02.550Z cpu1:4172)usb 0x418023651000 .data 0x417fc5400000 .bss 0x417fc5401fa0
2013-06-10T14:59:02.550Z cpu1:4172)ehci-hcd 0x41802366f000 .data 0x417fc5800000 .bss 0x417fc5800500
2013-06-10T14:59:02.550Z cpu1:4172)hid 0x418023679000 .data 0x417fc5c00000 .bss 0x417fc5c00600
2013-06-10T14:59:02.550Z cpu1:4172)dm 0x41802367e000 .data 0x417fc6000000 .bss 0x417fc6000000
2013-06-10T14:59:02.550Z cpu1:4172)nmp 0x418023680000 .data 0x417fc6400000 .bss 0x417fc6403c20
2013-06-10T14:59:02.550Z cpu1:4172)vmw_satp_local 0x4180236a0000 .data 0x417fc6800000 .bss 0x417fc6800050
2013-06-10T14:59:02.550Z cpu1:4172)vmw_satp_default_aa 0x4180236a2000 .data 0x417fc6c00000 .bss 0x417fc6c00000
2013-06-10T14:59:02.550Z cpu1:4172)vmw_psp_lib 0x4180236a3000 .data 0x417fc7000000 .bss 0x417fc7000390
2013-06-10T14:59:02.550Z cpu1:4172)vmw_psp_fixed 0x4180236a5000 .data 0x417fc7400000 .bss 0x417fc7400000
2013-06-10T14:59:02.550Z cpu1:4172)vmw_psp_rr 0x4180236a7000 .data 0x417fc7800000 .bss 0x417fc7800090
2013-06-10T14:59:02.550Z cpu1:4172)vmw_psp_mru 0x4180236aa000 .data 0x417fc7c00000 .bss 0x417fc7c00000
2013-06-10T14:59:02.550Z cpu1:4172)libata_92 0x4180236ac000 .data 0x417fc8000000 .bss 0x417fc8003960
2013-06-10T14:59:02.550Z cpu1:4172)libata_9_2_0_0 0x4180236cc000 .data 0x417fc8400000 .bss 0x417fc8401ea0
2013-06-10T14:59:02.550Z cpu1:4172)usb-storage 0x4180236cd000 .data 0x417fc8800000 .bss 0x417fc8804940
2013-06-10T14:59:02.550Z cpu1:4172)vmkapi_v2_0_0_0_nmp_shim 0x4180236d9000 .data 0x417fc8c00000 .bss 0x417fc8c00f58
2013-06-10T14:59:02.550Z cpu1:4172)healthchk 0x4180236da000 .data 0x417fc9000000 .bss 0x417fc900fa20
2013-06-10T14:59:02.550Z cpu1:4172)teamcheck 0x4180236ec000 .data 0x417fc9400000 .bss 0x417fc940fe60
2013-06-10T14:59:02.550Z cpu1:4172)vlanmtucheck 0x4180236fb000 .data 0x417fc9800000 .bss 0x417fc980fc00
2013-06-10T14:59:02.550Z cpu1:4172)heartbeat 0x41802370c000 .data 0x417fc9c00000 .bss 0x417fc9c0fb00
2013-06-10T14:59:02.550Z cpu1:4172)shaper 0x41802371a000 .data 0x417fca000000 .bss 0x417fca011a80
2013-06-10T14:59:02.550Z cpu1:4172)lldp 0x41802372b000 .data 0x417fca400000 .bss 0x417fca400040
2013-06-10T14:59:02.551Z cpu1:4172)cdp 0x418023730000 .data 0x417fca800000 .bss 0x417fca811000
2013-06-10T14:59:02.551Z cpu1:4172)ipfix 0x418023744000 .data 0x417fcac00000 .bss 0x417fcac0fbc0
2013-06-10T14:59:02.551Z cpu1:4172)tcpip3 0x418023755000 .data 0x417fcb000000 .bss 0x417fcb008440
2013-06-10T14:59:02.551Z cpu1:4172)dvsdev 0x418023808000 .data 0x417fcb400000 .bss 0x417fcb400040
2013-06-10T14:59:02.551Z cpu1:4172)vdl2 0x41802380b000 .data 0x417fcb800000 .bss 0x417fcb800140
2013-06-10T14:59:02.551Z cpu1:4172)dvfilter 0x418023817000 .data 0x417fcbc00000 .bss 0x417fcbc00ea0
2013-06-10T14:59:02.551Z cpu1:4172)lacp 0x41802382f000 .data 0x417fcc000000 .bss 0x417fcc000120
2013-06-10T14:59:02.551Z cpu1:4172)vmkapi_v2_0_0_0_dvfilter_shim 0x418023834000 .data 0x417fcc400000 .bss 0x417fcc400b20
2013-06-10T14:59:02.551Z cpu1:4172)dvfilter-generic-fastpath 0x418023835000 .data 0x417fcc800000 .bss 0x417fcc810100
2013-06-10T14:59:02.551Z cpu1:4172)svmmirror 0x41802384c000 .data 0x417fccc00000 .bss 0x417fccc00100
2013-06-10T14:59:02.551Z cpu1:4172)cbt 0x418023856000 .data 0x417fcd000000 .bss 0x417fcd000080
2013-06-10T14:59:02.551Z cpu1:4172)migrate 0x418023858000 .data 0x417fcd400000 .bss 0x417fcd404fc0
2013-06-10T14:59:02.551Z cpu1:4172)esxfw 0x4180238ab000 .data 0x417fcd800000 .bss 0x417fcd8109a0
2013-06-10T14:59:02.551Z cpu1:4172)hbr_filter 0x4180238be000 .data 0x417fcdc00000 .bss 0x417fcdc00380
2013-06-10T14:59:02.551Z cpu1:4172)vmkstatelogger 0x4180238e2000 .data 0x417fce000000 .bss 0x417fce003620
2013-06-10T14:59:02.551Z cpu1:4172)libfc_92 0x4180238ff000 .data 0x417fce400000 .bss 0x417fce401020
2013-06-10T14:59:02.551Z cpu1:4172)libfcoe_92 0x418023919000 .data 0x417fce800000 .bss 0x417fce800320
2013-06-10T14:59:02.551Z cpu1:4172)libfc_9_2_0_0 0x41802391f000 .data 0x417fcec00000 .bss 0x417fcec00b48
2013-06-10T14:59:02.551Z cpu1:4172)libfcoe_9_2_0_0 0x418023920000 .data 0x417fcf000000 .bss 0x417fcf000288
2013-06-10T14:59:02.551Z cpu1:4172)ahci 0x418023921000 .data 0x417fcf400000 .bss 0x417fcf400be0
2013-06-10T14:59:02.551Z cpu1:4172)megaraid_sas 0x418023927000 .data 0x417fcf800000 .bss 0x417fcf8009a0
2013-06-10T14:59:02.551Z cpu1:4172)lvmdriver 0x418023939000 .data 0x417fcfc00000 .bss 0x417fcfc02f40
2013-06-10T14:59:02.551Z cpu1:4172)deltadisk 0x41802394d000 .data 0x417fd0000000 .bss 0x417fd0005340
2013-06-10T14:59:02.551Z cpu1:4172)vmkibft 0x418023971000 .data 0x417fd0400000 .bss 0x417fd0403380
2013-06-10T14:59:02.551Z cpu1:4172)vmfs3 0x418023974000 .data 0x417fd0800000 .bss 0x417fd0800f00
2013-06-10T14:59:02.551Z cpu1:4172)sunrpc 0x4180239ca000 .data 0x417fd0c00000 .bss 0x417fd0c029c0
2013-06-10T14:59:02.551Z cpu1:4172)nfsclient 0x4180239d5000 .data 0x417fd1000000 .bss 0x417fd1003440
Coredump to disk.
2013-06-10T14:59:02.601Z cpu1:4172)Slot 1 of 1.
Thanks for any help.