We recently upgraded a few hosts in one of our clusters from ESXi 5.0 build 702118 (Dell R710) to ESXi 5.1 U1 build 1117900 (Dell R720). During vMotion operations (both maintenance mode and manual) from a 5.0 host to a 5.1 host, some VMs hang at 65% and drop multiple pings (anywhere from 5 to 12) before the vMotion eventually completes. With that much ping loss, the applications are affected.
This only happens going from a 5.0 host to a 5.1 host, not the other way around, and it's completely random: some VMs experience it while others don't. Repeated attempts to reproduce it on a single VM give inconsistent results as well. Our cluster initially had three 5.0 hosts, and each time one was upgraded to 5.1 U1 we saw the problem, again on random VMs on different VLANs. We saw the same problem with a single 1Gb NIC vMotion vSwitch, a multi-NIC 1Gb vMotion vSwitch, and a single 10Gb NIC vMotion vSwitch.
Our setup:
vCenter 5.1.0 Build 947673
3 hosts upgraded from ESXi 5.0 build 702118 to ESXi 5.1 U1 build 1117900, with the hardware swapped from Dell R710s to Dell R720s during each upgrade. All Intel NICs.
This issue has us worried about upgrading our other clusters. On that note, other clusters where all hosts run ESXi 5.1 U1 build 1117900 have no problems.
KB2036892 doesn't apply to us, since the vMotion doesn't fail and the 5.1 build already includes the fix. Thoughts? We have not opened a support case on this yet.