Hi all,
I have a HP ProLiant DL360e Gen8 Server with 96 GB of RAM installed.
There are four 1 Gbit NICs (using the igb driver):
~ # esxcfg-nics -l
Name PCI Driver Link Speed Duplex MAC Address MTU Description
vmnic0 0000:02:00.00 igb Up 1000Mbps Full 38:63:bb:2c:a5:b8 1500 Intel Corporation I350 Gigabit Network Connection
vmnic1 0000:02:00.01 igb Up 1000Mbps Full 38:63:bb:2c:a5:b9 9000 Intel Corporation I350 Gigabit Network Connection
vmnic2 0000:02:00.02 igb Down 0Mbps Half 38:63:bb:2c:a5:ba 1500 Intel Corporation I350 Gigabit Network Connection
vmnic3 0000:02:00.03 igb Down 0Mbps Half 38:63:bb:2c:a5:bb 1500 Intel Corporation I350 Gigabit Network Connection
~ #
I use ESXi 5.5 U2:
~ # esxcli system version get
Product: VMware ESXi
Version: 5.5.0
Build: Releasebuild-2718055
Update: 2
~ #
There are 2 local vSwitches:
~ # esxcfg-vswitch -l
Switch Name Num Ports Used Ports Configured Ports MTU Uplinks
vSwitch0 2432 4 128 1500 vmnic0
PortGroup Name VLAN ID Used Ports Uplinks
VM Network 0 0 vmnic0
mgmt 0 1 vmnic0
Switch Name Num Ports Used Ports Configured Ports MTU Uplinks
vSwitch1 2432 4 128 9000 vmnic1
PortGroup Name VLAN ID Used Ports Uplinks
san 950 1 vmnic1
~ #
~ # esxcfg-vmknic -l
Interface Port Group/DVPort/Opaque Network IP Family IP Address Netmask Broadcast MAC Address MTU TSO MSS Enabled Type
vmk0 mgmt IPv4 192.168.4.232 255.255.255.0 192.168.4.255 38:63:bb:2c:a5:bb 1500 65535 true STATIC
vmk1 san IPv4 172.25.50.232 255.255.255.0 172.25.50.255 00:50:56:67:28:d2 9000 65535 true STATIC
~ #
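As an aside: for jumbo frames to work, the MTU has to match at every layer the vmkernel path crosses, i.e. the vmkernel interface, the vSwitch, and the physical NIC (plus the physical switch, which ESXi cannot see). A quick sketch checking the values pasted above; this only covers the ESXi side:

```python
# MTU consistency check for the jumbo-frame path (values copied from
# the esxcfg output above; the physical switch is NOT covered here).
mtus = {
    "vmk1":     9000,  # esxcfg-vmknic -l
    "vSwitch1": 9000,  # esxcfg-vswitch -l
    "vmnic1":   9000,  # esxcfg-nics -l
}

mismatched = {k: v for k, v in mtus.items() if v != 9000}
print(mismatched)  # {} -> the ESXi side is consistent at MTU 9000
```

So the ESXi-side MTU configuration looks consistent, which is why the problem below is so confusing.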
The VMkernel interface vmk1 is configured to connect to an NFS datastore (using the dedicated NIC vmnic1 and the separate VLAN 950), but there were timeout problems when accessing the NFS datastore.
What I've figured out is that I cannot ping the NFS server (and it cannot ping me) with a payload larger than 504 bytes:
~ # vmkping -I vmk1 -d 172.25.50.233
PING 172.25.50.233 (172.25.50.233): 56 data bytes
64 bytes from 172.25.50.233: icmp_seq=0 ttl=64 time=0.296 ms
64 bytes from 172.25.50.233: icmp_seq=1 ttl=64 time=0.235 ms
64 bytes from 172.25.50.233: icmp_seq=2 ttl=64 time=0.236 ms
--- 172.25.50.233 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.235/0.256/0.296 ms
~ # vmkping -I vmk1 -s 504 -d 172.25.50.233
PING 172.25.50.233 (172.25.50.233): 504 data bytes
512 bytes from 172.25.50.233: icmp_seq=0 ttl=64 time=0.338 ms
512 bytes from 172.25.50.233: icmp_seq=1 ttl=64 time=0.268 ms
512 bytes from 172.25.50.233: icmp_seq=2 ttl=64 time=0.234 ms
--- 172.25.50.233 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.234/0.280/0.338 ms
~ # vmkping -I vmk1 -s 505 -d 172.25.50.233
PING 172.25.50.233 (172.25.50.233): 505 data bytes
--- 172.25.50.233 ping statistics ---
3 packets transmitted, 0 packets received, 100% packet loss
~ #
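For reference, the vmkping -s value is only the ICMP payload; the frame on the wire also carries an 8-byte ICMP header, a 20-byte IPv4 header (no options), and a 14-byte Ethernet header. A small sketch of that arithmetic (sizes assume untagged frames; the VLAN 950 tag would add 4 bytes):

```python
# Frame-size arithmetic for vmkping payloads (a sketch; assumes IPv4
# with no options and an untagged Ethernet header).
ICMP_HDR = 8    # ICMP echo header
IP_HDR = 20     # IPv4 header without options
ETH_HDR = 14    # Ethernet header (a VLAN tag would add 4 more bytes)

def frame_size(payload):
    """On-the-wire Ethernet frame size for a 'vmkping -s <payload>'."""
    return payload + ICMP_HDR + IP_HDR + ETH_HDR

print(frame_size(56))   # default ping: 98-byte frame
print(frame_size(504))  # largest payload that gets through: 546
print(frame_size(505))  # first failing payload: 547
```

So the 504/505 payload boundary corresponds to 546/547-byte frames, which matches the 547-byte frame pktcap-uw reports below at the vmk1 layer, and is nowhere near either the 1500 or the 9000 MTU.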
What's more, I've run the pktcap-uw tool to look at the packets on the network interface. For:
# vmkping -I vmk1 -c 1 -d 172.25.50.233
a packet is visible:
12:17:14.50741[6] Captured at EtherswitchDispath point, TSO not enabled, Checksum not offloaded and not verified, VLAN tag 950, length 98.
Segment[0] ---- 98 bytes:
0x0000: 0050 5667 28d2 0cc4 7a18 3bd4 0800 4500
0x0010: 0054 1e4e 4000 4001 5e57 ac19 32e9 ac19
0x0020: 32e8 0000 e6af 4699 0000 555d ccca 0000
0x0030: c58b 0809 0a0b 0c0d 0e0f 1011 1213 1415
0x0040: 1617 1819 1a1b 1c1d 1e1f 2021 2223 2425
0x0050: 2627 2829 2a2b 2c2d 2e2f 3031 3233 3435
0x0060: 3637
but for:
# vmkping -I vmk1 -c 1 -s 505 -d 172.25.50.233
nothing is visible on the physical interface. The packet is, however, visible at the vmk1 layer:
~ # pktcap-uw --vmk vmk1
The name of the vmk is vmk1
No server port specifed, select 39635 as the port
Output the packet info to console.
Local CID 2
Listen on port 39635
Accept...Vsock connection from port 1028 cid 2
12:19:52.182469[1] Captured at PortInput point, TSO not enabled, Checksum not offloaded and not verified, length 547.
Segment[0] ---- 547 bytes:
0x0000: 0cc4 7a18 3bd4 0050 5667 28d2 0800 4500
0x0010: 0215 1ac7 4000 4001 601d ac19 32e8 ac19
0x0020: 32e9 0800 b1a5 de9a 0000 555d cd68 0002
0x0030: c87a 0809 0a0b 0c0d 0e0f 1011 1213 1415
0x0040: 1617 1819 1a1b 1c1d 1e1f 2021 2223 2425
0x0050: 2627 2829 2a2b 2c2d 2e2f 3031 3233 3435
0x0060: 3637 3839 3a3b 3c3d 3e3f 4041 4243 4445
0x0070: 4647 4849 4a4b 4c4d 4e4f 5051 5253 5455
0x0080: 5657 5859 5a5b 5c5d 5e5f 6061 6263 6465
0x0090: 6667 6869 6a6b 6c6d 6e6f 7071 7273 7475
0x00a0: 7677 7879 7a7b 7c7d 7e7f 8081 8283 8485
0x00b0: 8687 8889 8a8b 8c8d 8e8f 9091 9293 9495
0x00c0: 9697 9899 9a9b 9c9d 9e9f a0a1 a2a3 a4a5
0x00d0: a6a7 a8a9 aaab acad aeaf b0b1 b2b3 b4b5
0x00e0: b6b7 b8b9 babb bcbd bebf c0c1 c2c3 c4c5
0x00f0: c6c7 c8c9 cacb cccd cecf d0d1 d2d3 d4d5
0x0100: d6d7 d8d9 dadb dcdd dedf e0e1 e2e3 e4e5
0x0110: e6e7 e8e9 eaeb eced eeef f0f1 f2f3 f4f5
0x0120: f6f7 f8f9 fafb fcfd feff 0001 0203 0405
0x0130: 0607 0809 0a0b 0c0d 0e0f 1011 1213 1415
0x0140: 1617 1819 1a1b 1c1d 1e1f 2021 2223 2425
0x0150: 2627 2829 2a2b 2c2d 2e2f 3031 3233 3435
0x0160: 3637 3839 3a3b 3c3d 3e3f 4041 4243 4445
0x0170: 4647 4849 4a4b 4c4d 4e4f 5051 5253 5455
0x0180: 5657 5859 5a5b 5c5d 5e5f 6061 6263 6465
0x0190: 6667 6869 6a6b 6c6d 6e6f 7071 7273 7475
0x01a0: 7677 7879 7a7b 7c7d 7e7f 8081 8283 8485
0x01b0: 8687 8889 8a8b 8c8d 8e8f 9091 9293 9495
0x01c0: 9697 9899 9a9b 9c9d 9e9f a0a1 a2a3 a4a5
0x01d0: a6a7 a8a9 aaab acad aeaf b0b1 b2b3 b4b5
0x01e0: b6b7 b8b9 babb bcbd bebf c0c1 c2c3 c4c5
0x01f0: c6c7 c8c9 cacb cccd cecf d0d1 d2d3 d4d5
0x0200: d6d7 d8d9 dadb dcdd dedf e0e1 e2e3 e4e5
0x0210: e6e7 e8e9 eaeb eced eeef f0f1 f2f3 f4f5
0x0220: f6f7 f8
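As a sanity check, the IPv4 header in the dump above can be decoded by hand; a small sketch, using the first header bytes copied verbatim from the capture:

```python
# Decode the Ethernet/IPv4 header fields from the first bytes of the
# pktcap-uw hex dump above (a sketch for sanity-checking the capture).
hex_dump = "0cc4 7a18 3bd4 0050 5667 28d2 0800 4500 0215"
data = bytes.fromhex(hex_dump.replace(" ", ""))

ethertype = int.from_bytes(data[12:14], "big")     # after dst+src MAC
ip_total_len = int.from_bytes(data[16:18], "big")  # IPv4 Total Length

print(f"{ethertype:#06x}")  # 0x0800 -> IPv4
print(ip_total_len)         # 533 = 20 (IPv4) + 8 (ICMP) + 505 (payload)
print(ip_total_len + 14)    # 547 with the Ethernet header, as reported
```

So the full 547-byte frame for the 505-byte ping reaches vmk1 intact; it just never shows up on the uplink.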
As I understand it, packets are being lost somewhere between the vmk1 and vmnic1 interfaces?
I've tried switching to the other physical interfaces, but they all behave the same.
Is my hardware broken?
Is the igb driver broken?
Cheers
Marek