Hello fellow VMers
An issue has been bothering me for weeks, and so far not even the hosting company responsible for the hardware under the vSphere system has been able to help.
The issue
Between two directly connected ESXi hosts, TCP connections (especially those targeting MS SQL) fail very often with so-called semaphore timeouts. Simple file transfers sometimes fail as well.
The setup
Both hosts are Supermicro SuperServers, each with dual deca-core Broadwell Xeons and 256 GB RAM. The hosts are directly connected via NICs that are used strictly for this interconnect (these NICs have no uplink or any other connections, just a 1 Gbit/s link from A to B). Both of them are Intel I210 interfaces. There is no vMotion or Fault Tolerance traffic that could interfere with the regular TCP/IP traffic. Nothing fancy routing-wise (just simple subnet routing).
Both hosts are running ESXi 6.7.
Checked / Adjusted
I think I can rule out an issue with the NIC(s): when I use the same NICs for a VPN connection from the datacenter to my office, the issue does not occur (or at least not noticeably). Load and traffic volume have no influence either; the problem occurs even with just the SQL DB on A and a single client on B accessing it. I have tried all the suggestions I could find on the MS SQL (software/OS) side, and I'm now fairly sure that this part of the story is not the root cause.
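To separate the network path from SQL Server itself, a small probe along the following lines could be run from a VM on B against the SQL VM on A. This is only a sketch under assumptions: the function names are made up for illustration, and the target address and port have to be supplied by whoever runs it (1433 is the SQL Server default port, but I have not confirmed it for this setup). It only tests TCP connection setup, so it would not catch stalls on a connection that is already established.

```python
import socket
import time


def probe(host, port, timeout=5.0):
    """Attempt one TCP connect; return (ok, elapsed_seconds, error_text)."""
    start = time.monotonic()
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:
        # Covers refused connections, timeouts and unreachable hosts
        return False, time.monotonic() - start, str(exc)


def run_probes(host, port, count=10, interval=1.0, timeout=5.0):
    """Run `count` probes, sleeping `interval` seconds after each one."""
    results = []
    for _ in range(count):
        results.append(probe(host, port, timeout))
        time.sleep(interval)
    return results


def failure_rate(results):
    """Fraction of failed probes in a list of probe() results."""
    if not results:
        return 0.0
    failed = sum(1 for ok, _, _ in results if not ok)
    return failed / len(results)
```

For example, `run_probes("10.0.0.2", 1433, count=60)` (with a placeholder address) would try one connect per second for a minute, and `failure_rate` on the result gives the share of failed attempts. Running the same probe once over the direct link and once over the VPN path would show whether the failures really are specific to the direct connection.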
I went through thousands of lines of verbose ESXi log files, and nothing really jumped out at me. I have to say, though, that in terms of network tuning and monitoring, ESXi doesn't offer much out of the box; I guess that is available as some sort of "add-on" or "upgrade".
I have to say: I'm not very experienced in either networking or virtualization concepts. I'm a developer, and what I know (or rather, what I think I know) I learned because of the requirements of the project I'm working on. I managed to set up stable site-to-site VPN connections between my office and the datacenter, and they handle the same traffic absolutely flawlessly, but the direct cable connection between two hosts, which barely requires any clicks and inputs, is winning the fight against me. This is really driving me nuts ^^
Maybe one of you experienced guys can point me in the right direction.