
Network connectivity lost & Network uplink redundancy lost & Reset Rules


Good morning to all of you,

 

We're finishing the configuration of our new production VMware 6.7-based cluster; however, we're experiencing a strange issue with alarms and their reset rules. We don't get email notifications when the 'Network connectivity lost' alarm is normalized and turns Green after we bring an ESXi host back online. The same applies when we shut down one port in a particular port-channel group (so that network uplink redundancy is lost). We do get email notifications informing us that network connectivity or uplink redundancy is lost, but our reset rules don't seem to work.

 

1. We checked whether any emails were being blocked on the mail server; according to the people responsible for that service, that's not the case.

2. I found this article and thought maybe there's an issue with the notification service in general.

3. Alarm rules work just fine - we get email notifications (as written above).

4. We didn't change anything in the 'Network connectivity lost' reset rules except enabling the SNMP traps and 'Send email notifications' options and setting the proper email address.

 

Reset Rule 1.jpg

redundancy.jpg

 

When we go to a particular host and look at the Events we have the following set of events:

events.jpg

So there's no email notification after Network uplink redundancy lost changes from Red to Green.

 

We're wondering if this might have to do with the notification service itself.

 

[UPDATE, July 7th]

Today I carried out another test, and something really weird happened.

 

1. I shut down one port belonging to the port-channel to trigger the 'redundancy lost' alarm, and I got the notification - great!

2. I restored the previous status of that switch port so that redundancy was recovered. The alarm cleared in vCenter; however, there was no email notification that the status had changed from Red to Green.

3. I decided to display hostd.log in 'live' mode to see what was (and is) happening.

4. At that point both vmnic0 and vmnic4 were online (port-channel was fully operational).

5. Then I noticed in hostd.log that ESXi was generating the following events BY TURNS: esx.problem.net.redundancy.lost and esx.clear.net.redundancy.restored. It looked like an infinite loop: those log entries were generated over and over again until I restarted the management agents. Along the way we got two additional email notifications saying that redundancy was lost, even though the port-channel was already up and running.
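
To put a number on the alternation described in step 5, something like the following Python sketch could be run against lines saved from hostd.log. It only looks for the two event IDs quoted above as plain substrings; the function names and the sample lines are made up for illustration, and the rest of the real hostd.log line format is not assumed.

```python
from typing import Iterable, List

# Event IDs exactly as they appeared in hostd.log (see step 5).
LOST = "esx.problem.net.redundancy.lost"
RESTORED = "esx.clear.net.redundancy.restored"


def extract_events(lines: Iterable[str]) -> List[str]:
    """Return the sequence of 'lost'/'restored' events found in the lines."""
    events = []
    for line in lines:
        if LOST in line:
            events.append("lost")
        elif RESTORED in line:
            events.append("restored")
    return events


def count_flaps(events: List[str]) -> int:
    """Count how many times consecutive events switch state."""
    return sum(1 for prev, cur in zip(events, events[1:]) if prev != cur)


# Hypothetical, simplified lines; real hostd.log entries carry more fields.
sample = [
    "event esx.problem.net.redundancy.lost on vmnic0",
    "event esx.clear.net.redundancy.restored on vmnic0",
    "some unrelated log line",
    "event esx.problem.net.redundancy.lost on vmnic0",
    "event esx.clear.net.redundancy.restored on vmnic0",
]

events = extract_events(sample)
print(events, count_flaps(events))
```

A high flap count while both uplinks are physically up would back up the impression that the host keeps re-raising and clearing the event on its own rather than reacting to real link changes.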

 

I'm pretty sure it should not work like that.

 

Message was edited by: Marcin Siemion

