1. Confirmed that the CPU utilization was good at the Host and VM levels.
2. Confirmed that there was no memory swapping.
3. Confirmed that the VMs were using the VMXNET3 NIC rather than the E1000.
4. Confirmed that updates outlined in VMware KB2052917 have been applied to all hosts.
All of the above checked out clean in the environment.
Then I ran RESXTOP (remote ESXTOP, which gives you real-time information about the host). Sure enough, every few cycles, %DRPRX (dropped received packets) would jump from 0.00 to double-digit values, then drop back to 0.00. I checked another ESXi host, and it exhibited the same behavior.
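If you would rather not watch the screen waiting for a spike, the same check can be sketched in Python against a saved capture. This is only a rough sketch under some assumptions: that the samples were written to a CSV file using batch mode (the `-b` flag), and that the receive-drop columns contain the text "% Received Packets Dropped" in their header. Check the header of your own capture and adjust `column_marker` if it differs. The function name `find_drop_spikes` and the 1.0 threshold are made up for illustration.

```python
import csv
import io


def find_drop_spikes(csv_text, threshold=1.0,
                     column_marker="% Received Packets Dropped"):
    """Return {column_name: [(sample_index, value), ...]} for every
    sample whose receive-drop percentage is at or above threshold.

    column_marker is an assumption about how the batch-mode CSV labels
    the receive-drop counter; verify it against your own header row.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return {}  # empty capture, nothing to report

    # Indices of the columns that look like receive-drop counters.
    drop_cols = [i for i, name in enumerate(header) if column_marker in name]

    spikes = {}
    for sample_index, row in enumerate(reader):
        for i in drop_cols:
            try:
                value = float(row[i])
            except (ValueError, IndexError):
                continue  # skip blank or malformed cells
            if value >= threshold:
                spikes.setdefault(header[i], []).append((sample_index, value))
    return spikes
```

To use it, read the capture file into a string and pass it in, for example `find_drop_spikes(open("capture.csv").read())` (the file name is hypothetical). Ports that never cross the threshold are left out of the result, so an empty dictionary means no spikes were seen.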
I then disabled the Virtual Distributed Switch (vDS) “Health Check” function (this must be done through the Web Client), and that stopped the spikes in dropped packets.
It turns out that the Health Check sends periodic broadcasts as part of its Teaming, VLAN and MTU checks.
The “Health Check” feature is disabled by default. However, during the vDS setup process, I saw the Health Check option and figured, “What can it hurt?” Well, now we know.
Moving forward, I'll enable this feature only temporarily after configuration changes (e.g. VLAN changes or ESXi host replacements) to confirm that Teaming, VLAN and MTU are set up correctly, then disable it again immediately.
We've been asking VMware support why some VMs on a DVS show high %DRPRX values in ESXTOP, and they weren't much help. Turning off the health check did the trick, thanks!