NodeNetworkTransmitErrs¶
Meaning¶
Network interface is reporting many transmit errors.
Impact¶
Applications on the node may no longer be able to operate with other services. Network attached storage performance issues or even data loss.
Diagnosis¶
Investigate networking issues on the node and to connected hardware. Check network interface saturation. Check CPU usage saturation. Check physical cables, check networking firewall rules and so on.
Mitigation¶
In general mitigation landscape is quite vast, some suggestions:
- Ensure some node capacity is left unallocated (cpu/memory) for handling networking.
- Increase TX queue length
- Spread services to other nodes/pods.
- Replace physical cables, change ports.
- Look into introducting Quality of Service or other TCP congestion avoidance algorithms