NodeFileDescriptorLimit
Meaning
This alert fires when a node's kernel is running out of available file
descriptors. It fires at warning level when usage exceeds 70% and at critical
level when usage exceeds 90%.
Impact
Applications on the node may no longer be able to open and operate on files. This is likely to have severe consequences for anything scheduled on this node.
Diagnosis
You can open a shell on the node and use the standard Linux utilities to diagnose the issue:
$ NODE_NAME='<value of instance label from alert>'
$ oc debug "node/$NODE_NAME"
# sysctl -a | grep 'fs.file-'
fs.file-max = 1597016
fs.file-nr = 7104 0 1597016
# lsof -n
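The three fields of `fs.file-nr` are the number of allocated file handles, the number of allocated but unused handles, and the system-wide maximum (the same value as `fs.file-max`). As a rough sketch, and assuming the alert compares allocated handles against the maximum, the usage percentage can be computed as follows. The helper name `fd_usage_percent` is hypothetical and is not part of any OpenShift tooling.

```shell
# Hypothetical helper: percentage of the file-handle limit currently in use.
# Arguments: allocated handles and maximum handles (fields 1 and 3 of fs.file-nr).
fd_usage_percent() {
  awk -v used="$1" -v max="$2" 'BEGIN { printf "%.2f\n", used * 100 / max }'
}

# On the node, read the live values straight from procfs when it is available.
if [ -r /proc/sys/fs/file-nr ]; then
  read -r allocated unused maximum < /proc/sys/fs/file-nr
  echo "file descriptor usage: $(fd_usage_percent "$allocated" "$maximum")%"
fi
```

With the sample values shown above (7104 allocated out of 1597016), the helper prints `0.44`, which is far below the 70% warning threshold.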
Mitigation
Reduce the number of files opened simultaneously, either by adjusting application configuration or by moving some applications to other nodes.
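To decide which applications to reconfigure or move, it can help to rank processes by the number of descriptors they hold. The following is a minimal sketch, assuming a procfs mounted at `/proc` and a root shell on the node. The function name `top_fd_holders` and its arguments are hypothetical.

```shell
# Hypothetical helper: list the processes holding the most open file descriptors.
# Arguments: procfs root (default /proc) and number of rows to print (default 10).
# Output columns: descriptor count, PID, command name.
top_fd_holders() {
  proc_root="${1:-/proc}"
  limit="${2:-10}"
  for dir in "$proc_root"/[0-9]*; do
    # Skip entries that are not process directories or that vanished meanwhile.
    [ -d "$dir/fd" ] || continue
    count=$(ls -A "$dir/fd" 2>/dev/null | wc -l | tr -d ' ')
    name=$(cat "$dir/comm" 2>/dev/null)
    printf '%s %s %s\n' "$count" "${dir##*/}" "${name:-unknown}"
  done | sort -rn | head -n "$limit"
}

top_fd_holders /proc 10
```

Processes near the top of this list are the first candidates for a configuration review or for rescheduling onto another node.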