KServe

Troubleshooting

The Impact of Low inotify Watches

When deploying KServe on Kubernetes, you might hit a snag that leaves you scratching your head: the deployment fails with a set of perplexing error messages like the following:

...failed to get informer from cache...
...error retrieving resource lock...
...leader election lost...
...unable to run the manager...

Among these, a particularly notable message is the system-level error "too many open files". This points to a broader issue: the number of inotify watches available on your system is too low.
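
These messages typically surface in the KServe controller logs. If you want to confirm you are hitting the same problem, you can pull the logs directly; the namespace and deployment name below assume a default KServe install and may differ in your cluster:

# Inspect the KServe controller manager logs for the errors above
kubectl logs -n kserve deployment/kserve-controller-manager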

The Culprit: Low inotify Watches

Inotify watches are a per-user kernel resource for monitoring file system events, and the ceiling is set by the fs.inotify.max_user_watches limit. When running Kubernetes clusters, and specifically with tools like KServe, many components register watches at once, so each watch becomes a valuable resource. When the limit is too low, you’re bound to hit a wall: the system can’t keep an eye on all the necessary files, and you land on the dreaded errors mentioned above.
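
To see where your system stands before changing anything, you can read the current limits directly (these sysctl keys are standard on Linux; the values will vary by distribution):

# Check the current per-user inotify limits
sysctl fs.inotify.max_user_watches
sysctl fs.inotify.max_user_instances

# The same values are exposed under /proc
cat /proc/sys/fs/inotify/max_user_watches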

This problem is known to manifest more frequently when using Kubernetes in Docker (KinD), as many users have discovered. Because KinD runs each cluster node as a container on your machine, every node and every pod draws on the host’s shared inotify limits, which are exhausted much sooner than on a cluster of separate machines. KinD is an excellent tool for local Kubernetes testing, but it does come with its own set of quirks, as documented in its known issues section.
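
You can confirm that a KinD node simply mirrors the host setting by reading the value from inside the node container (the container name below is the default for a cluster named "kind" and may differ for yours):

# The node container shares the host kernel, so this shows the host limit
docker exec kind-control-plane cat /proc/sys/fs/inotify/max_user_watches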

The Solution: Increase the Limit

Thankfully, the fix is straightforward. The KinD documentation provides a solution that involves increasing the number of allowable inotify watches. On a Linux host, you can raise the limit at runtime with sysctl (run as root, or prefix the command with sudo):

sysctl fs.inotify.max_user_watches=524288

This raises the limit well above the typical default, allowing KServe and other Kubernetes resources to deploy without running into the “too many open files” error. A value set this way lasts only until the next reboot, so for a permanent fix, add fs.inotify.max_user_watches=524288 to your sysctl configuration file (usually located at /etc/sysctl.conf or a file within /etc/sysctl.d/).
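
As a minimal sketch of the persistent setup (the file name under /etc/sysctl.d/ is arbitrary, and raising fs.inotify.max_user_instances alongside the watch limit is a common companion tweak suggested on the KinD known issues page):

# /etc/sysctl.d/99-inotify.conf
fs.inotify.max_user_watches = 524288
fs.inotify.max_user_instances = 512

# Reload all sysctl configuration files so the change takes effect immediately
sudo sysctl --system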

The deployment of KServe on Kubernetes can hit a roadblock if the system runs out of inotify watches. This is a known issue, especially on KinD clusters, but the solution is as simple as increasing the inotify watch limit. By doing so, you not only smooth out the deployment of KServe but also enhance the overall robustness of your local Kubernetes environment. Keep this tip in mind, and your path to deploying KServe should be clear and error-free.