2024-02-23

Kubernetes leases

If you’re developing or managing a Kubernetes operator, you might encounter an error similar to this:

E0223 11:11:46.086693       1 leaderelection.go:330] error retrieving resource lock system/b7e9931f.trustyai.opendatahub.io: leases.coordination.k8s.io "b7e9931f.trustyai.opendatahub.io" is forbidden: User "system:serviceaccount:system:trustyai-service-operator-controller-manager" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "system".

This error indicates that the operator’s service account lacks the permissions needed to access the leases resource in the coordination.k8s.io API group, which the operator requires for tasks like leader election. Here’s how to fix the issue by granting the appropriate permissions.

Adding Permissions in the Controller

To address this, you can use Kubebuilder annotations to ensure your controller has the necessary RBAC permissions. Add the following line to your controller’s code:

//+kubebuilder:rbac:groups=coordination.k8s.io,resources=leases,verbs=get;list;watch;create;update;patch;delete

This annotation tells Kubebuilder to generate the required RBAC permissions for managing leases. After adding it, regenerate your RBAC configuration and apply it to your cluster.
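For example, assuming a standard Kubebuilder project layout with the scaffolded Makefile targets, the regenerate-and-redeploy steps look roughly like this (the image name is a placeholder):

# Regenerate the RBAC manifests from the kubebuilder markers (updates config/rbac/role.yaml)
make manifests

# Rebuild and redeploy the operator with the updated permissions
make deploy IMG=<your-operator-image>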

Alternatively, Manually Update Roles

If you’re not using Kubebuilder or prefer a manual approach, you can directly edit your RBAC roles. Create or update a ClusterRole with the necessary permissions:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: lease-access
rules:
- apiGroups: ["coordination.k8s.io"]
  resources: ["leases"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

Then, bind this ClusterRole to your service account using a ClusterRoleBinding.
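For example, a binding could look like the following; the subject’s name and namespace are taken from the error message at the top of this post, so adjust them to match your own operator’s service account:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  # The binding name is arbitrary; it grants the ClusterRole above
  # to the operator's service account.
  name: lease-access-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: lease-access
subjects:
- kind: ServiceAccount
  name: trustyai-service-operator-controller-manager
  namespace: system

Since leases are namespaced, a Role and RoleBinding scoped to the operator’s namespace would also work if you prefer to avoid cluster-wide permissions.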

Understanding Leases and Coordination

The leases resource in the coordination.k8s.io API group is important for implementing leader election in Kubernetes. Leader election ensures that only one instance of your operator is active at a time, preventing conflicting reconciliations. You can read more about leases in the Kubernetes documentation.
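Once the permissions are in place, you can inspect the lease that backs leader election; the name and namespace below come from the error message above:

# Show the lease object, including spec.holderIdentity (the current leader)
kubectl get lease b7e9931f.trustyai.opendatahub.io -n system -o yaml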

KServe Deployment: Increase inotify Watches

When deploying KServe on Kubernetes, you can run into a “too many open files” error, especially when using KinD. The error is caused by a low limit on inotify watches, which are needed to track file changes.

KinD users often face this issue; it is detailed on the project’s known issues page. The solution is to increase the inotify watches limit:

sudo sysctl fs.inotify.max_user_watches=524288

This command raises the watch limit for the current boot, allowing the KServe deployment to proceed. For a permanent fix, add the setting to /etc/sysctl.conf or a file under /etc/sysctl.d/.
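A sketch of the permanent approach, assuming a distribution that reads /etc/sysctl.d/ (the file name here is just an illustrative choice):

# Persist the inotify watch limit across reboots
echo "fs.inotify.max_user_watches=524288" | sudo tee /etc/sysctl.d/99-inotify.conf

# Apply settings from all sysctl configuration files without rebooting
sudo sysctl --system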

By adjusting this setting, you should avoid common file monitoring errors and enjoy a seamless KServe experience on KinD.