Overview of managing workloads with Kueue

Alauda Build of Kueue is a Kubernetes-native system that manages quotas and how jobs consume them. Alauda Build of Kueue decides when a job should wait, when a job should be admitted to start (that is, its pods can be created), and when a job should be preempted (that is, its active pods should be deleted).

Alauda Build of Kueue does not replace any existing components in a Kubernetes cluster, but instead integrates with the existing Kubernetes API server, scheduler, and cluster autoscaler components.

Alauda Build of Kueue supports all-or-nothing semantics. This means that either an entire job with all of its components is admitted to the cluster, or the entire job is rejected if it does not fit on the cluster.

Supported workload types

The following workload types can be managed by Alauda Build of Kueue:

  • Job: Standard Kubernetes batch jobs
  • RayJob: Ray-based distributed computing jobs managed by KubeRay
  • RayCluster: Ray clusters for distributed workloads managed by KubeRay
  • PyTorchJob: Distributed PyTorch training jobs managed by Kubeflow
  • InferenceService: Model serving workloads managed by KServe
  • PipelineRun: CI/CD pipeline runs managed by Alauda DevOps Pipelines (Tekton)
  • JobSet: A group of related Kubernetes Jobs managed together as a single unit
  • LeaderWorkerSet: A leader-worker pattern where a single leader coordinates multiple workers

Queue enforcement for projects

Alauda Build of Kueue uses a label-based mechanism to manage workloads. To enable Kueue management for a namespace:

  1. Create a LocalQueue in the namespace that points to a ClusterQueue.
  2. Workloads that carry the kueue.x-k8s.io/queue-name label are managed by Kueue.
  3. If a default local queue (named default) exists in the namespace, workloads without the label are automatically assigned to it.
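The steps above can be sketched with two manifests. The namespace team-a, the queue name default, and the ClusterQueue name team-a-queue are illustrative placeholders, not names from this product:

```yaml
# Hypothetical LocalQueue named "default" in namespace "team-a",
# pointing at an assumed existing ClusterQueue "team-a-queue".
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  name: default
  namespace: team-a
spec:
  clusterQueue: team-a-queue
---
# A Job opted into Kueue management via the queue-name label.
# Because the local queue here is named "default", the label
# could also be omitted and the workload would still be assigned.
apiVersion: batch/v1
kind: Job
metadata:
  name: sample-job
  namespace: team-a
  labels:
    kueue.x-k8s.io/queue-name: default
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: main
        image: busybox
        command: ["sleep", "30"]
        resources:
          requests:
            cpu: "1"
            memory: 512Mi
```

Until Kueue admits this Job against the ClusterQueue's quota, it remains suspended and no pods are created.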

When a workload is submitted to a Kueue-managed queue, it enters a Suspended or SchedulingGated state until Kueue admits it based on the available quota. Once admitted, the workload's pods are created and scheduled normally by Kubernetes.

INFO

For workloads created from the Alauda AI dashboard (such as workbenches or model servers), the queue name label is automatically applied when the user selects a local queue during creation.

Restrictions

When using Alauda Build of Kueue to manage workloads, be aware of the following restrictions:

  • Every requestable resource must be covered. Any resource that a workload might request (CPU, memory, GPU, ephemeral-storage, etc.) must be listed in the ClusterQueue's coveredResources with a nominalQuota value (even if 0). If a workload requests a resource that is not covered, it will not be admitted.
  • Workbenches (Notebooks) are not suspendable. Because notebook sessions cannot be gracefully suspended and resumed, they must be assigned to a local queue backed by a non-preemptive ClusterQueue. If a workbench is assigned to a preemptive queue, it may be terminated without the ability to recover. See Configuring preemption for how to configure non-preemptive queues.
  • Use ResourceFlavors for node placement instead of node selectors. When using Kueue, do not rely on node selectors or tolerations directly in the workload pod template for node placement. Instead, configure ResourceFlavor objects with the appropriate nodeLabels and tolerations. Kueue injects the node affinity from the matched ResourceFlavor when it admits the workload.
  • InferenceService workloads require ephemeral-storage. When using Kueue with InferenceService (KServe), ensure that ephemeral-storage is included in the ClusterQueue's coveredResources, as KServe typically requests this resource.
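The first and last restrictions can be illustrated with a ClusterQueue sketch that covers every resource its workloads might request, including ephemeral-storage for KServe. The queue name, flavor name, and quota values are assumptions for illustration:

```yaml
# Hypothetical ClusterQueue; every requestable resource appears in
# coveredResources and has a nominalQuota, so workloads requesting
# any of them can be admitted.
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: team-a-queue
spec:
  namespaceSelector: {}   # admit workloads from any namespace
  resourceGroups:
  - coveredResources: ["cpu", "memory", "nvidia.com/gpu", "ephemeral-storage"]
    flavors:
    - name: default-flavor   # assumed ResourceFlavor
      resources:
      - name: cpu
        nominalQuota: 16
      - name: memory
        nominalQuota: 64Gi
      - name: nvidia.com/gpu
        nominalQuota: 4
      - name: ephemeral-storage   # required for KServe InferenceService
        nominalQuota: 100Gi
```

If a workload requested a resource absent from coveredResources, Kueue would leave it unadmitted rather than schedule it.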

Kueue workflow

The typical workflow for using Alauda Build of Kueue involves different roles:

Cluster administrator

  1. Install the Alauda Build of Kueue cluster plugin. See Install.
  2. Configure ResourceFlavor objects to represent the different node types in the cluster.
  3. Configure ClusterQueue objects to define resource quotas and fair sharing rules.
  4. Configure LocalQueue objects in each namespace that requires Kueue-managed workloads.
  5. Set up RBAC permissions for batch administrators and users. See Setup RBAC.
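Step 2 above might look like the following ResourceFlavor sketch. The flavor name, node label, and taint key are illustrative; Kueue injects the matching node affinity and tolerations into pods when it admits a workload:

```yaml
# Hypothetical ResourceFlavor for an assumed pool of A100 GPU nodes.
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: gpu-a100
spec:
  nodeLabels:
    gpu-type: a100          # assumed label on the GPU nodes
  tolerations:
  - key: nvidia.com/gpu     # assumed taint on the GPU nodes
    operator: Exists
    effect: NoSchedule
```

A ClusterQueue then references this flavor by name in its resourceGroups, so workloads admitted under it land on the matching nodes without node selectors in the pod template.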

Batch administrator

  1. Manage ClusterQueue and LocalQueue objects for their respective teams.
  2. Monitor pending workloads and resource utilization using the Visibility API. See Monitoring pending workloads.
  3. Adjust quotas, fair sharing weights, and preemption policies as needed.

Data scientist / ML engineer

  1. Submit workloads (training jobs, inference services, etc.) to a local queue by adding the kueue.x-k8s.io/queue-name label to the workload manifest. See Running jobs with Kueue.
  2. Monitor the status of their workloads in the queue.
  3. If a default local queue is configured in the namespace, workloads are automatically assigned to it without needing to specify the label.

Key concepts

Alauda Build of Kueue introduces several key resources that work together to manage workload scheduling and resource allocation:

  • ResourceFlavor: A cluster-scoped resource that represents different resource variations in the cluster. Resource flavors use node labels, taints, and tolerations to associate workloads with specific node types.
  • ClusterQueue: A cluster-scoped resource that governs a pool of resources such as CPU, memory, pods, and GPUs. Cluster queues define usage limits, quotas for resource flavors, and fair sharing rules.
  • LocalQueue: A namespace-scoped resource that groups related workloads belonging to a single namespace. A local queue points to a cluster queue, allocating the cluster queue's resources to workloads in that namespace.
  • Workload: The unit of admission in Alauda Build of Kueue. A workload is an application that runs to completion and can be composed of one or multiple pods.
  • Cohort: A group of cluster queues that can share borrowable resources with each other, enabling better resource utilization across teams.
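As a sketch of the cohort concept, a ClusterQueue joins a cohort by naming it, and a borrowingLimit caps how much it may borrow from the cohort's unused quota. The queue name, cohort name, and values below are illustrative:

```yaml
# Hypothetical ClusterQueue in a shared cohort; a sibling queue
# would set the same "cohort: shared-pool" to pool unused quota.
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: team-a
spec:
  cohort: shared-pool
  namespaceSelector: {}
  resourceGroups:
  - coveredResources: ["cpu"]
    flavors:
    - name: default-flavor   # assumed ResourceFlavor
      resources:
      - name: cpu
        nominalQuota: 8
        borrowingLimit: 4    # may borrow up to 4 extra CPUs from the cohort
```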

Resource hierarchy

ResourceFlavor (cluster-scoped)
  Represents different node types (e.g., GPU models, CPU architectures)
        ↓
ClusterQueue (cluster-scoped)
  Defines resource quotas per flavor (CPU, memory, GPU limits)
  Optionally belongs to a Cohort for resource sharing
        ↓
LocalQueue (namespace-scoped)
  Points to a ClusterQueue, allocating its resources to a namespace
        ↓
Workloads (Job, RayJob, PyTorchJob, InferenceService, PipelineRun, ...)
  Submitted to a LocalQueue via the kueue.x-k8s.io/queue-name label