Configuring preemption
Preemption is the process of evicting admitted workloads so that pending workloads with higher priority, or with a stronger claim under fair sharing, can be admitted. Alauda Build of Kueue supports several preemption strategies that let administrators control how workloads compete for resources.
Preemption policies
Alauda Build of Kueue supports the following global preemption policies, which can be configured when deploying the Alauda Build of Kueue cluster plugin:
ClusterQueue preemption configuration
You can configure preemption behavior at the ClusterQueue level using the spec.preemption field. This allows fine-grained control over when and how workloads in a specific ClusterQueue can be preempted.
Example ClusterQueue with preemption configuration
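The manifest below is a minimal sketch of a ClusterQueue that sets all three preemption fields explained after it. The API version is the one used by upstream Kueue and is assumed to apply here; the queue name, cohort name, ResourceFlavor name, and quota values are hypothetical placeholders, so check them against your own cluster before applying.

```yaml
# Sketch only: names, quotas, and the API version are assumptions.
apiVersion: kueue.x-k8s.io/v1beta1   # upstream Kueue API group; verify against the installed CRD
kind: ClusterQueue
metadata:
  name: team-a-cq                    # hypothetical queue name
spec:
  cohort: team-ab                    # hypothetical cohort shared with other ClusterQueues
  namespaceSelector: {}              # empty selector matches all namespaces
  resourceGroups:
  - coveredResources: ["cpu", "memory"]
    flavors:
    - name: default-flavor           # hypothetical ResourceFlavor; it must already exist
      resources:
      - name: cpu
        nominalQuota: 8
      - name: memory
        nominalQuota: 32Gi
  preemption:
    reclaimWithinCohort: Any         # reclaim borrowed quota from any workload in the cohort
    borrowWithinCohort:
      policy: LowerPriority          # preempt only lower-priority workloads when borrowing
    withinClusterQueue: LowerPriority  # higher-priority workloads preempt lower-priority ones
```

In upstream Kueue, borrowWithinCohort can only be enabled when reclaimWithinCohort is not Never, which is why this sketch sets reclaimWithinCohort to Any.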
preemption: Configures the preemption behavior for this ClusterQueue.
reclaimWithinCohort: Controls whether workloads in this ClusterQueue can preempt workloads in other ClusterQueues of the same cohort to reclaim resources that were borrowed. Possible values: Never, Any, LowerPriority.
borrowWithinCohort: Controls whether workloads in this ClusterQueue can preempt workloads in other ClusterQueues of the same cohort to borrow resources.
borrowWithinCohort.policy: LowerPriority means only lower-priority workloads in other queues can be preempted. Never disables preemption for borrowing.
withinClusterQueue: Controls whether workloads within this ClusterQueue can preempt each other. LowerPriority means higher-priority workloads can preempt lower-priority ones. Never disables intra-queue preemption.
Preemption field reference
withinClusterQueue
Controls whether a pending workload can preempt other workloads in the same ClusterQueue.
reclaimWithinCohort
Controls whether a pending workload can preempt workloads in other ClusterQueues of the same cohort to reclaim nominal quota that is being borrowed.
borrowWithinCohort
Controls whether a pending workload can preempt workloads in other ClusterQueues of the same cohort to borrow unused resources.
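To show how the three fields act independently, the fragment below sketches a queue that takes back its own nominal quota from lower-priority borrowers but never preempts in order to borrow or within its own queue. It is an illustrative combination, not a recommended default, and it uses only values named in this document.

```yaml
# Fragment of a ClusterQueue spec; resourceGroups, cohort, and
# namespaceSelector are omitted for brevity.
spec:
  preemption:
    reclaimWithinCohort: LowerPriority  # evict lower-priority borrowers to recover nominal quota
    borrowWithinCohort:
      policy: Never                     # never preempt other queues in order to borrow
    withinClusterQueue: Never           # workloads in this queue never preempt each other
```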
Non-preemptive queues
Some workload types, such as interactive notebook sessions, are not suspendable. These workloads should only be assigned to a local queue backed by a non-preemptive ClusterQueue.
A non-preemptive ClusterQueue keeps the default preemption settings (all set to Never):
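As a sketch, the preemption block of such a queue could be written out explicitly as follows. Leaving the preemption field out entirely should have the same effect, since these are described as the defaults; the queue name is a hypothetical placeholder.

```yaml
# Fragment of a non-preemptive ClusterQueue; resourceGroups, cohort, and
# namespaceSelector are omitted for brevity.
metadata:
  name: notebooks-cq            # hypothetical queue for non-suspendable workloads
spec:
  preemption:
    reclaimWithinCohort: Never
    borrowWithinCohort:
      policy: Never
    withinClusterQueue: Never
```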
Note: If you assign a non-suspendable workload (such as a Notebook) to a preemptive queue, the workload might be preempted and fail because it cannot be gracefully suspended and resumed.