Configuring preemption

Preemption is the process of evicting admitted workloads so that other pending workloads can be admitted, either because they have higher priority or because their ClusterQueue is using less than its fair share of resources. Alauda Build of Kueue supports several preemption strategies that let administrators control how workloads compete for resources.

Preemption policies

Alauda Build of Kueue supports the following global preemption policies, which can be configured when deploying the Alauda Build of Kueue cluster plugin:

Policy	Description
`Classical`	Standard preemption behavior based on priority and ClusterQueue-level preemption policies. This is the default.
`FairSharing`	Enables the fair sharing preemption strategy, in which workloads are preempted based on the relative share of resources consumed by each ClusterQueue in a cohort.
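
For orientation, upstream Kueue enables fair sharing through its controller Configuration object. The fragment below is a sketch of that upstream setting, assuming the config.kueue.x-k8s.io/v1beta1 schema. The Alauda Build of Kueue cluster plugin may expose the same choice through its own deployment form, so treat the field names as assumptions to check against your installed version.

```yaml
# Sketch of the upstream Kueue controller configuration.
# Assumption: config.kueue.x-k8s.io/v1beta1 schema; the cluster plugin
# may generate or manage this object for you.
apiVersion: config.kueue.x-k8s.io/v1beta1
kind: Configuration
fairSharing:
  enable: true
  # Strategies are tried in order; the next one is used only if the
  # pending workload still does not fit after the previous one.
  preemptionStrategies:
  - LessThanOrEqualToFinalShare
  - LessThanInitialShare
```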

ClusterQueue preemption configuration

You can configure preemption behavior at the ClusterQueue level using the spec.preemption field. This allows fine-grained control over when pending workloads in a specific ClusterQueue can preempt other workloads, either in the same ClusterQueue or in other ClusterQueues of its cohort.

Example ClusterQueue with preemption configuration

apiVersion: kueue.x-k8s.io/v1beta2
kind: ClusterQueue
metadata:
  name: team-a-queue
spec:
  namespaceSelector: {}
  resourceGroups:
  - coveredResources: ["cpu", "memory", "nvidia.com/gpu"]
    flavors:
    - name: "default-flavor"
      resources:
      - name: "cpu"
        nominalQuota: 16
      - name: "memory"
        nominalQuota: 64Gi
      - name: "nvidia.com/gpu"
        nominalQuota: 4
  preemption:
    reclaimWithinCohort: Any
    borrowWithinCohort:
      policy: LowerPriority
    withinClusterQueue: LowerPriority

preemption: Configures the preemption behavior for this ClusterQueue.
reclaimWithinCohort: Controls whether pending workloads in this ClusterQueue can preempt workloads in other ClusterQueues of the same cohort to reclaim nominal quota that those ClusterQueues are borrowing. Possible values: Never, Any, LowerPriority.
borrowWithinCohort: Controls whether pending workloads in this ClusterQueue can preempt workloads in other ClusterQueues of the same cohort when admission requires borrowing quota.
borrowWithinCohort.policy: LowerPriority means only lower-priority workloads in other ClusterQueues can be preempted. Never disables preemption for borrowing.
withinClusterQueue: Controls whether workloads within this ClusterQueue can preempt each other. LowerPriority means higher-priority workloads can preempt lower-priority ones. Never disables intra-queue preemption.

The reclaimWithinCohort and borrowWithinCohort settings take effect only when the ClusterQueue belongs to a cohort.

Preemption field reference

`withinClusterQueue`

Controls whether a pending workload can preempt other workloads in the same ClusterQueue.

Value	Description
`Never`	(Default) No preemption within the same ClusterQueue.
`LowerPriority`	A pending workload can preempt active workloads with lower priority in the same ClusterQueue.
`LowerOrNewerEqualPriority`	A pending workload can preempt active workloads with lower priority, or with equal priority that are newer than the pending workload.
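
The priority compared by `LowerPriority` and `LowerOrNewerEqualPriority` is the priority of the workload. One way to assign it in upstream Kueue is a WorkloadPriorityClass referenced by a label on the job. The sketch below assumes that mechanism is available in your installation and that WorkloadPriorityClass is served under the same API version as the ClusterQueue examples. The names `high-priority`, `urgent-training`, `team-a`, and `team-a-local-queue`, the priority value, and the container image are hypothetical.

```yaml
# Assumption: WorkloadPriorityClass is served at kueue.x-k8s.io/v1beta2,
# matching the ClusterQueue examples in this document.
apiVersion: kueue.x-k8s.io/v1beta2
kind: WorkloadPriorityClass
metadata:
  name: high-priority            # hypothetical name
value: 10000                     # higher value means higher priority
description: "Workloads that may preempt lower-priority workloads"
---
apiVersion: batch/v1
kind: Job
metadata:
  name: urgent-training          # hypothetical name
  namespace: team-a              # hypothetical namespace
  labels:
    # LocalQueue the job is submitted to (hypothetical name).
    kueue.x-k8s.io/queue-name: team-a-local-queue
    # Priority class used by Kueue when comparing workloads.
    kueue.x-k8s.io/priority-class: high-priority
spec:
  suspend: true                  # Kueue unsuspends the job on admission
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: main
        image: registry.example.com/trainer:latest   # hypothetical image
        resources:
          requests:
            cpu: "4"
            memory: 8Gi
```

With `withinClusterQueue: LowerPriority` on the backing ClusterQueue, a job like this one could preempt admitted workloads of lower priority in the same ClusterQueue when it does not otherwise fit.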

`reclaimWithinCohort`

Controls whether a pending workload can preempt workloads in other ClusterQueues of the same cohort to reclaim nominal quota that is being borrowed.

Value	Description
`Never`	(Default) No preemption to reclaim borrowed resources.
`Any`	A pending workload can preempt workloads in other ClusterQueues of the cohort, regardless of priority.
`LowerPriority`	A pending workload can only preempt lower-priority workloads in other ClusterQueues of the cohort.
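
As an illustration of reclaiming, the sketch below places two ClusterQueues in one cohort. It assumes that the cohort is set with spec.cohortName in kueue.x-k8s.io/v1beta2 (the field is spec.cohort in v1beta1), so verify the field name against your installed CRD. The cohort name `research` and the queue `team-b-queue` are hypothetical.

```yaml
# Assumption: spec.cohortName is the v1beta2 field for cohort membership.
apiVersion: kueue.x-k8s.io/v1beta2
kind: ClusterQueue
metadata:
  name: team-a-queue
spec:
  cohortName: research           # hypothetical cohort name
  namespaceSelector: {}
  resourceGroups:
  - coveredResources: ["cpu"]
    flavors:
    - name: "default-flavor"
      resources:
      - name: "cpu"
        nominalQuota: 16
  preemption:
    # team-a may evict borrowing workloads of any priority to get
    # its nominal quota back.
    reclaimWithinCohort: Any
---
apiVersion: kueue.x-k8s.io/v1beta2
kind: ClusterQueue
metadata:
  name: team-b-queue             # hypothetical queue
spec:
  cohortName: research
  namespaceSelector: {}
  resourceGroups:
  - coveredResources: ["cpu"]
    flavors:
    - name: "default-flavor"
      resources:
      - name: "cpu"
        nominalQuota: 16
  # No preemption block: all settings default to Never.
```

If workloads in `team-b-queue` use more than 16 CPUs by borrowing idle quota from `team-a-queue`, a later workload in `team-a-queue` that fits within its own nominal quota can preempt those borrowing workloads. With `LowerPriority` instead of `Any`, only borrowing workloads of lower priority would be candidates.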

`borrowWithinCohort`

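Controls whether a pending workload can preempt workloads in other ClusterQueues of the same cohort when admitting it requires borrowing quota from the cohort. The value is set in the policy subfield (borrowWithinCohort.policy).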

Value	Description
`Never`	(Default) No preemption to borrow resources.
`LowerPriority`	A pending workload can preempt lower-priority workloads in other ClusterQueues of the cohort to borrow resources.
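
Upstream Kueue also defines an optional maxPriorityThreshold subfield that narrows which workloads a borrowing workload can preempt to those with priority less than or equal to the threshold. The fragment below is a sketch that assumes this subfield is available in your installed version; the threshold value is hypothetical.

```yaml
# Fragment of a ClusterQueue spec, not a complete object.
preemption:
  reclaimWithinCohort: Any
  borrowWithinCohort:
    policy: LowerPriority
    # Assumption: maxPriorityThreshold is supported by the installed CRD.
    # Only workloads with priority <= 100 can be preempted for borrowing.
    maxPriorityThreshold: 100    # hypothetical value
```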

Non-preemptive queues

Some workload types, such as interactive notebook sessions, are not suspendable. These workloads should only be assigned to a local queue backed by a non-preemptive ClusterQueue.

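A non-preemptive ClusterQueue keeps the default preemption settings (all set to Never), so its pending workloads never preempt running ones. The ClusterQueue in this example also does not belong to a cohort, so its workloads cannot borrow quota that another ClusterQueue could later reclaim: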

apiVersion: kueue.x-k8s.io/v1beta2
kind: ClusterQueue
metadata:
  name: non-preemptive-queue
spec:
  namespaceSelector: {}
  resourceGroups:
  - coveredResources: ["cpu", "memory"]
    flavors:
    - name: "default-flavor"
      resources:
      - name: "cpu"
        nominalQuota: 16
      - name: "memory"
        nominalQuota: 64Gi
  preemption:
    withinClusterQueue: Never
    reclaimWithinCohort: Never
    borrowWithinCohort:
      policy: Never

Note: If you assign a non-suspendable workload (such as a Notebook) to a preemptive queue, the workload might be preempted and fail because it cannot be gracefully suspended and resumed.
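
To route non-suspendable workloads to this ClusterQueue, a LocalQueue in the workload's namespace can reference it. The following is a minimal sketch; the LocalQueue name `notebooks` and the namespace `team-a` are hypothetical.

```yaml
apiVersion: kueue.x-k8s.io/v1beta2
kind: LocalQueue
metadata:
  name: notebooks                # hypothetical name
  namespace: team-a              # hypothetical namespace
spec:
  # Backed by the non-preemptive ClusterQueue defined above.
  clusterQueue: non-preemptive-queue
```

Workloads in that namespace would then select the queue with the `kueue.x-k8s.io/queue-name: notebooks` label.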
