Configuring workload management with Kueue
As an administrator, you can configure Alauda Build of Kueue to manage workloads in your cluster. This procedure describes the end-to-end process for setting up Kueue-based workload management.
Prerequisites
- You have cluster administrator permissions.
- The Alauda Container Platform Web CLI can communicate with your cluster.
Procedure
1. Install Alauda Build of Kueue
Install the Alauda Build of Kueue cluster plugin. See Install for detailed instructions.
Verify that the installation succeeded before you continue.
2. Configure Kueue resources
Create the required Kueue resources to enable quota management:
- Configure ResourceFlavor objects to represent the different node types in your cluster. For GPU nodes, add node labels and tolerations to the flavor. See Configuring quotas for more details.
- Configure ClusterQueue objects to define resource quotas and admission rules.
- Configure LocalQueue objects in each namespace that requires Kueue-managed workloads.
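The three resource types above can be sketched as one multi-document manifest. This is a minimal illustration, not the product's reference configuration. It assumes the upstream Kueue `kueue.x-k8s.io/v1beta1` API version. All object names (`default-flavor`, `gpu-flavor`, `cluster-queue`, `team-a-queue`), the `team-a` namespace, the node label, the taint key, and the quota values are hypothetical placeholders to replace with values from your cluster.

```yaml
# Flavor for general-purpose nodes: no labels or tolerations needed.
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: default-flavor            # hypothetical name
---
# Flavor for GPU nodes: workloads admitted with this flavor are steered to
# nodes carrying the label and tolerate the GPU taint.
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: gpu-flavor                # hypothetical name
spec:
  nodeLabels:
    example.com/accelerator: gpu  # hypothetical label; use your node labels
  tolerations:
  - key: nvidia.com/gpu           # assumed taint key on your GPU nodes
    operator: Exists
    effect: NoSchedule
---
# Cluster-wide quota: CPU and memory come from default-flavor,
# GPUs come from gpu-flavor.
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: cluster-queue             # hypothetical name
spec:
  namespaceSelector: {}           # empty selector: accept all namespaces
  resourceGroups:
  - coveredResources: ["cpu", "memory"]
    flavors:
    - name: default-flavor
      resources:
      - name: cpu
        nominalQuota: 8           # placeholder quota
      - name: memory
        nominalQuota: 32Gi        # placeholder quota
  - coveredResources: ["nvidia.com/gpu"]
    flavors:
    - name: gpu-flavor
      resources:
      - name: nvidia.com/gpu
        nominalQuota: 2           # placeholder quota
---
# Namespaced entry point that users submit workloads to.
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  name: team-a-queue              # hypothetical name
  namespace: team-a               # hypothetical namespace; must already exist
spec:
  clusterQueue: cluster-queue
```

Each resource name may appear in only one resource group of a ClusterQueue, which is why CPU and memory are grouped separately from GPUs in this sketch.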
3. Set up RBAC
Configure role-based access control for batch administrators and users. See Setup RBAC.
4. Set up project namespaces
For each project namespace that should use Kueue:
- Create a project and namespace in Alauda Container Platform.
- Switch to Alauda AI, click Namespace Manage in Admin > Management Namespace, and select the previously created namespace to place it under management.
- Create a LocalQueue in the namespace pointing to the appropriate ClusterQueue.
- (Optional) Create a default local queue to automatically manage all workloads in the namespace. See Managing jobs and label policies.
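For the optional last step, a default local queue might look like the sketch below. In upstream Kueue, the queue-defaulting feature looks for a LocalQueue named `default` in the namespace. Whether Alauda Build of Kueue behaves the same way, and whether the feature is enabled, are assumptions to confirm in Managing jobs and label policies. The `team-a` namespace and the `cluster-queue` name are hypothetical.

```yaml
# Sketch of a default LocalQueue, assuming upstream Kueue semantics:
# workloads in this namespace that carry no queue-name label are assigned
# to the LocalQueue named "default".
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  name: default            # the name upstream queue defaulting looks for
  namespace: team-a        # hypothetical project namespace
spec:
  clusterQueue: cluster-queue   # hypothetical ClusterQueue name
```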
5. Verify the configuration
- Verify that the ClusterQueue is active. The ClusterQueue should show as Active.
- Verify that the LocalQueues are connected.
- Submit a test workload to verify admission.
- Verify that the workload was admitted.
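For the test workload in the steps above, a small batch Job is usually enough. The following is a minimal sketch, not a prescribed test: the Job name, the `team-a` namespace, the `team-a-queue` LocalQueue, the container image, and the resource requests are all hypothetical. The image also assumes your cluster can pull from a public registry.

```yaml
# Minimal test Job submitted through a LocalQueue.
apiVersion: batch/v1
kind: Job
metadata:
  name: kueue-test-job                        # hypothetical name
  namespace: team-a                           # hypothetical namespace
  labels:
    kueue.x-k8s.io/queue-name: team-a-queue   # LocalQueue to submit to
spec:
  suspend: true          # created suspended; Kueue unsuspends it on admission
  parallelism: 1
  completions: 1
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: sleep
        image: busybox:1.36                   # assumed to be pullable
        command: ["sh", "-c", "sleep 30"]
        resources:
          requests:                           # requests are counted against quota
            cpu: 100m
            memory: 64Mi
```

If the requests fit within the ClusterQueue quota, Kueue should create a Workload object for the Job, mark it as admitted, and unsuspend the Job so that its pod starts.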
Next steps
- Configure fair sharing and cohorts for multi-team resource sharing.
- Configure preemption policies for workload priority management.
- Configure borrowing and lending limits for controlled resource sharing.
- See Running jobs with Kueue for submitting various workload types.