Integrate with LeaderWorkerSet
This page shows how to use Alauda Build of Kueue to manage LeaderWorkerSet (LWS) workloads. A LeaderWorkerSet is a Kubernetes API that enables a common deployment pattern where a single leader coordinates multiple workers.
LeaderWorkerSet is particularly useful for distributed inference and training scenarios where a leader process manages the execution of multiple worker processes.
Prerequisites
- You have installed the Alauda Build of Kueue.
- You have installed the LeaderWorkerSet controller.
- The LeaderWorkerSet framework is enabled in the Kueue configuration.
- The Alauda Container Platform Web CLI can communicate with your cluster.
- You have created a ClusterQueue, ResourceFlavor, and LocalQueue.
Procedure
- Create a LeaderWorkerSet resource with the kueue.x-k8s.io/queue-name label:
  - kueue.x-k8s.io/queue-name: Specifies the LocalQueue that manages this LeaderWorkerSet. Kueue admits all groups together.
  - replicas: 2: Creates 2 leader-worker groups, each managed independently.
  - size: 3: Each group consists of 1 leader and 2 workers (size = total pods including leader).
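
A minimal sketch of such a manifest, written to a file from the shell so that it can be applied in the next step. The file name `leaderworkerset.yaml`, the resource name `sample-lws`, the queue name `local-queue`, the container image, and the CPU requests are all placeholder assumptions; replace them with values that match your LocalQueue and workload.

```shell
# Write a sample LeaderWorkerSet manifest to leaderworkerset.yaml.
# All names, the image, and the resource requests are illustrative placeholders.
cat > leaderworkerset.yaml <<'EOF'
apiVersion: leaderworkerset.x-k8s.io/v1
kind: LeaderWorkerSet
metadata:
  name: sample-lws
  labels:
    # Must match the name of an existing LocalQueue in the same namespace.
    kueue.x-k8s.io/queue-name: local-queue
spec:
  replicas: 2            # number of leader-worker groups
  leaderWorkerTemplate:
    size: 3              # pods per group, including the leader
    leaderTemplate:
      spec:
        containers:
          - name: leader
            image: registry.k8s.io/nginx-slim:0.27
            resources:
              requests:
                cpu: "100m"
    workerTemplate:
      spec:
        containers:
          - name: worker
            image: registry.k8s.io/nginx-slim:0.27
            resources:
              requests:
                cpu: "200m"
EOF
```

Resource requests are set on every container so that Kueue can account for them against the quota of the ClusterQueue.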
- Apply the LeaderWorkerSet:
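
One way to do this, assuming the manifest was saved as `leaderworkerset.yaml` (a hypothetical file name) and that your current namespace contains the LocalQueue named in its label:

```shell
# Create or update the LeaderWorkerSet from the manifest file.
# leaderworkerset.yaml is a placeholder file name.
kubectl apply -f leaderworkerset.yaml
```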
- Monitor the admission:
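
Kueue tracks admission through Workload objects, so a reasonable way to watch progress is to list them. This sketch assumes the LeaderWorkerSet was created in your current namespace:

```shell
# List the Kueue Workload objects in the current namespace.
# The output reports, per Workload, the queue it belongs to and whether it has been admitted.
kubectl get workloads
```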
- Check the LeaderWorkerSet status:
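
A possible command for this step, where `sample-lws` is a placeholder for the name of your LeaderWorkerSet:

```shell
# Show a summary of the LeaderWorkerSet; sample-lws is a placeholder name.
kubectl get leaderworkerset sample-lws

# Print the full object, including the status section, for more detail.
kubectl get leaderworkerset sample-lws -o yaml
```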
- View individual pods:
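
The pods can be selected by the name label that the LeaderWorkerSet controller is expected to set on them. In this sketch, `sample-lws` is a placeholder for the name of your LeaderWorkerSet:

```shell
# List the leader and worker pods that belong to the LeaderWorkerSet.
# sample-lws is a placeholder name; -o wide also shows the node each pod runs on.
kubectl get pods -l leaderworkerset.sigs.k8s.io/name=sample-lws -o wide
```

With replicas: 2 and size: 3, the list should contain 6 pods once all groups are admitted: 2 leaders and 4 workers.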