RoleBasedGroup (RBG)

RoleBasedGroup (RBG) is a Kubernetes API for orchestrating distributed, stateful AI inference workloads with multi-role collaboration and built-in service discovery.

It provides a common deployment pattern for production LLM inference, especially disaggregated architectures such as prefill/decode separation.

Why RBG?

Traditional Kubernetes primitives (e.g. plain StatefulSets / Deployments) are ill-suited for LLM inference services that:

run as multi-role topologies (gateway / router / prefill / decode),
depend on GPU and network topology for performance,
and require atomic, cross-role operations (deploy, upgrade, scale, failover).

RBG treats an inference service as a role-based group, not a loose set of workloads. It models the service as a topology-aware, stateful, coordinated set of roles and manages it as a single unit.

Key Concepts

Role

The basic scheduling and rollout unit. Each role (e.g. prefill, decode) has its own spec, lifecycle and policies.

RoleBasedGroup

A group of roles that together form one logical service (e.g. one LLM inference deployment).
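The relationship between the two concepts can be sketched as a small data model. This is only an illustration and not the actual RBG API schema: the field names (name, replicas, ready_replicas) and the readiness rule are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Role:
    """One role in the group, e.g. prefill or decode (hypothetical fields)."""
    name: str
    replicas: int
    ready_replicas: int = 0

    def is_ready(self) -> bool:
        return self.ready_replicas >= self.replicas


@dataclass
class RoleBasedGroup:
    """A set of roles that together form one logical service."""
    name: str
    roles: List[Role] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Assumed rule for this sketch: role names are unique within a group.
        names = [r.name for r in self.roles]
        if len(names) != len(set(names)):
            raise ValueError("role names must be unique within a group")

    def is_ready(self) -> bool:
        # The group is treated as one unit: it is ready only when every role is.
        return bool(self.roles) and all(r.is_ready() for r in self.roles)


group = RoleBasedGroup(
    name="llm-inference",
    roles=[
        Role(name="prefill", replicas=2, ready_replicas=2),
        Role(name="decode", replicas=4, ready_replicas=3),
    ],
)
print(group.is_ready())  # False: decode still has one replica not ready
```

The point of the sketch is that status is evaluated at the group level, so a partially ready role keeps the whole service from being reported as ready.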

Project Status

RBG Version	Kubernetes Version	LeaderWorkerSet Version
main	>=v1.22.x	>=v0.7.0
v0.4.0	>=v1.28.x	>=v0.7.0
v0.3.0	>=v1.28.x	>=v0.6.0
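As a rough aid for reading the table, the minimum versions can be checked programmatically. The sketch below is not part of RBG; the helper names (parse_version, is_supported) are hypothetical, and it assumes plain version strings such as v1.28.3 without vendor or pre-release suffixes.

```python
# Minimum versions transcribed from the table above:
# RBG version -> (min Kubernetes (major, minor), min LeaderWorkerSet (major, minor, patch))
MIN_VERSIONS = {
    "main": ((1, 22), (0, 7, 0)),
    "v0.4.0": ((1, 28), (0, 7, 0)),
    "v0.3.0": ((1, 28), (0, 6, 0)),
}


def parse_version(text: str) -> tuple:
    """Turn 'v1.28.3' or '0.7.0' into a tuple of ints such as (1, 28, 3)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def is_supported(rbg: str, kubernetes: str, lws: str) -> bool:
    """Return True if both versions meet the minimums listed for this RBG version."""
    min_k8s, min_lws = MIN_VERSIONS[rbg]
    # Only major.minor is compared for Kubernetes, since the table lists 'x' as the patch.
    return parse_version(kubernetes)[:2] >= min_k8s and parse_version(lws) >= min_lws


print(is_supported("v0.4.0", "v1.29.1", "v0.7.0"))  # True
print(is_supported("v0.4.0", "v1.27.5", "v0.7.0"))  # False: Kubernetes is too old
print(is_supported("v0.3.0", "v1.28.0", "v0.5.1"))  # False: LeaderWorkerSet is too old
```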
