
Quick Start

RoleBasedGroup (RBG) is a custom resource that models a group of roles and the relationships between them. Each role represents a workload type and a set of pods.

Conceptual View

[Figure: RBG conceptual view]

Key Features

PD Colocation

When a request arrives at an LLM inference engine, the system first processes the user input to generate the first token (the prefill phase), then generates the remaining output tokens one at a time, autoregressively (the decode phase).

[Figure: PD colocation]
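The two phases can be illustrated with a minimal Python sketch. It is not RBG code or any real engine's API: `toy_next_token`, `prefill`, `decode`, and `generate` are hypothetical names, the "model" is a toy arithmetic function, and the "cache" is simply a list of token IDs standing in for a KV cache.

```python
from typing import List, Tuple

# Assumption: a plain list of token IDs stands in for a real KV cache.
Cache = List[int]


def toy_next_token(cache: Cache) -> int:
    """Hypothetical stand-in for a model forward pass over the cached context."""
    return (sum(cache) + len(cache)) % 50


def prefill(prompt: List[int]) -> Tuple[Cache, int]:
    """Prefill: process the whole prompt at once, build the cache, emit the first token."""
    cache = list(prompt)
    first_token = toy_next_token(cache)
    return cache, first_token


def decode(cache: Cache, first_token: int, max_new_tokens: int, eos: int = 0) -> List[int]:
    """Decode: extend the output one token at a time, reusing and growing the cache."""
    output = [first_token]
    while len(output) < max_new_tokens and output[-1] != eos:
        cache.append(output[-1])          # the newest token joins the context
        output.append(toy_next_token(cache))
    return output


def generate(prompt: List[int], max_new_tokens: int = 8) -> List[int]:
    """Run prefill and decode back to back, sharing one cache."""
    cache, first_token = prefill(prompt)
    return decode(cache, first_token, max_new_tokens)


if __name__ == "__main__":
    print(generate([3, 7, 11], max_new_tokens=4))
```

In this sketch both phases run in the same process and share a single cache, which roughly corresponds to a colocated arrangement. In a disaggregated setup, the cache produced by `prefill` would instead have to be handed over to a separate decode worker.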

Example Deployments

Single-Node Examples

Multi-Node Examples

PD-Disaggregated Examples