SGLang PD-Disaggregated Deployment

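Deploy SGLang inference with prefill/decode (PD) disaggregation. The prefill and decode stages run as separate roles in a single RoleBasedGroup, each with its own workload type and replica count. In the manifest below, the roleSpec bodies are placeholders for the per-role configuration.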

apiVersion: rbgs.sgl-project.dev/v1
kind: RoleBasedGroup
metadata:
  name: sglang-pd
spec:
  roles:
    - name: prefill
      workloadType: StatefulSet
      replicas: 2
      roleSpec:
        # Prefill configuration
    - name: decode
      workloadType: LeaderWorkerSet
      replicas: 4
      roleSpec:
        # Decode configuration
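
If the prefill-to-decode ratio needs to vary between environments, the manifest above could be generated rather than edited by hand. The following is a minimal sketch, not part of the RBG or SGLang tooling: `build_pd_manifest` and `replicas_by_role` are hypothetical helper names, and the empty `roleSpec` dicts are placeholders that mirror the elided sections above.

```python
import json


def build_pd_manifest(prefill_replicas: int = 2, decode_replicas: int = 4) -> dict:
    """Build the RoleBasedGroup manifest shown above as a plain dict.

    The roleSpec values are left empty on purpose: the source manifest
    elides them, so they must be filled in before deployment.
    """
    if prefill_replicas < 1 or decode_replicas < 1:
        raise ValueError("each role needs at least one replica")
    return {
        "apiVersion": "rbgs.sgl-project.dev/v1",
        "kind": "RoleBasedGroup",
        "metadata": {"name": "sglang-pd"},
        "spec": {
            "roles": [
                {
                    "name": "prefill",
                    "workloadType": "StatefulSet",
                    "replicas": prefill_replicas,
                    "roleSpec": {},  # placeholder: prefill configuration
                },
                {
                    "name": "decode",
                    "workloadType": "LeaderWorkerSet",
                    "replicas": decode_replicas,
                    "roleSpec": {},  # placeholder: decode configuration
                },
            ]
        },
    }


def replicas_by_role(manifest: dict) -> dict:
    """Return a {role name: replica count} summary of the manifest."""
    return {role["name"]: role["replicas"] for role in manifest["spec"]["roles"]}


if __name__ == "__main__":
    manifest = build_pd_manifest()
    # Serialize as JSON; the roleSpec placeholders still need real content.
    print(json.dumps(manifest, indent=2))
    print(replicas_by_role(manifest))
```

Because kubectl accepts JSON as well as YAML, the printed manifest can be saved to a file and applied once the roleSpec placeholders are replaced with real prefill and decode configuration.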

[Figure: rbg-pd]

View the complete example: SGLang PD-Disagg