DPO Status
DPO is not yet documented as a supported KT fine-tuning path.
The current public fine-tuning section should stay focused on MoE LoRA SFT through LLaMA-Factory. DPO can be added only once the exact KT path has been confirmed with the following evidence:
| Item | Required evidence |
|---|---|
| Training entry | Current LLaMA-Factory command, not an old patching path. |
| KT backend | Explicit `kt_config` backend that maps to current KT SFT code. |
| Model | Exact checkpoint and prepared expert weights if needed. |
| Runtime | Smoke result on a named machine and environment. |
| Output | Adapter files and at least one minimal post-training sanity check. |
Until then, older DPO pages should be treated as historical references rather than user-facing instructions.
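
As an illustration of the "Output" row in the table above, the following is a minimal sketch of what a post-training check on adapter files could look like. It assumes the run writes PEFT-style files (`adapter_config.json` plus `adapter_model.safetensors` or `adapter_model.bin`); that layout has not been confirmed for a KT DPO run, and `check_adapter_dir` is a hypothetical helper, not part of KT or LLaMA-Factory.

```python
import json
from pathlib import Path

# Assumed PEFT-style file names; the actual KT DPO output layout is unconfirmed.
ADAPTER_CONFIG = "adapter_config.json"
ADAPTER_WEIGHTS = ("adapter_model.safetensors", "adapter_model.bin")


def check_adapter_dir(output_dir):
    """Return a list of problems found in output_dir; an empty list means the check passed."""
    root = Path(output_dir)
    problems = []

    # The adapter config must exist and parse as a JSON object.
    config_path = root / ADAPTER_CONFIG
    if not config_path.is_file():
        problems.append(f"missing {ADAPTER_CONFIG}")
    else:
        try:
            config = json.loads(config_path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            problems.append(f"{ADAPTER_CONFIG} is not valid JSON")
        else:
            if not isinstance(config, dict):
                problems.append(f"{ADAPTER_CONFIG} is not a JSON object")

    # At least one non-empty weight file must be present.
    weights = [root / name for name in ADAPTER_WEIGHTS if (root / name).is_file()]
    if not weights:
        problems.append("no adapter weight file found")
    elif all(path.stat().st_size == 0 for path in weights):
        problems.append("adapter weight file is empty")

    return problems
```

Calling `check_adapter_dir("path/to/output")` on a finished run and recording the returned list alongside the smoke result would cover only the file-level part of the evidence. It does not load the adapter or run inference, so a generation-level sanity check would still be needed.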