Popular Model Usage
Use this page as the model-family router. The authoritative support claim lives in the Support Matrix; an individual model page may still need a runtime smoke test before its status is upgraded.
| Model family | Current entry | Notes |
|---|---|---|
| DeepSeek V4-Flash | kt run deepseek-v4-flash or manual MXFP4 launch | Narrow path; verify package and attention constraints before publishing production claims. |
| DeepSeek V3.2 | kt run deepseek-v3.2 for registry FP8; tutorial path still mentions AMXINT4 | Needs reconciliation between registry default and tutorial method. |
| DeepSeek V3 / R1 | kt run deepseek-v3, kt run deepseek-r1 | Registry entries exist; do not copy commands from older DeepSeek pages that use legacy servers. |
| Kimi K2 Thinking | kt run kimi-k2-thinking or RAWINT4 manual launch | Keep backend-specific claims conservative until a smoke test is recorded. |
| MiniMax M2 / M2.1 | kt run m2, kt run m2.1 | Registry includes parser defaults and tensor-parallel constraints. |
| MiniMax M2.5 | Manual SGLang-KT tutorial | Needs a smoke test before being treated like a registry model. |
| Qwen3 / Qwen3.5 / Qwen3-Coder-Next | Manual BF16, FP8, or GPTQ_INT4 examples | Choose method by exact checkpoint and CPU backend. |
| GLM-5 / GLM-5.1 | Manual BF16, FP8, or FP8_PERCHANNEL examples | Treat transformer version constraints as part of the support tuple. |
Rule for Copying Commands
Only copy a command when all of these match:
model family + checkpoint + KT method + CPU ISA/backend + GPU count + package versions
If any field changes, treat the command as a starting point and rerun the smoke checklist.
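
The matching rule above can be sketched as a field-by-field comparison. This is only an illustration, not part of the kt tooling: the `SupportTuple` class, its field names, and the example values are all hypothetical, chosen to mirror the six fields listed in the rule.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class SupportTuple:
    """Hypothetical record of the conditions a command was verified under."""
    model_family: str
    checkpoint: str
    kt_method: str
    cpu_backend: str          # CPU ISA/backend
    gpu_count: int
    package_versions: tuple   # sorted (package, version) pairs


def mismatched_fields(recorded: SupportTuple, target: SupportTuple) -> list:
    """Return the names of fields that differ between the two tuples."""
    return [
        f.name
        for f in fields(recorded)
        if getattr(recorded, f.name) != getattr(target, f.name)
    ]


def can_copy_verbatim(recorded: SupportTuple, target: SupportTuple) -> bool:
    """A command is safe to copy as-is only when every field matches."""
    return not mismatched_fields(recorded, target)


if __name__ == "__main__":
    # Placeholder values only; they do not describe a real verified setup.
    recorded = SupportTuple(
        model_family="example-family",
        checkpoint="example-checkpoint",
        kt_method="FP8",
        cpu_backend="example-backend",
        gpu_count=2,
        package_versions=(("example-package", "1.0"),),
    )
    target = replace(recorded, gpu_count=4)

    print(can_copy_verbatim(recorded, target))   # False
    print(mismatched_fields(recorded, target))   # ['gpu_count']
```

Any non-empty result from `mismatched_fields` means the command is a starting point only, and the listed fields show what the rerun of the smoke checklist needs to cover.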