KTransformers

Popular Model Usage

Use this page as the model-family router. Exact support claims live in the Support Matrix; individual model pages may still need a runtime smoke test before their status is upgraded.

| Model family | Current entry | Notes |
| --- | --- | --- |
| DeepSeek V4-Flash | kt run deepseek-v4-flash or manual MXFP4 launch | Narrow path; verify package and attention constraints before publishing production claims. |
| DeepSeek V3.2 | kt run deepseek-v3.2 for registry FP8; tutorial path still mentions AMXINT4 | Needs reconciliation between registry default and tutorial method. |
| DeepSeek V3 / R1 | kt run deepseek-v3, kt run deepseek-r1 | Current registry exists; older DeepSeek pages using legacy servers should not be copied. |
| Kimi K2 Thinking | kt run kimi-k2-thinking or RAWINT4 manual launch | Keep backend-specific behavior conservative until smoke is recorded. |
| MiniMax M2 / M2.1 | kt run m2, kt run m2.1 | Registry includes parser defaults and tensor-parallel constraints. |
| MiniMax M2.5 | Manual SGLang-KT tutorial | Needs smoke before being treated like a registry model. |
| Qwen3 / Qwen3.5 / Qwen3-Coder-Next | Manual BF16, FP8, or GPTQ_INT4 examples | Choose method by exact checkpoint and CPU backend. |
| GLM-5 / GLM-5.1 | Manual BF16, FP8, or FP8_PERCHANNEL examples | Treat transformer version constraints as part of the support tuple. |

Rule for Copying Commands

Only copy a command when all of these match:

model family + checkpoint + KT method + CPU ISA/backend + GPU count + package versions

If any field changes, treat the command as a starting point and rerun the smoke checklist.
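
The matching rule above can be sketched as a small comparison helper. This is only an illustration, not part of KTransformers: `SupportTuple`, `changed_fields`, and `can_copy_verbatim` are hypothetical names, and the checkpoint, backend, and package values are placeholders rather than tested configurations.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class SupportTuple:
    """Hypothetical record of the fields that must all match before copying a command."""
    model_family: str
    checkpoint: str
    kt_method: str
    cpu_backend: str
    gpu_count: int
    package_versions: tuple  # pairs of (package name, version), kept sorted


def changed_fields(documented: SupportTuple, local: SupportTuple) -> list:
    """Return the names of the fields where the local setup differs from the documented one."""
    return [
        f.name
        for f in fields(SupportTuple)
        if getattr(documented, f.name) != getattr(local, f.name)
    ]


def can_copy_verbatim(documented: SupportTuple, local: SupportTuple) -> bool:
    """A command is safe to copy only when no field differs."""
    return not changed_fields(documented, local)


# Placeholder values for illustration only; they do not describe a verified setup.
documented = SupportTuple(
    model_family="Qwen3",
    checkpoint="example-checkpoint",
    kt_method="FP8",
    cpu_backend="example-backend",
    gpu_count=1,
    package_versions=(("example-package", "0.0.0"),),
)

# Same setup, but with a different GPU count.
local = replace(documented, gpu_count=2)

if not can_copy_verbatim(documented, local):
    print("Rerun the smoke checklist; changed fields:", changed_fields(documented, local))
```

Reporting which fields changed, rather than returning a bare yes or no, shows what the rerun of the smoke checklist should focus on.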