Support Matrix

Use this matrix to see which combinations of model, method, and hardware have been validated. A result applies directly only when the model family, checkpoint, KT method/backend, hardware class, and serving or training entry all match.

For label meanings, see Model Status Guide. Rerun a minimal serving or training check before relying on a result with different hardware, backends, or package versions.

Status

Status	Meaning
Current	Code entry exists and the documented interface matches the current repo.
Current, narrow	Supported under explicit model, hardware, package, or backend constraints.
Needs smoke	Code or docs exist, but the target combination should be revalidated before use in the current environment.
Needs reconciliation	Multiple current-looking paths exist and still need one consistent recommendation.
Not current support	Not a current KTransformers capability.
Supported direction	The project direction is valid, but model-specific docs need hardware validation.
Legacy	Depends on old `local_chat.py`, `ktransformers/server/main.py`, `balance_serve`, or `kt_optimize_rule` paths.

Inference Methods

Method	Status	Notes
`BF16`	Current	Native BF16 MoE expert inference for documented model paths.
`FP8`	Current	Native FP8 path used by DeepSeek, MiniMax, Qwen, and GLM-style pages.
`FP8_PERCHANNEL`	Current, narrow	Per-channel FP8; applies only to matching checkpoints.
`RAWINT4`	Current / Needs smoke	Kimi-style native INT4 path; backend behavior differs by CPU ISA.
`GPTQ_INT4`	Needs smoke	Current inference method, but not a universal INT4 recommendation.
`AMXINT4`	Current	AMX converted INT4 expert weights.
`AMXINT8`	Current	AMX converted INT8 expert weights.
`MXFP4`	Current, narrow	DeepSeek V4-Flash specific.
`LLAMAFILE`	Current, secondary	GGUF / llamafile compatibility backend.

Inference Models

Model / family	Precision	Entry	Status
DeepSeek V4-Flash	`MXFP4`	`kt run deepseek-v4-flash`	Needs smoke
DeepSeek V3.2	`FP8` registry; `AMXINT4` tutorial path	`kt run deepseek-v3.2` or model tutorial	Needs reconciliation
DeepSeek V3-0324 / R1-0528	`AMXINT4` registry default	`kt run deepseek-v3` / `kt run deepseek-r1`	Current / Needs docs
Kimi K2 Thinking	`RAWINT4`	`kt run kimi-k2-thinking`	Current / Needs smoke
Kimi K2.5	`RAWINT4`	Manual SGLang-KT tutorial	Needs smoke
MiniMax M2 / M2.1	`FP8`	`kt run m2` / `kt run m2.1`	Current / Needs smoke
MiniMax M2.5	`FP8`	Manual SGLang-KT tutorial	Needs smoke
Qwen3 / Qwen3.5 / Qwen3-Coder-Next	`BF16`, `FP8`, `GPTQ_INT4` examples	Model tutorials and AVX2 docs	Needs smoke
GLM-5 / GLM-5.1	`BF16`, `FP8`, `FP8_PERCHANNEL`	Model tutorials	Needs smoke
Ascend NPU old pages	Old server or `local_chat` paths	Historical docs	Not current support
Intel xPU old pages	Old server or old Docker paths	Historical docs	Not current support
ROCm old pages	Old `local_chat` paths	Historical docs	Legacy
AMD CPU path	AMD BLIS / CPU-side path	Hardware-specific docs after public validation	Supported direction / Needs AMD validation

Fine-Tuning Backends

Current KT SFT means MoE LoRA SFT through LLaMA-Factory. The KT backend name describes how the CPU-side experts are executed; it does not set the global mixed-precision mode for training.

KT backend	Actual method	Status	Notes
`AMXBF16`	`AMXBF16_SFT`	Current	Uses BF16 expert checkpoints.
`AMXINT8`	`AMXINT8_SFT`	Current	Uses prepared INT8 expert weights.
`AMXINT4`	`AMXINT4_SFT`	Current / Needs smoke	Requires a matching weight preparation path.
`AMX*_SkipLoRA`	SkipLoRA SFT variants	Advanced	Not the default quick-start path.
`AMXINT4_1` / `KGroup`	Historical enum-level names	Historical	Not exposed by the current public SFT backend map.
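
The backend-to-method relationship in the table above can be read as a simple lookup. The following is a minimal Python sketch that copies the table rows into a dictionary; `SFT_BACKENDS`, `HISTORICAL_NAMES`, and `resolve_sft_method` are hypothetical names for illustration and are not the project's actual backend map or API. The SkipLoRA variants are left out because the table does not list their exact method names.

```python
# Illustrative copy of the table rows above; NOT the project's real backend map.
SFT_BACKENDS = {
    "AMXBF16": "AMXBF16_SFT",
    "AMXINT8": "AMXINT8_SFT",
    "AMXINT4": "AMXINT4_SFT",
}

# Names the table marks as historical (hypothetical constant name).
HISTORICAL_NAMES = {"AMXINT4_1", "KGroup"}


def resolve_sft_method(backend: str) -> str:
    """Return the method name listed for a backend, or raise ValueError."""
    if backend in HISTORICAL_NAMES:
        raise ValueError(f"{backend!r} is a historical name, not a current SFT backend")
    try:
        return SFT_BACKENDS[backend]
    except KeyError:
        raise ValueError(f"{backend!r} is not listed in the SFT backend table") from None


if __name__ == "__main__":
    print(resolve_sft_method("AMXINT8"))
```

Rejecting the historical names explicitly, instead of letting them fall through as unknown keys, gives a more useful error to someone following an old guide.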

Fine-Tuning Models

Model / family	Example	Backend	Status
DeepSeek V2 Lite	`deepseek_v2_lora_sft_kt.yaml`	`AMXBF16`, `AMXINT8`, `AMXINT4`	Current / Needs smoke
DeepSeek V3-0324 BF16	`deepseek_v3_lora_sft_kt.yaml`	`AMXBF16`, `AMXINT8`, `AMXINT4`	Current / Needs smoke
Qwen3-235B-A22B	`qwen3moe_lora_sft_kt.yaml`	`AMXBF16`, `AMXINT8`, `AMXINT4`	Current / Needs smoke
Qwen3.5-397B-A17B	`qwen3_5moe_lora_sft_kt.yaml`	`AMXINT8` first; `AMXBF16` / `AMXINT4` as applicable	Needs smoke
Kimi K2 / K2.5 SFT	Old Kimi SFT guide	Old branch or optimize-rule path	Not current support
DPO	Old DPO tutorial	Historical path	Unconfirmed / Needs validation

Validation Scope

The status in each row applies to one specific combination of:

model family + checkpoint + KT method/backend + hardware class + serving/training entry + package/version caveat

If your model path, backend, hardware, or package versions differ, run a minimal validation before comparing throughput or training results.
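
The matching rule above can be sketched as a field-by-field comparison. This is a minimal Python sketch, assuming a hypothetical `Combination` record whose fields mirror the dimensions listed in this section; the class, the helper functions, and the placeholder values are illustrative and do not come from the KTransformers codebase.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class Combination:
    """Hypothetical record; one field per dimension of a matrix row."""
    model_family: str
    checkpoint: str
    kt_method: str
    hardware_class: str
    entry: str
    package_version: str


def mismatched_fields(validated: Combination, target: Combination) -> list[str]:
    """Return the names of the dimensions where target differs from validated."""
    return [
        f.name
        for f in fields(Combination)
        if getattr(validated, f.name) != getattr(target, f.name)
    ]


def needs_revalidation(validated: Combination, target: Combination) -> bool:
    """A result applies directly only when every dimension matches."""
    return bool(mismatched_fields(validated, target))


# Placeholder values marked below are invented for the example.
validated = Combination(
    model_family="Kimi K2 Thinking",
    checkpoint="example-checkpoint",       # placeholder
    kt_method="RAWINT4",
    hardware_class="example-hardware-a",   # placeholder
    entry="kt run kimi-k2-thinking",
    package_version="example-version",     # placeholder
)
target = replace(validated, hardware_class="example-hardware-b")

if __name__ == "__main__":
    print(mismatched_fields(validated, target))
    print(needs_revalidation(validated, target))
```

Reporting the mismatched dimensions, not just a yes/no answer, makes it easier to decide which minimal check to rerun.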