KTransformers

Installation

KTransformers currently offers two public install paths, one for inference and one for fine-tuning. Choose the one that matches your task.

Task         Packages              Use when
Inference    kt-kernel sglang-kt   You want to run a model server with kt run or SGLang-KT.
Fine-tuning  ktransformers[sft]    You want to run MoE LoRA SFT through LLaMA-Factory.

Inference Install

pip install kt-kernel sglang-kt

kt-kernel provides the KT MoE expert backend. sglang-kt provides the SGLang serving path used by current KTransformers inference docs.

Verify the command-line tools:

kt version
kt doctor
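
If you want to run the same verification from a setup script, the following is a minimal Python sketch. It assumes only that the kt executable is on PATH after installation and that a non-zero exit code signals a problem, which this page does not confirm. The helper name run_checks is hypothetical and is not part of KTransformers.

```python
import shutil
import subprocess

# The two verification commands from the text above.
KT_CHECKS = (("kt", "version"), ("kt", "doctor"))


def run_checks(checks=KT_CHECKS):
    """Run each command and return {command string: exit code}.

    The value is None when the executable is not found on PATH.
    """
    results = {}
    for argv in checks:
        label = " ".join(argv)
        # shutil.which returns None when the executable cannot be found.
        if shutil.which(argv[0]) is None:
            results[label] = None
            continue
        # Output goes straight to the terminal; only the exit code is kept.
        results[label] = subprocess.run(list(argv)).returncode
    return results
```

A value of None usually means that the environment where you ran pip install is not the active one.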

Then continue with First inference server.

Fine-Tuning Install

For LLaMA-Factory-based SFT:

cd /path/to/LLaMA-Factory
pip install -e .
pip install -r requirements/ktransformers.txt

requirements/ktransformers.txt should contain the public KT SFT entry:

ktransformers[sft]
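
To confirm that a checkout carries this entry before installing, a small check along these lines may help. It is a sketch with simplified parsing: it ignores comments and version specifiers and matches the name and extras exactly. The function name has_requirement is hypothetical.

```python
import re


def has_requirement(requirements_text, wanted="ktransformers[sft]"):
    """Return True if a non-comment line names the wanted entry."""
    target = wanted.replace(" ", "").lower()
    for raw in requirements_text.splitlines():
        # Drop inline comments and surrounding whitespace.
        entry = raw.split("#", 1)[0].strip()
        if not entry:
            continue
        # Cut off any version specifier, environment marker, or direct URL.
        name = re.split(r"[<>=!~;@]", entry, maxsplit=1)[0]
        if name.replace(" ", "").lower() == target:
            return True
    return False
```

From the LLaMA-Factory root, you could pass it the contents of requirements/ktransformers.txt, for example via pathlib.Path("requirements/ktransformers.txt").read_text().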

This installs ktransformers and its SFT dependencies, including kt-kernel, transformers-kt, and accelerate-kt. It does not install sglang-kt, because sglang-kt is inference-only.
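
One way to check the result of the install is to query the installed distribution metadata from Python. The sketch below uses the distribution names listed above; the names SFT_PACKAGES, installed_versions, and missing_packages are hypothetical helpers for illustration, not part of any KTransformers package.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the paragraph above; adjust if your install differs.
SFT_PACKAGES = ("ktransformers", "kt-kernel", "transformers-kt", "accelerate-kt")


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def missing_packages(names=SFT_PACKAGES):
    """Return the names from `names` that are not installed."""
    return [name for name, ver in installed_versions(names).items() if ver is None]
```

In an SFT-only environment, missing_packages() should return an empty list, while installed_versions(["sglang-kt"]) reporting None is expected rather than an error.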

Then continue with First LoRA SFT run.

Before Running a Model

Check the Support Matrix before assuming that a model tutorial reflects current support. Older pages may still exist for traceability, but the old local_chat.py, ktransformers/server/main.py, balance_serve, and kt_optimize_rule paths are legacy unless a page has been explicitly rewritten.
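
When reviewing an older tutorial or a saved launch script, a plain substring scan can flag these legacy names. This is a rough sketch: a match only suggests that the page predates the current install paths and does not prove that the page is unsupported. The helper name find_legacy_markers is hypothetical.

```python
# Legacy names listed in the paragraph above.
LEGACY_MARKERS = (
    "local_chat.py",
    "ktransformers/server/main.py",
    "balance_serve",
    "kt_optimize_rule",
)


def find_legacy_markers(text, markers=LEGACY_MARKERS):
    """Return the legacy markers that occur in `text`, in the order listed."""
    return [marker for marker in markers if marker in text]
```

An empty result means none of the listed names appear in the text. It says nothing about whether the model itself is in the Support Matrix.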

Useful follow-up pages: