LoRA SFT with LLaMA-Factory
KTransformers fine-tuning is driven through LLaMA-Factory. Install LLaMA-Factory first, then install the KT SFT extra through LLaMA-Factory's KT requirements file:
cd /path/to/LLaMA-Factory
pip install -e .
pip install -r requirements/ktransformers.txt
The KT requirements file should contain:
ktransformers[sft]
Installed Components
ktransformers[sft] brings in the KT SFT stack:
| Package | Role |
|---|---|
| ktransformers | User-facing package and KT integration. |
| kt-kernel | CPU expert kernels and SFT backend implementations. |
| transformers-kt | KT-compatible Transformers integration. |
| accelerate-kt | KT-aware Accelerate config support. |
It does not install sglang-kt; that package belongs to inference serving.
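As a quick sanity check after installation, the sketch below reports which of the packages named above are present in the current environment. It assumes the names in the table are also the installed distribution names, which is not confirmed here; `installed_versions` is a hypothetical helper, not part of KTransformers or LLaMA-Factory.

```python
from importlib import metadata

# Names copied from the table above. Assumption: they match the installed
# distribution names. sglang-kt is listed only to confirm that the SFT
# extra did not pull it in.
SFT_PACKAGES = ("ktransformers", "kt-kernel", "transformers-kt", "accelerate-kt")
SERVING_ONLY = ("sglang-kt",)


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(SFT_PACKAGES + SERVING_ONLY).items():
        print(f"{name}: {version or 'not installed'}")
```

After a successful SFT install, the four SFT packages would be expected to show a version, while sglang-kt may legitimately be reported as not installed.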
Example Layout
Current examples live under examples/ktransformers/ in LLaMA-Factory:
examples/ktransformers/train_lora/*.yaml
examples/ktransformers/accelerate/fsdp2_kt_*.yaml
The training YAML enables KT with:
use_kt: true
The Accelerate config selects the KT backend with:
kt_config:
  enabled: true
  kt_backend: AMXBF16
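Because KT is switched on in two separate files, it can help to confirm that they agree before launching. The following is a minimal sketch that works on already-parsed YAML mappings. It assumes that `enabled` and `kt_backend` are nested under `kt_config`, and that the two flags are meant to be set together; neither point is confirmed by the examples themselves. `check_kt_settings` is a hypothetical helper, not a LLaMA-Factory or KTransformers API.

```python
def check_kt_settings(train_cfg, accel_cfg):
    """Return a list of problems found; an empty list means the files agree.

    train_cfg: mapping parsed from the training YAML (expects `use_kt`).
    accel_cfg: mapping parsed from the Accelerate YAML (expects `kt_config`).
    The key layout is an assumption based on the snippets above.
    """
    kt_cfg = accel_cfg.get("kt_config") or {}
    use_kt = bool(train_cfg.get("use_kt", False))
    enabled = bool(kt_cfg.get("enabled", False))
    backend = kt_cfg.get("kt_backend")

    problems = []
    if use_kt and not enabled:
        problems.append("use_kt is true but kt_config.enabled is not")
    if enabled and not use_kt:
        problems.append("kt_config.enabled is true but use_kt is not")
    if enabled and not backend:
        problems.append("kt_config.enabled is true but kt_backend is missing")
    return problems


if __name__ == "__main__":
    train = {"use_kt": True}
    accel = {"kt_config": {"enabled": True, "kt_backend": "AMXBF16"}}
    print(check_kt_settings(train, accel) or "KT settings look consistent")
```

In practice the two mappings would come from loading the training and Accelerate YAML files with a YAML parser such as PyYAML's `yaml.safe_load`.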
Launch Shape
Use the LLaMA-Factory KT examples as the starting point:
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch \
  --config_file examples/ktransformers/accelerate/fsdp2_kt_int8.yaml \
  src/train.py \
  examples/ktransformers/train_lora/qwen3_5moe_lora_sft_kt.yaml
The global mixed-precision setting for training is independent of the KT backend name. Current examples usually train in BF16 and select the KT expert backend separately through kt_config.
Before treating an example as production-ready, record the exact runtime tuple in the Runtime Smoke Checklist.