Troubleshooting
Start with:
kt version
kt doctor
Install Issues
| Symptom | Check |
|---|---|
| kt command not found | Confirm the environment where kt-kernel was installed is active. |
| Import or wheel error | Confirm Python version and Linux x86-64 environment. |
| CUDA-related failure | Confirm driver, PyTorch CUDA variant, and GPU architecture. |
| AMX kernel not selected | Check lscpu flags and BIOS/kernel AMX support. |
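
The platform and AMX checks in the table can be scripted. The following is a minimal sketch, not part of kt-kernel: the helper names `amx_flags_present` and `platform_summary` are hypothetical, and the flag names `amx_tile`, `amx_bf16`, and `amx_int8` are the ones the Linux kernel reports in /proc/cpuinfo and lscpu. Which of them the AMX kernel actually requires is not stated here, so treat a missing flag as a hint rather than a verdict.

```python
import platform
import sys

# AMX feature flags as reported by the Linux kernel (/proc/cpuinfo, lscpu).
AMX_FLAGS = ("amx_tile", "amx_bf16", "amx_int8")


def amx_flags_present(cpuinfo_text):
    """Return {flag: bool} for each AMX flag, using the first 'flags' line found."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # /proc/cpuinfo writes "flags", lscpu writes "Flags"; accept both.
        if key.strip().lower() == "flags":
            flags = set(value.split())
            break
    return {name: name in flags for name in AMX_FLAGS}


def platform_summary():
    """Collect the interpreter and platform facts the install table asks about."""
    return {
        "python": platform.python_version(),
        "system": platform.system(),    # "Linux" on a supported host
        "machine": platform.machine(),  # "x86_64" on a supported host
    }


if __name__ == "__main__":
    print(platform_summary())
    try:
        with open("/proc/cpuinfo", encoding="utf-8") as fh:
            print(amx_flags_present(fh.read()))
    except OSError as exc:
        print(f"could not read /proc/cpuinfo: {exc}", file=sys.stderr)
```

If every AMX flag is reported as absent on hardware that should support it, the BIOS setting or kernel version is the next thing to check, as the table notes.
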
Serving Issues
| Symptom | Check |
|---|---|
| Server starts but model loading fails | Verify --model-path, --kt-weight-path, and --kt-method match. |
| Output format is unexpected | Check chat template, parser options, and served model name. |
| OOM during startup | Lower GPU expert count or token limits; confirm model-specific memory assumptions. |
| Slow prefill | Check method, CPU backend, NUMA settings, and layerwise prefill threshold. |
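
For model-loading failures, a quick pre-launch check can rule out missing or empty directories before looking at whether the method matches the weights. This is a rough sketch under two assumptions: both --model-path and --kt-weight-path point at local directories (not a remote model ID), and the helper name `check_launch_paths` is hypothetical. It does not verify that the weights match --kt-method.

```python
from pathlib import Path


def check_launch_paths(model_path, kt_weight_path):
    """Return a list of problems; an empty list means both paths exist and are non-empty."""
    problems = []
    for flag, raw in (("--model-path", model_path), ("--kt-weight-path", kt_weight_path)):
        path = Path(raw).expanduser()
        if not path.exists():
            problems.append(f"{flag}: {path} does not exist")
        elif path.is_dir() and not any(path.iterdir()):
            problems.append(f"{flag}: {path} is an empty directory")
    return problems
```

Calling it with the same two values passed on the command line and printing the returned list gives a first filter before reading the loader logs.
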
Fine-Tuning Issues
| Symptom | Check |
|---|---|
| KT backend not enabled | Confirm use_kt: true in training YAML and kt_config.enabled: true in Accelerate config. |
| Backend mismatch | Match kt_backend with BF16 or converted INT8/INT4 expert weights. |
| LLaMA-Factory cannot import KT packages | Confirm pip install -r requirements/ktransformers.txt was run in the LLaMA-Factory environment. |
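
The two switches in the first row can be checked together once both config files are parsed into dictionaries (for example with PyYAML's `yaml.safe_load`). The sketch below assumes `use_kt` is a top-level key in the training YAML and that `kt_config.enabled` means an `enabled` key nested under a top-level `kt_config` mapping in the Accelerate config; if your files nest these differently, adjust the dotted key. The helper names are hypothetical.

```python
def get_dotted(config, dotted_key, default=None):
    """Walk a nested dict along a dotted key such as 'kt_config.enabled'."""
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node


def kt_backend_problems(training_cfg, accelerate_cfg):
    """Return a list of reasons the KT backend would stay disabled (empty if none)."""
    problems = []
    # Require a real boolean true; a string like "true" is treated as a problem.
    if training_cfg.get("use_kt") is not True:
        problems.append("training YAML: use_kt is not true")
    if get_dotted(accelerate_cfg, "kt_config.enabled") is not True:
        problems.append("Accelerate config: kt_config.enabled is not true")
    return problems
```

An empty result only confirms that both switches are set; it says nothing about whether kt_backend matches the expert weight format.
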
Escalation Data
When filing an issue, include the runtime tuple, full launch command, package versions, hardware summary, and the first blocking log lines.
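
Part of that report can be gathered automatically. The sketch below is one possible approach, not a tool shipped with kt-kernel: `build_report` and `package_versions` are hypothetical names, and the default package list (`kt-kernel`, `torch`) is an assumption to extend with whatever else is installed in your environment. It covers package versions and a basic platform summary only; GPU and driver details, the runtime tuple, and log lines still need to be added by hand.

```python
import json
import platform
from importlib import metadata


def package_versions(names):
    """Map each distribution name to its installed version, or 'not installed'."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


def build_report(launch_command, packages=("kt-kernel", "torch")):
    """Assemble the machine-collectable part of an issue report."""
    return {
        "launch_command": launch_command,
        "python": platform.python_version(),
        "platform": platform.platform(),
        "machine": platform.machine(),
        "packages": package_versions(packages),
    }


if __name__ == "__main__":
    report = build_report("<paste the full launch command here>")
    print(json.dumps(report, indent=2))
```

Pasting the JSON output alongside the output of kt version and kt doctor keeps the report in one place.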