CLI Reference
The `kt` command is the user-facing CLI provided by kt-kernel.
Main Commands
| Command | Role |
|---|---|
| `kt version` | Show version and environment information. |
| `kt doctor` | Diagnose environment issues. |
| `kt run` | Start a model inference server. |
| `kt chat` | Chat with a running model. |
| `kt model` | Manage model registry and model paths. |
| `kt config` | Manage CLI configuration. |
| `kt quant` | Quantize model weights where supported. |
| `kt bench` / `kt microbench` | Run benchmarks. |
| `kt sft` | Fine-tuning helper surface for LLaMA-Factory workflows. |
Common Inference Commands
kt version
kt doctor
kt model list
kt model search m2
kt run m2.1 --dry-run
kt run m2.1
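The last two commands above form a natural pair: preview the launch with `--dry-run`, then start the server. The following is a minimal Python sketch of that pattern, not part of kt-kernel. The function name `dry_run_then_launch` and the `runner` parameter are hypothetical, and the sketch assumes that `kt run --dry-run` exits with a non-zero status when something is wrong, which this page does not confirm.

```python
import subprocess


def dry_run_then_launch(model, runner=subprocess.run):
    """Preview the launch with --dry-run, then start the server.

    Assumption (not confirmed by the kt documentation): a failed dry run
    is reported through a non-zero exit status.
    `runner` is injectable so the flow can be exercised without kt installed.
    """
    # Step 1: print the launch command without starting the server.
    dry = runner(["kt", "run", model, "--dry-run"])
    if dry.returncode != 0:
        # Stop here and surface the dry-run exit status to the caller.
        return dry.returncode

    # Step 2: start the inference server; this call blocks until it exits.
    return runner(["kt", "run", model]).returncode
```

Calling `dry_run_then_launch("m2.1")` in an environment where `kt` is on `PATH` runs both steps in order. Passing a stub as `runner` makes it possible to check the command sequence in isolation.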
Common Run Options
| Option | Role |
|---|---|
| `--host`, `--port` | Server bind address. |
| `--gpu-experts` | Registry-level alias for GPU expert placement. |
| `--cpu-threads` | Registry-level alias for CPU worker count. |
| `--tensor-parallel-size` | Tensor parallel size. |
| `--kt-method` | Override KT method when appropriate. |
| `--attention-backend` | Override attention backend. |
| `--max-total-tokens` | Token memory limit. |
| `--dry-run` | Print launch command without starting the server. |
Run `kt run --help` in the target environment for the definitive list of options.
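When launches are scripted, it can help to assemble the `kt run` argument list in one place. The sketch below is a hypothetical helper, not a kt-kernel API. It uses only flags named in the table above, and it assumes each value-taking flag accepts its value as a separate, space-delimited argument; check `kt run --help` before relying on that. The example values (`127.0.0.1`, `8000`) are illustrative.

```python
import shlex


def build_kt_run_argv(model, *, host=None, port=None,
                      tensor_parallel_size=None, max_total_tokens=None,
                      dry_run=True):
    """Build an argv list for `kt run` from a few of the documented options.

    Assumption: every flag below takes its value as the next argument
    (for example `--port 8000`). Options left as None are omitted.
    """
    argv = ["kt", "run", model]
    options = [
        ("--host", host),
        ("--port", port),
        ("--tensor-parallel-size", tensor_parallel_size),
        ("--max-total-tokens", max_total_tokens),
    ]
    for flag, value in options:
        if value is not None:
            argv += [flag, str(value)]
    # Default to a dry run so the command is printed rather than launched.
    if dry_run:
        argv.append("--dry-run")
    return argv


if __name__ == "__main__":
    # Print the assembled command as a shell-quoted string for inspection.
    print(shlex.join(build_kt_run_argv("m2.1", host="127.0.0.1", port=8000)))
```

Defaulting `dry_run` to `True` keeps the helper safe to experiment with: the resulting command only prints the launch command until the caller opts in with `dry_run=False`.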