KTransformers

CLI Reference

The kt command is the user-facing CLI from kt-kernel.

Main Commands

| Command | Role |
| --- | --- |
| kt version | Show version and environment information. |
| kt doctor | Diagnose environment issues. |
| kt run | Start a model inference server. |
| kt chat | Chat with a running model. |
| kt model | Manage model registry and model paths. |
| kt config | Manage CLI configuration. |
| kt quant | Quantize model weights where supported. |
| kt bench / kt microbench | Run benchmarks. |
| kt sft | Fine-tuning helper surface for LLaMA-Factory workflows. |

Common Inference Commands

kt version
kt doctor
kt model list
kt model search m2
kt run m2.1 --dry-run
kt run m2.1
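
The read-only checks in the list above can also be scripted before a server is launched. The following is a minimal sketch, not part of kt-kernel: the `PREFLIGHT` list and the `run_preflight` helper are hypothetical names, and it assumes that each `kt` subcommand signals failure with a non-zero exit code, which should be confirmed in the target environment.

```python
import shutil
import subprocess

# Read-only checks taken from the command list above, in the same order.
PREFLIGHT = [
    ["kt", "version"],
    ["kt", "doctor"],
    ["kt", "model", "list"],
]


def run_preflight(commands=PREFLIGHT):
    """Run each check in order; return the first failing command, or None.

    Assumption: a non-zero exit code means the check failed.
    """
    if shutil.which("kt") is None:
        raise FileNotFoundError("kt not found on PATH")
    for argv in commands:
        result = subprocess.run(argv)
        if result.returncode != 0:
            return argv
    return None
```

Calling `run_preflight()` returns `None` when every check exits cleanly, so a launch script can follow it with `kt run m2.1 --dry-run`.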

Common Run Options

| Option | Role |
| --- | --- |
| --host, --port | Server bind address. |
| --gpu-experts | Registry-level alias for GPU expert placement. |
| --cpu-threads | Registry-level alias for CPU worker count. |
| --tensor-parallel-size | Tensor parallel size. |
| --kt-method | Override KT method when appropriate. |
| --attention-backend | Override attention backend. |
| --max-total-tokens | Token memory limit. |
| --dry-run | Print launch command without starting the server. |

Use kt run --help in the target environment for the definitive command surface.
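
As an illustration of how the run options combine, the sketch below assembles a `kt run` command line in Python and appends `--dry-run` by default so that nothing is started. The `build_kt_run` helper is hypothetical, and the option values (`127.0.0.1`, `8000`, `1`) are placeholders rather than recommended settings. Only flags listed in the table above are used, and the value formats they accept are an assumption to verify with `kt run --help`.

```python
import shlex
import shutil
import subprocess


def build_kt_run(model, dry_run=True, **options):
    """Build an argv list for `kt run` from keyword options.

    Keyword names map to flags by replacing underscores with hyphens,
    e.g. tensor_parallel_size -> --tensor-parallel-size.
    """
    argv = ["kt", "run", model]
    for name, value in options.items():
        argv += ["--" + name.replace("_", "-"), str(value)]
    if dry_run:
        argv.append("--dry-run")
    return argv


# Placeholder values for illustration only.
argv = build_kt_run("m2.1", host="127.0.0.1", port=8000, tensor_parallel_size=1)
print(shlex.join(argv))

# Invoke the CLI only when it is actually installed.
if shutil.which("kt"):
    subprocess.run(argv, check=False)
```

Keeping `dry_run=True` as the default means the generated command can be reviewed before the server is started for real.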