KTransformers

Runtime Smoke Checklist

Use this checklist before promoting a model, precision, hardware, or fine-tuning path from "Needs smoke" to "Current".

Inference Smoke

Record:

  • commit or package versions for kt-kernel, sglang-kt, and relevant dependencies
  • hardware: CPU SKU, GPU SKU/count, RAM, NUMA count
  • model checkpoint and --kt-weight-path
  • exact launch command
  • request command or OpenAI client snippet
  • first-token success and a short generated response
  • logs showing which KT method/backend was selected
  • known warnings, each marked as benign or blocking
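
The hardware and version items above can be captured in one pass. The following is a minimal sketch, assuming a Linux host; the helper names (section, record_env) and the output layout are illustrative and not part of KTransformers. Tools that are missing on the host (for example numactl or nvidia-smi) are noted in the output instead of aborting the run, and the package names passed to pip are the ones listed in this checklist.

```shell
# Sketch: write a hardware/version snapshot for a smoke record.
# Helper names and section titles are illustrative assumptions.

# Print a titled section; run the command only if it is installed.
section() {
  local title="$1"; shift
  echo "## ${title}"
  if command -v "$1" >/dev/null 2>&1; then
    "$@" 2>&1 || echo "(command failed: $*)"
  else
    echo "(not available: $1)"
  fi
  echo
}

# Collect everything into the file given as the first argument.
record_env() {
  local out="$1"
  {
    section "date" date -u
    section "cpu" lscpu                 # includes the NUMA node count
    section "gpu" nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
    section "ram" free -g
    section "numa" numactl --hardware
    section "packages" python3 -m pip show kt-kernel sglang-kt
    section "git commit" git rev-parse HEAD
  } > "$out"
}
```

Calling record_env smoke_env.txt from the checkout under test produces a single file that can be attached to the record alongside the launch command and logs.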

Minimum request:

curl -s http://127.0.0.1:30000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"my-model","messages":[{"role":"user","content":"Say hello in one sentence."}],"max_tokens":32}'
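
To turn the request above into a pass/fail check, the response can be saved and inspected. This is a sketch under stated assumptions: the server returns an OpenAI-style chat completion body with the text at choices[0].message.content, and the function names (check_response, smoke_request) are hypothetical. jq is used when installed; otherwise a rough grep check stands in for it.

```shell
# Return 0 if the saved response holds a non-empty assistant message.
check_response() {
  local file="$1"
  if command -v jq >/dev/null 2>&1; then
    # -e makes jq exit non-zero when the expression yields false or null.
    jq -e '.choices[0].message.content | type == "string" and length > 0' \
      "$file" >/dev/null
  else
    # Rough fallback without a JSON parser: a non-empty "content" string.
    grep -Eq '"content"[[:space:]]*:[[:space:]]*"[^"]' "$file"
  fi
}

# Send the minimum request, save the body, and require HTTP 200 plus content.
smoke_request() {
  local base_url="$1" model="$2" out="$3"
  local status
  status="$(curl -s -o "$out" -w '%{http_code}' \
    "${base_url}/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -d "{\"model\":\"${model}\",\"messages\":[{\"role\":\"user\",\"content\":\"Say hello in one sentence.\"}],\"max_tokens\":32}")"
  [ "$status" = "200" ] && check_response "$out"
}
```

For example, smoke_request http://127.0.0.1:30000 my-model response.json && echo "smoke OK" leaves response.json behind as the short generated response for the record.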

Fine-Tuning Smoke

Record:

  • LLaMA-Factory commit
  • requirements/ktransformers.txt
  • training YAML
  • Accelerate config
  • kt_config
  • first training steps and loss logging
  • checkpoint/adapter output behavior
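
The fine-tuning items can be gathered into one directory. The sketch below is illustrative only: the function names, the output file name, and the log format are assumptions. In particular, first_losses assumes the training log contains Hugging Face Trainer-style lines such as {'loss': 2.31, ...}; adjust the pattern if your log differs.

```shell
# Sketch: collect fine-tuning smoke evidence into one directory.
# Usage: record_finetune REPO_DIR OUT_DIR CONFIG_FILE...
record_finetune() {
  local repo="$1" out="$2"
  shift 2
  mkdir -p "$out"
  # Commit of the LLaMA-Factory checkout, or "unknown" if it cannot be read.
  git -C "$repo" rev-parse HEAD > "$out/llamafactory_commit.txt" 2>/dev/null \
    || echo "unknown" > "$out/llamafactory_commit.txt"
  # Copy each config passed in (training YAML, Accelerate config, kt_config).
  for f in "$@"; do
    if [ -f "$f" ]; then
      cp "$f" "$out/"
    else
      echo "missing: $f" >&2
    fi
  done
}

# Print the first N (default 3) logged loss values from a training log.
first_losses() {
  grep -Eo "'loss': [0-9.]+" "$1" | head -n "${2:-3}"
}
```

A call might look like record_finetune path/to/LLaMA-Factory smoke_record train.yaml accelerate.yaml, followed by first_losses train.log > smoke_record/first_losses.txt; all of these paths are placeholders.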

When Smoke Is Not Enough

A smoke test confirms only that the path starts and produces output. It does not prove long-context stability, performance, quality, tool calling, or multi-node behavior. Those require separate benchmark or evaluation records.