CPU/GPU Requirements
KTransformers serving depends on both GPU memory and CPU expert throughput. A usable setup is defined by a tuple: GPU, CPU ISA, memory capacity, NUMA layout, model checkpoint, and KT method.
Baseline Requirements
| Component | Guidance |
|---|---|
| OS | Linux x86-64 for current public packages. |
| Python | Python 3.10, 3.11, or 3.12 for kt-kernel wheels. |
| GPU | NVIDIA Ampere or newer is the current main path for serving. |
| CPU | AVX2 minimum for compatibility paths; AVX512 or AMX for higher-throughput paths. |
| Memory | Large MoE models need high system RAM; the required amount depends on the method and the CPU weight format. |
| NUMA | Multi-socket systems should tune --kt-threadpool-count and CPU placement. |
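The CPU row above can be checked on a Linux host by reading the feature flags in /proc/cpuinfo. The following is a minimal sketch, not kt-kernel's own detection logic: the function names and the tier labels ("amx", "avx512", "avx2", "unsupported") are hypothetical, and it assumes the standard Linux flag names avx2, avx512f, and amx_tile.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def isa_tier(flags: set[str]) -> str:
    """Map Linux CPU flags to a coarse tier (labels are illustrative only)."""
    if "amx_tile" in flags:
        return "amx"
    if "avx512f" in flags:
        return "avx512"
    if "avx2" in flags:
        return "avx2"
    return "unsupported"


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    text = path.read_text() if path.exists() else ""
    print(isa_tier(cpu_flags(text)))
```

A result of "avx2" would point to the compatibility paths only; which backend a given package build actually selects should be confirmed against that build.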
Current Hardware Scope
| Platform | Website status |
|---|---|
| NVIDIA GPU + x86 CPU | Main documented path. sapphire4-style systems are the current validation target for NVIDIA/AMX workflows. |
| AMD CPU path | Supported direction; publish support claims only after validation on AMD hardware has recorded exact tuples. |
| Ascend NPU | Not currently supported in public packages. |
| Intel xPU | Not currently supported in public packages. |
| ROCm | Historical documentation only until a current package path is validated. |
Planning Rule
Start from the model support tuple:
model + method + CPU backend + GPU count + system RAM + package versions
Then tune:
- --kt-cpuinfer
- --kt-threadpool-count
- --kt-num-gpu-experts
- prefill threshold and deferred experts, if applicable
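One way to keep the planning rule explicit is to record the support tuple as data and derive the tuning flags separately. The sketch below is illustrative only: the SupportTuple class, its field names, and the tuning_args helper are hypothetical, and the numeric values in the usage note are placeholders rather than recommendations. Only the three flag names come from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SupportTuple:
    """The fixed part of a deployment plan (field names are hypothetical)."""
    model: str
    method: str
    cpu_backend: str
    gpu_count: int
    system_ram_gib: int
    package_versions: tuple[tuple[str, str], ...]


def tuning_args(cpuinfer: int, threadpool_count: int, num_gpu_experts: int) -> list[str]:
    """Build the tunable flags as an argument list; values are chosen per host."""
    return [
        "--kt-cpuinfer", str(cpuinfer),
        "--kt-threadpool-count", str(threadpool_count),
        "--kt-num-gpu-experts", str(num_gpu_experts),
    ]
```

For example, tuning_args(32, 2, 4) yields the three flags with those placeholder values; appropriate values depend on core count, socket count, and GPU memory, and should be measured on the target host.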
Production Claim Boundary
Do not write "supports hardware X" unless at least one model/method tuple has been smoke-tested on that hardware class. Hardware support should be specific, not generic.
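The claim boundary can be enforced mechanically if smoke-test results are recorded per tuple. This is a minimal sketch under assumed names: the SmokeRecord shape and the may_claim_support function are hypothetical and do not correspond to an existing KTransformers tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SmokeRecord:
    """One smoke-test result for a model/method tuple (hypothetical shape)."""
    hardware_class: str
    model: str
    method: str
    passed: bool


def may_claim_support(hardware_class: str, records: list[SmokeRecord]) -> bool:
    """Allow a support claim only if some tuple passed on this hardware class."""
    return any(r.hardware_class == hardware_class and r.passed for r in records)
```

A claim written this way stays specific: it can always be traced back to the model and method of the record that justified it.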