CPU/GPU Requirements

KTransformers serving is limited by both GPU memory and CPU expert throughput. Whether a setup is usable depends on the GPU, CPU ISA, memory capacity, NUMA layout, model checkpoint, and KT method.

Baseline Requirements

Component	Guidance
OS	Linux x86-64 for current public packages.
Python	Python 3.10, 3.11, or 3.12 for `kt-kernel` wheels.
GPU	NVIDIA Ampere or newer is the current main path for serving.
CPU	AVX2 minimum for compatibility paths; AVX512 or AMX for higher-throughput paths.
Memory	Large MoE models need a large amount of system RAM; the exact amount depends on the KT method and the CPU weight format.
NUMA	Multi-socket systems should tune `--kt-threadpool-count` and CPU placement.
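
The CPU and Python rows above can be checked on a Linux host before installing anything. The following is a minimal sketch, not part of KTransformers: the function names are hypothetical, and it assumes the standard x86 feature flag names that Linux reports in `/proc/cpuinfo` (`avx2`, `avx512f`, `amx_tile`).

```python
import sys
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()


def isa_tier(flags: set[str]) -> str:
    """Classify the CPU into the tiers named in the table: amx, avx512, avx2."""
    if "amx_tile" in flags:
        return "amx"
    if "avx512f" in flags:
        return "avx512"
    if "avx2" in flags:
        return "avx2"
    return "unsupported"


def python_ok(version=sys.version_info) -> bool:
    """True for the Python versions listed for kt-kernel wheels (3.10 to 3.12)."""
    return tuple(version[:2]) in {(3, 10), (3, 11), (3, 12)}


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    # On non-Linux hosts the file is absent and the tier falls back to "unsupported".
    text = path.read_text() if path.exists() else ""
    print("CPU ISA tier:", isa_tier(cpu_flags(text)))
    print("Python version OK:", python_ok())
```

A result of `avx2` indicates only the compatibility path; it does not by itself confirm that a given model and method will run.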

Current Hardware Scope

Platform	Current status
NVIDIA GPU + x86 CPU	Main documented path for NVIDIA/AMX workflows.
AMD CPU path	Supported direction; no model combination has been publicly validated yet.
Ascend NPU	Not publicly supported at present.
Intel xPU	Not publicly supported at present.
ROCm	Historical documentation only until a current package path is validated.

Planning Approach

Start from the model support combination:

model + method + CPU backend + GPU count + system RAM + package versions

Then tune:

--kt-cpuinfer
--kt-threadpool-count
--kt-num-gpu-experts
prefill threshold and deferred experts, if applicable
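
One way to pick starting values for the first three flags is sketched below. This is an illustration under stated assumptions, not a documented recipe: it assumes each flag takes an integer value, uses one thread pool per NUMA node, and rounds the CPU thread count down so that it divides evenly across pools. The helper names are hypothetical, and the values are only a starting point for measurement.

```python
from pathlib import Path


def numa_node_count(sys_root: str = "/sys/devices/system/node") -> int:
    """Count NUMA nodes exposed by Linux sysfs; fall back to 1 if unavailable."""
    root = Path(sys_root)
    if not root.is_dir():
        return 1
    nodes = [
        p for p in root.iterdir()
        if p.name.startswith("node") and p.name[4:].isdigit()
    ]
    return max(1, len(nodes))


def starting_flags(cpu_threads: int, numa_nodes: int, gpu_experts: int) -> list[str]:
    """Build a flag list to append to a serving command (assumed integer-valued flags)."""
    if numa_nodes < 1 or cpu_threads < numa_nodes or gpu_experts < 0:
        raise ValueError("need numa_nodes >= 1, cpu_threads >= numa_nodes, gpu_experts >= 0")
    # Assumption: keep the thread count a multiple of the pool count.
    even_threads = cpu_threads - (cpu_threads % numa_nodes)
    return [
        "--kt-cpuinfer", str(even_threads),
        "--kt-threadpool-count", str(numa_nodes),
        "--kt-num-gpu-experts", str(gpu_experts),
    ]


if __name__ == "__main__":
    # Example inputs are placeholders; substitute values for your own machine.
    print(" ".join(starting_flags(cpu_threads=63, numa_nodes=numa_node_count(), gpu_experts=8)))
```

Change one value at a time and re-measure throughput, since the best settings depend on the model, method, and memory layout.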

Hardware Scope

Hardware support is based on smoke-test results for concrete model and method combinations. Even when a hardware class is listed, check that the model, method, CPU backend, GPU count, system memory, and package versions match your environment.
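
The comparison described above can be written down as a small record and a check. The sketch below is illustrative only: the `Combination` type, its field names, and the example values are hypothetical and are not taken from any KTransformers support table. It treats system RAM as a minimum and every other field as an exact match.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Combination:
    """One model support combination, or a description of the local environment."""
    model: str
    method: str
    cpu_backend: str
    gpu_count: int
    system_ram_gib: int
    package_versions: dict[str, str] = field(default_factory=dict)


def mismatches(validated: Combination, local: Combination) -> list[str]:
    """Return the names of the items where the local setup differs from the validated one."""
    problems = []
    for name in ("model", "method", "cpu_backend", "gpu_count"):
        if getattr(validated, name) != getattr(local, name):
            problems.append(name)
    # RAM is treated as a lower bound rather than an exact match.
    if local.system_ram_gib < validated.system_ram_gib:
        problems.append("system_ram_gib")
    for pkg, version in validated.package_versions.items():
        if local.package_versions.get(pkg) != version:
            problems.append(f"package:{pkg}")
    return problems


if __name__ == "__main__":
    # Placeholder values for illustration only.
    validated = Combination("example-moe", "example-method", "amx", 1, 512, {"kt-kernel": "0.0.0"})
    local = Combination("example-moe", "example-method", "avx2", 1, 256, {"kt-kernel": "0.0.0"})
    print(mismatches(validated, local))
```

An empty list means the local environment matches the recorded combination on every listed item; any other result names the items to resolve first.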