Hardware support

Psyllama performance depends on the underlying runtime (llama.cpp) and your CPU/GPU setup.

NVIDIA

For CUDA acceleration, install an NVIDIA driver compatible with your GPU and OS, and make sure the build you are running has CUDA support enabled.
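Before troubleshooting the build itself, it can help to confirm that the driver is installed and sees the GPU. The following is a minimal sketch, not part of Psyllama: it assumes the standard `nvidia-smi` tool that ships with the NVIDIA driver is on your `PATH`, and the helper names `parse_gpu_csv` and `query_nvidia_gpus` are hypothetical.

```python
import shutil
import subprocess


def parse_gpu_csv(text):
    """Parse `name, driver_version, memory.total` CSV rows into dicts."""
    gpus = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        name, driver, memory = parts
        gpus.append({"name": name, "driver": driver, "memory": memory})
    return gpus


def query_nvidia_gpus():
    """Return a list of GPU dicts, or None if nvidia-smi is missing or fails."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None
    try:
        result = subprocess.run(
            [exe, "--query-gpu=name,driver_version,memory.total",
             "--format=csv,noheader"],
            capture_output=True, text=True, check=True, timeout=10,
        )
    except (subprocess.SubprocessError, OSError):
        return None
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    gpus = query_nvidia_gpus()
    if gpus is None:
        print("nvidia-smi not found or failed; check the driver installation")
    else:
        for gpu in gpus:
            print(gpu)
```

A successful query only shows that the driver works. It does not show that your build was compiled with CUDA support, so that still needs to be checked separately.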

AMD

AMD GPU support depends on your platform and on which backends your build includes. On Linux, ROCm or Vulkan may be available.

macOS

On Apple Silicon, Metal acceleration can provide strong performance.

Vulkan

Some builds can also use Vulkan for GPU offload, depending on build options.
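To get a rough picture of which GPU stacks are present on a machine, one option is to look for the diagnostic tools that usually accompany them. This is a sketch under assumptions: `nvidia-smi`, `rocminfo`, and `vulkaninfo` are the usual vendor utilities, Apple Silicon is guessed from the platform strings Python reports, and `detect_gpu_stacks` is a hypothetical helper rather than anything Psyllama provides.

```python
import platform
import shutil

# Diagnostic CLI tools commonly installed alongside each GPU stack.
# Their presence is a hint only; it says nothing about how your build was compiled.
BACKEND_TOOLS = {
    "cuda": "nvidia-smi",
    "rocm": "rocminfo",
    "vulkan": "vulkaninfo",
}


def detect_gpu_stacks():
    """Return a dict mapping each stack name to True if it looks available."""
    found = {
        name: shutil.which(tool) is not None
        for name, tool in BACKEND_TOOLS.items()
    }
    # Assumption: a native Python on Apple Silicon reports Darwin/arm64.
    # A Python running under Rosetta may report x86_64 instead.
    found["apple_silicon"] = (
        platform.system() == "Darwin" and platform.machine() == "arm64"
    )
    return found


if __name__ == "__main__":
    for name, available in detect_gpu_stacks().items():
        print(f"{name}: {'found' if available else 'not found'}")
```

A stack that shows as found still needs matching support in the build you are running. A stack that shows as not found may simply be missing its diagnostic tool.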

Tips

- Use quantized models to reduce memory usage.
- Start with smaller models to validate GPU acceleration before loading larger ones.
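As a back-of-the-envelope illustration of why quantization reduces memory usage, the weight footprint scales roughly with parameter count times bits per weight. The sketch below assumes a hypothetical 7-billion-parameter model and nominal bit widths. Real quantized formats store extra scaling metadata, and the runtime also needs memory for the context cache and buffers, so treat the figures as lower bounds.

```python
def estimate_weight_gb(n_params, bits_per_weight):
    """Approximate size of the model weights alone, in gigabytes (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    n_params = 7_000_000_000  # hypothetical 7-billion-parameter model
    for bits in (16, 8, 4):
        print(f"{bits:>2}-bit: ~{estimate_weight_gb(n_params, bits):.1f} GB")
    # 16-bit: ~14.0 GB
    #  8-bit: ~7.0 GB
    #  4-bit: ~3.5 GB
```

Comparing an estimate like this against your available GPU memory gives a quick first check on whether a model is likely to fit before you download it.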